
ffc team mailing list archive

Re: [Ufl] [Bug 769811] [NEW] JIT cache problem with id(form)


On Tuesday April 26 2011 01:01:55 Martin Sandve Alnæs wrote:
> I'm not sure if this is safe or even solves anything.
> If there are circular references a.b = b; b.a = a,
> a.__del__ won't be called if the reference from
> b.a is still intact, since a.__del__ is called when
> the reference count of a goes to 0. Adding the
> __del__ function will also stop the cyclic reference
> detection in Python from cleaning up. That is my
> understanding after reading
> http://docs.python.org/reference/datamodel.html
> just now. Correct me if I'm wrong!
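
The scenario Martin describes can be sketched in a few lines (classes A and B 
here are hypothetical stand-ins for the UFL objects):

```python
import gc
import weakref

class A(object):
    pass

class B(object):
    pass

a, b = A(), B()
a.b = b                 # a references b
b.a = a                 # b references a -> a reference cycle
probe = weakref.ref(a)

del a, b                # refcounts never reach zero because of the cycle
gc.collect()            # only the cyclic garbage collector reclaims them
assert probe() is None

# On Python 2 (current when this thread was written), giving A or B a
# __del__ method would have made the cycle uncollectable: the objects
# would end up in gc.garbage instead. Python 3.4 (PEP 442) removed
# that restriction.
```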

If the above situation were correct you would be right. As it is now, a 
preprocessed form keeps a reference to form_data, and form_data keeps a 
reference to the preprocessed form => circular dependency.

I tried to break that by deleting the preprocessed form from its form_data 
when the original form (not the preprocessed one) is deleted. When this 
happens the preprocessed form will be deleted (I guess).

Not sure why the preprocessed form needs to keep a reference to the 
form_data, though. Removing that reference would solve the circular 
dependency.
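
One way to drop the back edge without losing access is a weak reference; a 
minimal sketch, where FormData and the attribute names are hypothetical, not 
UFL's actual code:

```python
import weakref

class FormData(object):
    """Hypothetical sketch: form_data holds only a weak back-reference
    to the preprocessed form."""

    def __init__(self, preprocessed_form):
        # A weak reference does not keep the preprocessed form alive,
        # so no reference cycle is created.
        self._preprocessed_form = weakref.ref(preprocessed_form)

    @property
    def preprocessed_form(self):
        # Returns the form while it is alive, None after collection.
        return self._preprocessed_form()
```

With this, the preprocessed form can keep its strong reference to form_data 
while the back edge stays weak, so plain reference counting reclaims both.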

> I don't know which memory cache you're referring to here.
> Instant has a memory cache to avoid unnecessary disk access.

The memory cache referred to here was the one I removed. It was a memory 
cache which cached the preprocessed form, using the id() of the original 
form. I think this was the initial issue, as the id of a form turned out not 
to be safe. But then it also turned out to be superfluous, as the 
preprocessed form was attached to the form_data anyhow. So the issue is 
essentially solved. We just need to get rid of the circular dependency, 
which I now think we can do by removing the reference to the preprocessed 
form from form_data.
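
Why id() is unsafe as a cache key can be shown with a toy cache (the 
preprocess function and cache dict below are illustrative, not FFC's actual 
code):

```python
_cache = {}

def preprocess(form):
    # BAD: id(form) is unique only while `form` is alive. If the cache
    # does not hold the form itself, a later form allocated at the same
    # address silently inherits a stale entry; if it does hold the
    # form, the cache leaks every form ever seen.
    key = id(form)
    if key not in _cache:
        _cache[key] = "preprocessed #%d" % len(_cache)
    return _cache[key]

class Form(object):
    pass

f = Form()
first = preprocess(f)
assert preprocess(f) is first   # stable while f is alive

old_id = id(f)
del f                 # f is freed, but its cache entry remains
g = Form()            # CPython may reuse the freed address, in which
if id(g) == old_id:   # case g silently picks up f's cached result
    assert preprocess(g) is first
```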

> And this patch does not seem related to any of
> "the issues" I've seen mentioned in this thread.
> Cyclic Python object references causing memory
> leaks sounds like a potentially important issue, but
> that's not what this thread is about, is it?

No. It was just an issue I stumbled across while trying to fix the real 
issue. Sorry for not being clear on that.

Johan

> Martin
> 
> On 25 April 2011 23:50, Johan Hake <johan.hake@xxxxxxxxx> wrote:
> > This should be fixed now.
> > 
> > I do not see why we introduced the memory cache when this solution was
> > lying right in front of our eyes...
> > 
> > Anyhow. Here is a patch for ufl to avoid the circular dependency between
> > a preprocessed form and the form_data.
> > 
> > Johan
> > 
> > On Monday April 25 2011 14:34:00 Anders Logg wrote:
> > > Simple sounds good.
> > > 
> > > --
> > > Anders
> > > 
> > > On Mon, Apr 25, 2011 at 02:29:50PM -0700, Johan Hake wrote:
> > > > I am working on a simple solution, where we store everything in the
> > > > original ufl form.
> > > > 
> > > > I might have something soon.
> > > > 
> > > > Johan
> > > > 
> > > > On Monday April 25 2011 14:26:18 Garth N. Wells wrote:
> > > > > On 25/04/11 22:08, Anders Logg wrote:
> > > > > > On Mon, Apr 25, 2011 at 07:40:21PM -0000, Garth Wells wrote:
> > > > > >> On 25/04/11 20:00, Johan Hake wrote:
> > > > > >>> On Monday April 25 2011 11:26:36 Garth Wells wrote:
> > > > > >>>> On 25/04/11 18:51, Anders Logg wrote:
> > > > > >>>>> On Mon, Apr 25, 2011 at 05:11:41PM -0000, Garth Wells wrote:
> > > > > >>>>>> On 25/04/11 17:53, Johan Hake wrote:
> > > > > >>>>>>> On Monday April 25 2011 08:59:18 Garth Wells wrote:
> > > > > >>>>>>>> On 25/04/11 16:47, Johan Hake wrote:
> > > > > >>>>>>>>> Commenting out the cache is really not a fix. The problem
> > > > > >>>>>>>>> is within dolfin. Isn't there another way to deal with
> > > > > >>>>>>>>> this?
> > > > > >>>>>>>> 
> > > > > >>>>>>>> It is a fix if the cache isn't needed.
> > > > > >>>>>>> 
> > > > > >>>>>>> Sure.
> > > > > >>>>>>> 
> > > > > >>>>>>>>> First: How much penalty is there with a disabled memory
> > > > > >>>>>>>>> cache? Maybe the problem isn't that bad?
> > > > > >>>>>>>> 
> > > > > >>>>>>>> I don't get the point of this cache. The way it is now, a
> > > > > >>>>>>>> form is only preprocessed if it hasn't already been
> > > > > >>>>>>>> preprocessed, which seems ok to me. The old code tried to
> > > > > >>>>>>>> avoid some preprocessing, but it was highly dubious and I
> > > > > >>>>>>>> doubt that it was effective.
> > > > > >>>>>>> 
> > > > > >>>>>>> I think the preprocessing stage actually does take some
> > > > > >>>>>>> time. AFAIK the preprocessing stage essentially does two
> > > > > >>>>>>> things. It creates a canonical version of the Form so two
> > > > > >>>>>>> Forms that are the same, but constructed at different
> > > > > >>>>>>> times, are treated equally wrt form generation. Then
> > > > > >>>>>>> DOLFIN-specific things are extracted. I am not sure what
> > > > > >>>>>>> takes the most time. We should probably profile it... But
> > > > > >>>>>>> if it is the latter we could consider putting another
> > > > > >>>>>>> cache in place which is more robust wrt changing DOLFIN
> > > > > >>>>>>> objects.
> > > > > >>>>>> 
> > > > > >>>>>> It should be easy to avoid the overhead of preprocessing by
> > > > > >>>>>> keeping the object in scope. If the object changes, the only
> > > > > >>>>>> robust way to make sure that the form is the same as one in
> > 
> > the
> > 
> > > > > >>>>>> cache is to compare all the data. This requires
> > > > > >>>>>> preprocessing the form, which then defeats the purpose of a
> > > > > >>>>>> cache. It may be possible to add a lightweight preprocess
> > > > > >>>>>> to UFL, but I don't think that it's worth the effort or
> > > > > >>>>>> extra complication.
> > > > > >>> 
> > > > > >>> I think a lightweight version might be the way to go. This is
> > > > > >>> then stored in the memory cache. If we are able to strip such
> > > > > >>> a form of all DOLFIN-specific things we would also prevent
> > > > > >>> huge memory leaks with the mesh being kept alive.
> > > > > >>> 
> > > > > >>> Then we always grab DOLFIN-specific data from the passed form
> > > > > >>> instead of grabbing it from the cache. Not sure how easy this
> > > > > >>> will be to implement, but I think we need to explore it, as
> > > > > >>> the DOLFIN-specific part of the form really has nothing to do
> > > > > >>> with the generated Form.
> > > > > >>> 
> > > > > >>> Martin:
> > > > > >>> Why is it important to have the _count in the repr of the
> > > > > >>> form? I guess that is used in ufl algorithms? Would it be
> > > > > >>> possible to include a second repr function, which did not
> > > > > >>> include the count? This would then be used when the signature
> > > > > >>> is checked. We could then use that repr to generate a form
> > > > > >>> which is stored in the memory cache. This would then be
> > > > > >>> stripped of any DOLFIN-specific objects. This should work as
> > > > > >>> the _count attribute has nothing to do with what code gets
> > > > > >>> generated, but it is essential for internal UFL algorithms,
> > > > > >>> right?
> > > > > >>> 
> > > > > >>>>> I'm not very happy with this change.
> > > > > >>>> 
> > > > > >>>> The bright side is that slow and correct is a better starting
> > > > > >>>> point than fast but wrong ;).
> > > > > >>>> 
> > > > > >>>> An easy fix is to attach the preprocessed form to a Form
> > > > > >>>> object. This would work robustly if we can make forms
> > > > > >>>> immutable once they've been compiled. Is it possible to make
> > > > > >>>> a Python object immutable?
> > > > > >>> 
> > > > > >>> We can probably overload the __setattr__ methods to prevent a
> > > > > >>> user from writing to these, but it might not be possible to
> > > > > >>> prevent a user from changing attributes on instances owned by
> > > > > >>> the Form. I guess this is similar to the difficulties of
> > > > > >>> preserving constness in C++, but I think it is even harder in
> > > > > >>> Python.
> > > > > >> 
> > > > > >> What if we have the FFC jit compiler return the preprocessed
> > > > > >> form, and inside dolfin.Form simply do
> > > > > >> 
> > > > > >>     class Form(cpp.Form):
> > > > > >>         def __init__(self, form, ...):
> > > > > >>             ...
> > > > > >>             (..., preprocessed_form) = jit(form, ...)
> > > > > >>             form = preprocessed_form
> > > > > >>             ...
> > > > > >> 
> > > > > >> This way, form will have form_data, and the FFC jit function
> > > > > >> will know not to call ufl.preprocess.
> > > > > > 
> > > > > > Here's another strange thing. In the JITObject class, we have
> > > > > > two functions: __hash__ and signature. As far as I understand,
> > > > > > the first is used to locate objects (generated code/modules) in
> > > > > > the Instant in-memory cache, while the second is used for the
> > > > > > on-disk cache.
> > > > > > 
> > > > > > From some simple tests I did just now, it looks like the
> > > > > > __hash__ function does not provide any significant speedup. The
> > > > > > JIT benchmark runs just as fast if I call signature from within
> > > > > > __hash__.
> > > > > > 
> > > > > > Furthermore, the __hash__ function must also be broken since it
> > > > > > relies on calling id on the form.
> > > > > > 
> > > > > > Ideally, we should get Instant to handle the caching, both
> > > > > > in-memory and on-disk, by providing two functions: __hash__
> > > > > > (fast, for the in-memory cache) and signature (slow, for the
> > > > > > on-disk cache).
> > > > > > 
> > > > > > Since __hash__ cannot call id, it must be able to attach a unique
> > > > > > string to the form (perhaps based on an internal counter in FFC).
> > > > > > My suggestion would be to add this to UFL, something like
> > > > > > set_hash and hash (which would return None if set_hash has not
> > > > > > been called). If Martin does not like that, we should be able to
> > > > > > handle it on the DOLFIN side.
> > > > > > 
> > > > > > So in conclusion: no in-memory cache in FFC (handled by
> > > > > > Instant), and FFC attaches a hash to incoming forms so that
> > > > > > Instant may recognize them later.
> > > > > 
> > > > > The code that I disabled was caching preprocessed forms, so I
> > > > > don't see how this can be handled by Instant.
> > > > > 
> > > > > Garth
> > > > > 
> > > > > > Maybe even better: Instant checks whether an incoming object has
> > > > > > a set_hash function and if so calls it so it can recognize
> > > > > > objects it sees a second time.
> > > > > > 
> > > > > > I'm moving this discussion to the mailing list(s).
> > > > > 
> > > > > _______________________________________________
> > > > > Mailing list: https://launchpad.net/~ufl
> > > > > Post to     : ufl@xxxxxxxxxxxxxxxxxxx
> > > > > Unsubscribe : https://launchpad.net/~ufl
> > > > > More help   : https://help.launchpad.net/ListHelp
> > 


