ffc team mailing list archive
-
ffc team
-
Mailing list archive
-
Message #04106
Re: [Bug 769811] [NEW] JIT cache problem with id(form)
On 25/04/11 22:48, Anders Logg wrote:
> On Mon, Apr 25, 2011 at 10:41:58PM +0100, Garth N. Wells wrote:
>>
>>
>> On 25/04/11 22:33, Anders Logg wrote:
>>> On Mon, Apr 25, 2011 at 10:26:18PM +0100, Garth N. Wells wrote:
>>>>
>>>>
>>>> On 25/04/11 22:08, Anders Logg wrote:
>>>>> On Mon, Apr 25, 2011 at 07:40:21PM -0000, Garth Wells wrote:
>>>>>> On 25/04/11 20:00, Johan Hake wrote:
>>>>>>> On Monday April 25 2011 11:26:36 Garth Wells wrote:
>>>>>>>> On 25/04/11 18:51, Anders Logg wrote:
>>>>>>>>> On Mon, Apr 25, 2011 at 05:11:41PM -0000, Garth Wells wrote:
>>>>>>>>>> On 25/04/11 17:53, Johan Hake wrote:
>>>>>>>>>>> On Monday April 25 2011 08:59:18 Garth Wells wrote:
>>>>>>>>>>>> On 25/04/11 16:47, Johan Hake wrote:
>>>>>>>>>>>>> Commenting out the cache is really not a fix. The problem is within
>>>>>>>>>>>>> dolfin. Isn't there another way to deal with this?
>>>>>>>>>>>>
>>>>>>>>>>>> It is a fix if the cache isn't needed.
>>>>>>>>>>>
>>>>>>>>>>> Sure.
>>>>>>>>>>>
>>>>>>>>>>>>> First: How much penalty are there with a disabled memory cache. Maybe
>>>>>>>>>>>>> the problem isn't that bad?
>>>>>>>>>>>>
>>>>>>>>>>>> I don't get the point of this cache. The way it is now, a form is only
>>>>>>>>>>>> preprocessed if it hasn't already been preprocessed, which seems ok to
>>>>>>>>>>>> me. The old code tried to avoid some preprocessing, but it was highly
>>>>>>>>>>>> dubious and I doubt that it was effective.
>>>>>>>>>>>
>>>>>>>>>>> I think the preprocessing stage actually do take some time. AFAIK the
>>>>>>>>>>> preproces stage essentially do two things. It creates a canonical
>>>>>>>>>>> version of the Form so two Forms that are the same, but constructed at
>>>>>>>>>>> different times are beeing treated equal wrt form generation. Then are
>>>>>>>>>>> DOLFIN specific guys extracted. I am not sure what takes the most
>>>>>>>>>>> time. We should probably profiel it... But if it is the latter we
>>>>>>>>>>> could consider putting another cache in place which is more robust wrt
>>>>>>>>>>> changing DOLFIN objects.
>>>>>>>>>>
>>>>>>>>>> It should be easy to avoid the overhead of preprocessing by keeping the
>>>>>>>>>> object in scope. If the object changes, the only robust way to make sure
>>>>>>>>>> that the form is the same as one in the cache is to compare all the
>>>>>>>>>> data. This requires preprocessing the form, which then defeats the
>>>>>>>>>> purpose of a cache. It may be possible to add a lightweight preprocess
>>>>>>>>>> to UFL, but I don't think that it's worth the effort or extra
>>>>>>>>>> complication.
>>>>>>>
>>>>>>> I think a light weight version might be the way to go. This is then stored in
>>>>>>> memory cache. If we are able to strip such a form for all DOLFIN specific
>>>>>>> things we would also prevent huge memory leaks with mesh beeing kept.
>>>>>>>
>>>>>>> Then we always grab DOLFIN specific data from the passed form instead of
>>>>>>> grabbing from the cache. Not sure how easy this will be to implement, but I
>>>>>>> think we need to explore it, as the DOLFIN specific part of the form really
>>>>>>> has nothing to do with the generated Form.
>>>>>>>
>>>>>>> Martin:
>>>>>>> Why is it important to have the _count in the repr of the form? I guess that
>>>>>>> is used in ufl algorithms? Would it be possible to include a second repr
>>>>>>> function, which did not include the count? This would then be used when the
>>>>>>> signature is checked for. We could then use that repr to generate a form which
>>>>>>> is stored in the memory cache. This would then be tripped for any DOLFIN
>>>>>>> specific objects. This should work as the _count attribute has nothing to do
>>>>>>> with what code gets generated, but it is essential for internal UFL
>>>>>>> algorithms, right?
>>>>>>>
>>>>>>>>> I'm not very happy with this change.
>>>>>>>>
>>>>>>>> The bright side is that slow and correct is a better starting point than
>>>>>>>> fast but wrong ;).
>>>>>>>>
>>>>>>>> An easy fix is to attach the preprocessed form to a Form object. This
>>>>>>>> would work robustly if we can make forms immutable once they've been
>>>>>>>> compiled. Is it possible to make a Python object immutable?
>>>>>>>
>>>>>>> We can probably overload all setattribtue methods which prohibits a user to
>>>>>>> write to these but it might not be possible to prohibit a user to change
>>>>>>> attributes on instances owned by the Form. I guess this is similare to the
>>>>>>> difficulties of preserving constness in C++, but I think it is even harder in
>>>>>>> Python.
>>>>>>>
>>>>>>
>>>>>> What if we have the FFC jit compiler return the preprocessed form, and
>>>>>> inside dolfin.Form simply do
>>>>>>
>>>>>> class Form(cpp.Form):
>>>>>> def __init__(self, form, . . .. )
>>>>>> ....
>>>>>>
>>>>>> (...., preprocessed_form) = jit(form, . . . . )
>>>>>>
>>>>>> form = preprocessed_form
>>>>>>
>>>>>> .....
>>>>>>
>>>>>> This way, form will have form_data, and the FFC jit function will know
>>>>>> not to call ufl.preprocess.
>>>>>
>>>>> Here's another strange thing. In the JITObject class, we have two
>>>>> functions: __hash__ and signature. As far as I understand, the first
>>>>> is used to located objects (generated code/modules) in the Instant
>>>>> in-memory cache, while the second is used for the on-disk cache.
>>>>>
>>>>> >From some simple tests I did now, it looks like the __hash__ function
>>>>> does not need to any significant speedup. The JIT benchmark runs just
>>>>> as fast if I call signature from within __hash__.
>>>>>
>>>>> Furthermore, the __hash__ function must also be broken since it relies
>>>>> on calling id on the form.
>>>>>
>>>>> Ideally, we should get Instant to handle the caching, both in-memory
>>>>> and on-disk, by providing two functions __hash__ (fast, for in-memory
>>>>> cache) and signature (slow, for on-disk cache).
>>>>>
>>>>> Since __hash__ cannot call id, it must be able to attach a unique
>>>>> string to the form (perhaps based on an internal counter in FFC).
>>>>> My suggestion would be to add this to UFL, something like set_hash
>>>>> and hash (which would return None if set_hash has not been called).
>>>>> If Martin does not like that, we should be able to handle it on the
>>>>> DOLFIN side.
>>>>>
>>>>> So in conclusion: no in-memory cache in FFC (handled by Instant) and
>>>>> FFC attaches a hash to incoming forms so that Instant may recognize
>>>>> them later.
>>>>>
>>>>
>>>> The code that I disabled was caching preprocessed forms, so I don't see
>>>> how this can be handled by Instant.
>>>
>>> The point would be that one could check that "hash" of the form (some
>>> unique string) instead of computing the signature which involves
>>> preprocessing the form.
>>>
>>
>> How would the hash be computed? To check if the mesh has changed, my
>> limited understanding is that the entire object would have to be
>> serialised, and then a hash computed. How expensive is that?
>>
>> The issue that I ran into was not related to signatures. It was related
>> to the non-UFL data that is attached to arguments.
>
> The hash would be unique to each form. It could just be a counter
> value and the counter would be increased inside Instant for each
> object it gets as input.
But how does Instant know if a form is new? I also don't see why Instant
should need to know if the mesh associated with a form has changed, but
is for the rest the same. Wouldn't Instant need to be DOLFIN-aware?
Garth
>
> We could also include the output of repr if we want, but as Martin
> points out we should not modify forms so then it shouldn't be needed.
> (See comment about why repr is included in jitcompiler.py.)
>
> --
> Anders
Follow ups
References