ffc team mailing list archive

Re: [Ufl] [Bug 769811] [NEW] JIT cache problem with id(form)

 

On 26 April 2011 11:09, Garth N. Wells <gnw20@xxxxxxxxx> wrote:

>
>
> On 26/04/11 09:56, Garth N. Wells wrote:
> >
> >
> > On 26/04/11 09:03, Martin Sandve Alnæs wrote:
> >> See other mail. I don't see that it solves anything; it doesn't seem
> >> related to anything I've read about in this thread, and it has a
> >> potential downside in hindering the garbage collector in Python. I may
> >> be wrong, but nobody has answered my other questions in this thread yet.
> >>
> >
> > As a preliminary point, the primary problem has nothing to do with the
> > Instant disk cache, etc. The Instant discussion is just confusing the
> > original point.
> >
> > In summary, it would be helpful if DOLFIN could avoid calling
> > ufl.preprocess every time a dolfin.Form object is created. DOLFIN relies
> > on preprocessing to extract the form Arguments, from which the mesh is
> > extracted (via form_data().original_arguments, since DOLFIN uses
> > 'Arguments' that are subclasses of both UFL and DOLFIN objects).
> >
> > The solution that Johan has implemented is to have FFC attach the
> > form_data to a form. If a form has form_data attached, then we know that
> > it has already been preprocessed. Martin won't like this because it's
> > changing the form object.
> >
> > It may be enough if UFL provided a function to return a list of the
> > form Arguments, provided this is fast. Something like
> >
> >   def extract_original_arguments(form):
> >
> >       # Replace arguments and coefficients with new renumbered objects
> >       arguments, coefficients = extract_arguments_and_coefficients(form)
> >       replace_map, arguments, coefficients \
> >             = build_argument_replace_map(arguments, coefficients)
> >       form = replace(form, replace_map)
> >
> >       # Build mapping to original arguments and coefficients, which is
> >       # useful if the original arguments have data attached to them
> >       inv_replace_map = {}
> >       for v, w in replace_map.iteritems():
> >           inv_replace_map[w] = v
> >       original_arguments = [inv_replace_map[v] for v in arguments]
> >
> >       return original_arguments
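
(For illustration only: a hypothetical use of such a helper on the DOLFIN
side. The function_space()/mesh() accessors are assumptions, not a
description of the actual DOLFIN code.)

    # Recover DOLFIN-attached data (e.g. the mesh) from the original,
    # DOLFIN-subclassed arguments without a full preprocess of the form
    original_arguments = extract_original_arguments(form)
    mesh = original_arguments[0].function_space().mesh()
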
> >
>
> As an addition, I think that we're letting DOLFIN-specific issues creep
> into FFC and UFL. It would be simpler if FFC simply did
>
>    if form.form_data() is not None:
>        preprocessed_form = form
>    else:
>        preprocessed_form = preprocess(form, common_cell=common_cell)
>
>
Is this ufl.preprocess or ffc.preprocess?


> and DOLFIN is made responsible for preprocessing a form (or not
> preprocessing) before sending it to the FFC JIT compiler, particularly
> since deciding to preprocess or not can depend on what DOLFIN-specific
> data (e.g. meshes) has been attached to the form.


Here I'm lost. Why does preprocessing depend on anything DOLFIN-specific?
What kind of preprocessing are we talking about here?

Martin



> Garth
>
>
>
>
>
> > Garth
> >
> >
> >> Martin
> >>
> >> On 26 April 2011 09:20, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
> >>
> >>     Martin: Any problem if we apply this patch to UFL?
> >>
> >>     Garth
> >>
> >>     On 25/04/11 22:50, Johan Hake wrote:
> >>     > This should be fixed now.
> >>     >
> >>     > I do not see why we introduced the memory cache when this
> >>     > solution was lying right in front of our eyes...
> >>     >
> >>     > Anyhow. Here is a patch for UFL to avoid a circular dependency
> >>     > between a preprocessed form and the form_data.
> >>     >
> >>     > Johan
> >>     >
> >>     > On Monday April 25 2011 14:34:00 Anders Logg wrote:
> >>     >> Simple sounds good.
> >>     >>
> >>     >> --
> >>     >> Anders
> >>     >>
> >>     >> On Mon, Apr 25, 2011 at 02:29:50PM -0700, Johan Hake wrote:
> >>     >>> I am working on a simple solution, where we store everything
> >>     >>> in the original ufl form.
> >>     >>>
> >>     >>> I might have something soon.
> >>     >>>
> >>     >>> Johan
> >>     >>>
> >>     >>> On Monday April 25 2011 14:26:18 Garth N. Wells wrote:
> >>     >>>> On 25/04/11 22:08, Anders Logg wrote:
> >>     >>>>> On Mon, Apr 25, 2011 at 07:40:21PM -0000, Garth Wells wrote:
> >>     >>>>>> On 25/04/11 20:00, Johan Hake wrote:
> >>     >>>>>>> On Monday April 25 2011 11:26:36 Garth Wells wrote:
> >>     >>>>>>>> On 25/04/11 18:51, Anders Logg wrote:
> >>     >>>>>>>>> On Mon, Apr 25, 2011 at 05:11:41PM -0000, Garth Wells wrote:
> >>     >>>>>>>>>> On 25/04/11 17:53, Johan Hake wrote:
> >>     >>>>>>>>>>> On Monday April 25 2011 08:59:18 Garth Wells wrote:
> >>     >>>>>>>>>>>> On 25/04/11 16:47, Johan Hake wrote:
> >>     >>>>>>>>>>>>> Commenting out the cache is really not a fix. The
> >>     >>>>>>>>>>>>> problem is within dolfin. Isn't there another way to
> >>     >>>>>>>>>>>>> deal with this?
> >>     >>>>>>>>>>>>
> >>     >>>>>>>>>>>> It is a fix if the cache isn't needed.
> >>     >>>>>>>>>>>
> >>     >>>>>>>>>>> Sure.
> >>     >>>>>>>>>>>
> >>     >>>>>>>>>>>>> First: how much of a penalty is there with a disabled
> >>     >>>>>>>>>>>>> memory cache? Maybe the problem isn't that bad?
> >>     >>>>>>>>>>>>
> >>     >>>>>>>>>>>> I don't get the point of this cache. The way it is
> >>     >>>>>>>>>>>> now, a form is only preprocessed if it hasn't already
> >>     >>>>>>>>>>>> been preprocessed, which seems ok to me. The old code
> >>     >>>>>>>>>>>> tried to avoid some preprocessing, but it was highly
> >>     >>>>>>>>>>>> dubious and I doubt that it was effective.
> >>     >>>>>>>>>>>
> >>     >>>>>>>>>>> I think the preprocessing stage actually does take some
> >>     >>>>>>>>>>> time. AFAIK the preprocess stage essentially does two
> >>     >>>>>>>>>>> things. It creates a canonical version of the Form, so
> >>     >>>>>>>>>>> that two Forms that are the same but constructed at
> >>     >>>>>>>>>>> different times are treated as equal wrt form
> >>     >>>>>>>>>>> generation. Then DOLFIN-specific objects are extracted.
> >>     >>>>>>>>>>> I am not sure what takes the most time. We should
> >>     >>>>>>>>>>> probably profile it... But if it is the latter, we could
> >>     >>>>>>>>>>> consider putting another cache in place which is more
> >>     >>>>>>>>>>> robust wrt changing DOLFIN objects.
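
(A minimal way to check which part dominates, assuming the entry point in
question is ufl.algorithms.preprocess; adjust the import if it lives
elsewhere. 'a' is assumed to be some UFL form defined in the current
namespace.)

    import cProfile
    from ufl.algorithms import preprocess

    # Print a cumulative-time profile of a single preprocess call
    cProfile.run("preprocess(a)", sort="cumulative")
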
> >>     >>>>>>>>>>
> >>     >>>>>>>>>> It should be easy to avoid the overhead of preprocessing
> >>     >>>>>>>>>> by keeping the object in scope. If the object changes,
> >>     >>>>>>>>>> the only robust way to make sure that the form is the
> >>     >>>>>>>>>> same as one in the cache is to compare all the data. This
> >>     >>>>>>>>>> requires preprocessing the form, which then defeats the
> >>     >>>>>>>>>> purpose of a cache. It may be possible to add a
> >>     >>>>>>>>>> lightweight preprocess to UFL, but I don't think that
> >>     >>>>>>>>>> it's worth the effort or extra complication.
> >>     >>>>>>>
> >>     >>>>>>> I think a lightweight version might be the way to go. This
> >>     >>>>>>> is then stored in the memory cache. If we are able to strip
> >>     >>>>>>> such a form of all DOLFIN-specific things, we would also
> >>     >>>>>>> prevent huge memory leaks from the mesh being kept alive.
> >>     >>>>>>>
> >>     >>>>>>> Then we always grab DOLFIN-specific data from the passed
> >>     >>>>>>> form instead of grabbing it from the cache. Not sure how
> >>     >>>>>>> easy this will be to implement, but I think we need to
> >>     >>>>>>> explore it, as the DOLFIN-specific part of the form really
> >>     >>>>>>> has nothing to do with the generated Form.
> >>     >>>>>>>
> >>     >>>>>>> Martin:
> >>     >>>>>>> Why is it important to have the _count in the repr of the
> >>     >>>>>>> form? I guess that is used in UFL algorithms? Would it be
> >>     >>>>>>> possible to include a second repr function which does not
> >>     >>>>>>> include the count? This would then be used when the
> >>     >>>>>>> signature is checked. We could then use that repr to
> >>     >>>>>>> generate a form which is stored in the memory cache. This
> >>     >>>>>>> form would then be stripped of any DOLFIN-specific objects.
> >>     >>>>>>> This should work, as the _count attribute has nothing to do
> >>     >>>>>>> with what code gets generated, but it is essential for
> >>     >>>>>>> internal UFL algorithms, right?
> >>     >>>>>>>
> >>     >>>>>>>>> I'm not very happy with this change.
> >>     >>>>>>>>
> >>     >>>>>>>> The bright side is that slow and correct is a better
> >>     >>>>>>>> starting point than fast but wrong ;).
> >>     >>>>>>>>
> >>     >>>>>>>> An easy fix is to attach the preprocessed form to a Form
> >>     >>>>>>>> object. This would work robustly if we can make forms
> >>     >>>>>>>> immutable once they've been compiled. Is it possible to
> >>     >>>>>>>> make a Python object immutable?
> >>     >>>>>>>
> >>     >>>>>>> We can probably overload the __setattr__ methods to prevent
> >>     >>>>>>> a user from writing to these attributes, but it might not
> >>     >>>>>>> be possible to prevent a user from changing attributes on
> >>     >>>>>>> instances owned by the Form. I guess this is similar to the
> >>     >>>>>>> difficulties of preserving constness in C++, but I think it
> >>     >>>>>>> is even harder in Python.
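
(A minimal sketch of the __setattr__-overloading idea, with a hypothetical
class name; it also illustrates the limitation mentioned above.)

    class FrozenForm(object):
        """Hypothetical example: block attribute writes after construction."""

        def __init__(self, integrals):
            # Bypass our own guard while the object is being built
            object.__setattr__(self, "integrals", integrals)
            object.__setattr__(self, "_frozen", True)

        def __setattr__(self, name, value):
            if getattr(self, "_frozen", False):
                raise AttributeError("form is immutable once compiled")
            object.__setattr__(self, name, value)

    # Note: only direct attribute assignment is blocked; objects owned by
    # the form (meshes, coefficients, ...) can still be mutated.
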
> >>     >>>>>>
> >>     >>>>>> What if we have the FFC jit compiler return the preprocessed
> >>     >>>>>> form, and inside dolfin.Form simply do
> >>     >>>>>>
> >>     >>>>>>     class Form(cpp.Form):
> >>     >>>>>>         def __init__(self, form, . . .. )
> >>     >>>>>>         ....
> >>     >>>>>>
> >>     >>>>>>         (...., preprocessed_form) = jit(form, . . . . )
> >>     >>>>>>
> >>     >>>>>>         form = preprocessed_form
> >>     >>>>>>
> >>     >>>>>>         .....
> >>     >>>>>>
> >>     >>>>>> This way, form will have form_data, and the FFC jit function
> >>     >>>>>> will know not to call ufl.preprocess.
> >>     >>>>>
> >>     >>>>> Here's another strange thing. In the JITObject class, we have
> >>     >>>>> two functions: __hash__ and signature. As far as I understand,
> >>     >>>>> the first is used to locate objects (generated code/modules)
> >>     >>>>> in the Instant in-memory cache, while the second is used for
> >>     >>>>> the on-disk cache.
> >>     >>>>>
> >>     >>>>> From some simple tests I did just now, it looks like the
> >>     >>>>> __hash__ function does not lead to any significant speedup.
> >>     >>>>> The JIT benchmark runs just as fast if I call signature from
> >>     >>>>> within __hash__.
> >>     >>>>>
> >>     >>>>> Furthermore, the __hash__ function must also be broken, since
> >>     >>>>> it relies on calling id on the form.
> >>     >>>>>
> >>     >>>>> Ideally, we should get Instant to handle the caching, both
> >>     >>>>> in-memory and on-disk, by providing two functions: __hash__
> >>     >>>>> (fast, for the in-memory cache) and signature (slow, for the
> >>     >>>>> on-disk cache).
> >>     >>>>>
> >>     >>>>> Since __hash__ cannot call id, it must be possible to attach
> >>     >>>>> a unique string to the form (perhaps based on an internal
> >>     >>>>> counter in FFC). My suggestion would be to add this to UFL,
> >>     >>>>> something like set_hash and hash (which would return None if
> >>     >>>>> set_hash has not been called). If Martin does not like that,
> >>     >>>>> we should be able to handle it on the DOLFIN side.
> >>     >>>>>
> >>     >>>>> So in conclusion: no in-memory cache in FFC (this is handled
> >>     >>>>> by Instant), and FFC attaches a hash to incoming forms so that
> >>     >>>>> Instant may recognize them later.
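
(A sketch of the proposed set_hash/hash pair and a counter-based hash on
the FFC side. This is not existing UFL or FFC API, just an illustration of
the suggestion above.)

    import itertools

    class HashAttachable(object):
        # Mixin-style illustration of the proposed protocol
        _attached_hash = None

        def set_hash(self, value):
            # A unique string assigned by the form compiler the first time
            # the form is seen
            self._attached_hash = value

        def hash(self):
            # None until set_hash has been called
            return self._attached_hash

    _form_counter = itertools.count(1)

    def attach_hash(form):
        # Form compiler side (sketch): attach a counter-based hash
        if form.hash() is None:
            form.set_hash("ffc_form_%d" % next(_form_counter))
        return form.hash()
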
> >>     >>>>
> >>     >>>> The code that I disabled was caching preprocessed forms, so I
> >>     >>>> don't see how this can be handled by Instant.
> >>     >>>>
> >>     >>>> Garth
> >>     >>>>
> >>     >>>>> Maybe even better: Instant checks whether an incoming object
> >>     >>>>> has a set_hash function and, if so, calls it so that it can
> >>     >>>>> recognize objects it sees a second time.
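
(A sketch of that duck-typed check; hypothetical, not Instant's actual
code. It falls back to the slow signature for objects without set_hash.)

    import itertools

    _key_counter = itertools.count(1)

    def in_memory_cache_key(obj):
        # Fast path: objects supporting the proposed set_hash/hash protocol
        # get a key assigned on first sight and reuse it on later calls
        if hasattr(obj, "set_hash"):
            if obj.hash() is None:
                obj.set_hash("instant_obj_%d" % next(_key_counter))
            return obj.hash()
        # Otherwise fall back to the (slow) full signature
        return obj.signature()
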
> >>     >>>>>
> >>     >>>>> I'm moving this discussion to the mailing list(s).
> >>     >>>>
> >>
> >>
> >
> > _______________________________________________
> > Mailing list: https://launchpad.net/~ufl
> > Post to     : ufl@xxxxxxxxxxxxxxxxxxx
> > Unsubscribe : https://launchpad.net/~ufl
> > More help   : https://help.launchpad.net/ListHelp
>
