ffc team mailing list archive

Re: [Ufl] [Bug 769811] [NEW] JIT cache problem with id(form)

On 26 April 2011 10:56, Garth N. Wells <gnw20@xxxxxxxxx> wrote:

>
>
> On 26/04/11 09:03, Martin Sandve Alnæs wrote:
> > > See other mail. I don't see that it solves anything, it doesn't seem
> > > related to anything I've read about in this thread, and it has a
> > > potential downside in hindering the garbage collector in Python. I may
> > > be wrong, but nobody has answered my other questions in this thread yet.
> >
>
> As a precursor, the primary problem has nothing to do with the Instant
> disk cache, etc. The Instant discussion is just confusing the original
> point.
>
> In summary, it would be helpful if DOLFIN could avoid calling
> ufl.preprocess every time a dolfin.Form object is created. DOLFIN relies
> on preprocessing to extract the form Arguments, from which the mesh is
> obtained (via form_data().original_arguments; this works because DOLFIN's
> 'Arguments' are subclasses of both UFL and DOLFIN objects).
>
> The solution that Johan has implemented is to have FFC attach the
> form_data to a form. If a form has form_data attached, then we know that
> it has already been preprocessed. Martin won't like this because it's
> changing the form object.
>

This sounds much like my original design. Trying to recall from my
possibly rusty memory, I believe that calling myform.form_data() would
construct the form data only the first time, and the preprocessed form
could be retrieved from the returned form data. The form data was
attached as myform._form_data. Thus you could always say

    preprocessed_form = myform.form_data().form

and preprocessing would only happen once. This was redesigned
after I left to have a separate preprocess function.
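
Roughly, the old pattern looked like this (a minimal sketch from memory,
not the actual code; FormData here is just a stand-in for the object that
held the preprocessed form and the extracted data):

    class FormData(object):
        # Stand-in for the object holding the preprocessed form and the
        # data extracted from it (arguments, coefficients, ...).
        def __init__(self, form):
            self.form = form  # in the real code: the preprocessed form

    class Form(object):
        def __init__(self, integrals):
            self._integrals = integrals
            self._form_data = None

        def form_data(self):
            # Construct the form data only on the first call; later calls
            # return the cached object, so preprocessing happens once.
            if self._form_data is None:
                self._form_data = FormData(self)
            return self._form_data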


> It may be enough if UFL provided a function to return a list of
> form Arguments, if this is fast. Something like
>
>  def extract_original_arguments(form):
>
>      # Replace arguments and coefficients with new renumbered objects
>      arguments, coefficients = extract_arguments_and_coefficients(form)
>      replace_map, arguments, coefficients \
>            = build_argument_replace_map(arguments, coefficients)
>      form = replace(form, replace_map)
>
>      # Build mapping to original arguments and coefficients, which is
>      # useful if the original arguments have data attached to them
>      inv_replace_map = {}
>      for v, w in replace_map.iteritems():
>          inv_replace_map[w] = v
>      original_arguments = [inv_replace_map[v] for v in arguments]
>
>      return original_arguments
>
> Garth


I don't understand why this is needed. We:
- must preprocess each form once
- don't want to preprocess the same form twice
- can obtain the original arguments after preprocessing
This was supported a long time ago, so unless someone has
removed functionality while I've been gone, what is the problem?
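
For the record, getting the original arguments (and from there the mesh on
the DOLFIN side) should be no more than something like the sketch below;
the accessors function_space() and mesh() on the DOLFIN side are
assumptions on my part:

    def original_arguments(form):
        # Preprocess at most once (assuming form_data() is cached on the
        # form) and read back the original, un-renumbered Arguments.
        return form.form_data().original_arguments

    def extract_mesh(form):
        # DOLFIN's Arguments subclass both UFL and DOLFIN objects, so the
        # mesh can be pulled from the first original Argument (accessor
        # names are assumptions here).
        u = original_arguments(form)[0]
        return u.function_space().mesh()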

I have a feeling that the source of many problems is the attempt
to reuse forms and change mesh, functions, or elements.
This is contrary to the design of UFL, where expressions are immutable.
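
To make that concrete, the intended pattern when a coefficient changes is
to build a new form rather than mutate the old one, e.g. with replace (a
minimal sketch; the element and coefficient choices here are arbitrary):

    from ufl import (Coefficient, FiniteElement, TestFunction,
                     TrialFunction, dx, inner, triangle)
    from ufl.algorithms import replace

    element = FiniteElement("Lagrange", triangle, 1)
    u = TrialFunction(element)
    v = TestFunction(element)
    f = Coefficient(element)
    a = inner(f * u, v) * dx

    # When f changes, construct a new form instead of mutating a; the old
    # form and any form data cached on it remain valid.
    g = Coefficient(element)
    a_new = replace(a, {f: g})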

Martin


> > Martin
> >
> > On 26 April 2011 09:20, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
> >
> >     Martin: Any problem if we apply this patch to UFL?
> >
> >     Garth
> >
> >     On 25/04/11 22:50, Johan Hake wrote:
> >     > This should be fixed now.
> >     >
> >     > I do not see why we introduced the memory cache when this solution
> >     > was lying right in front of our eyes...
> >     >
> >     > Anyhow. Here is a patch for ufl to avoid a circular dependency
> >     > between a preprocessed form and the form_data.
> >     >
> >     > Johan
> >     >
> >     > On Monday April 25 2011 14:34:00 Anders Logg wrote:
> >     >> Simple sounds good.
> >     >>
> >     >> --
> >     >> Anders
> >     >>
> >     >> On Mon, Apr 25, 2011 at 02:29:50PM -0700, Johan Hake wrote:
> >     >>> I am working on a simple solution, where we store everything in
> >     >>> the original ufl form.
> >     >>>
> >     >>> I might have something soon.
> >     >>>
> >     >>> Johan
> >     >>>
> >     >>> On Monday April 25 2011 14:26:18 Garth N. Wells wrote:
> >     >>>> On 25/04/11 22:08, Anders Logg wrote:
> >     >>>>> On Mon, Apr 25, 2011 at 07:40:21PM -0000, Garth Wells wrote:
> >     >>>>>> On 25/04/11 20:00, Johan Hake wrote:
> >     >>>>>>> On Monday April 25 2011 11:26:36 Garth Wells wrote:
> >     >>>>>>>> On 25/04/11 18:51, Anders Logg wrote:
> >     >>>>>>>>> On Mon, Apr 25, 2011 at 05:11:41PM -0000, Garth Wells wrote:
> >     >>>>>>>>>> On 25/04/11 17:53, Johan Hake wrote:
> >     >>>>>>>>>>> On Monday April 25 2011 08:59:18 Garth Wells wrote:
> >     >>>>>>>>>>>> On 25/04/11 16:47, Johan Hake wrote:
> >     >>>>>>>>>>>>> Commenting out the cache is really not a fix. The
> >     >>>>>>>>>>>>> problem is within dolfin. Isn't there another way to
> >     >>>>>>>>>>>>> deal with this?
> >     >>>>>>>>>>>>
> >     >>>>>>>>>>>> It is a fix if the cache isn't needed.
> >     >>>>>>>>>>>
> >     >>>>>>>>>>> Sure.
> >     >>>>>>>>>>>
> >     >>>>>>>>>>>>> First: How much of a penalty is there with a disabled
> >     >>>>>>>>>>>>> memory cache? Maybe the problem isn't that bad?
> >     >>>>>>>>>>>>
> >     >>>>>>>>>>>> I don't get the point of this cache. The way it is now,
> >     >>>>>>>>>>>> a form is only preprocessed if it hasn't already been
> >     >>>>>>>>>>>> preprocessed, which seems ok to me. The old code tried
> >     >>>>>>>>>>>> to avoid some preprocessing, but it was highly dubious
> >     >>>>>>>>>>>> and I doubt that it was effective.
> >     >>>>>>>>>>>
> >     >>>>>>>>>>> I think the preprocessing stage actually does take some
> >     >>>>>>>>>>> time. AFAIK the preprocessing stage essentially does two
> >     >>>>>>>>>>> things. It creates a canonical version of the Form, so
> >     >>>>>>>>>>> two Forms that are the same but constructed at different
> >     >>>>>>>>>>> times are treated as equal wrt form generation. Then the
> >     >>>>>>>>>>> DOLFIN-specific data is extracted. I am not sure what
> >     >>>>>>>>>>> takes the most time. We should probably profile it...
> >     >>>>>>>>>>> But if it is the latter, we could consider putting
> >     >>>>>>>>>>> another cache in place which is more robust wrt changing
> >     >>>>>>>>>>> DOLFIN objects.
> >     >>>>>>>>>>
> >     >>>>>>>>>> It should be easy to avoid the overhead of preprocessing
> >     >>>>>>>>>> by keeping the object in scope. If the object changes,
> >     >>>>>>>>>> the only robust way to make sure that the form is the
> >     >>>>>>>>>> same as one in the cache is to compare all the data. This
> >     >>>>>>>>>> requires preprocessing the form, which then defeats the
> >     >>>>>>>>>> purpose of a cache. It may be possible to add a
> >     >>>>>>>>>> lightweight preprocess to UFL, but I don't think that
> >     >>>>>>>>>> it's worth the effort or extra complication.
> >     >>>>>>>
> >     >>>>>>> I think a lightweight version might be the way to go. This
> >     >>>>>>> is then stored in the memory cache. If we are able to strip
> >     >>>>>>> such a form of all DOLFIN-specific things, we would also
> >     >>>>>>> prevent huge memory leaks from the mesh being kept alive.
> >     >>>>>>>
> >     >>>>>>> Then we always grab the DOLFIN-specific data from the passed
> >     >>>>>>> form instead of grabbing it from the cache. Not sure how
> >     >>>>>>> easy this will be to implement, but I think we need to
> >     >>>>>>> explore it, as the DOLFIN-specific part of the form really
> >     >>>>>>> has nothing to do with the generated Form.
> >     >>>>>>>
> >     >>>>>>> Martin:
> >     >>>>>>> Why is it important to have the _count in the repr of the
> >     >>>>>>> form? I guess that is used in the ufl algorithms? Would it
> >     >>>>>>> be possible to include a second repr function which does not
> >     >>>>>>> include the count? This would then be used when the
> >     >>>>>>> signature is checked. We could then use that repr to
> >     >>>>>>> generate a form which is stored in the memory cache. This
> >     >>>>>>> form would then be stripped of any DOLFIN-specific objects.
> >     >>>>>>> This should work, as the _count attribute has nothing to do
> >     >>>>>>> with what code gets generated, but it is essential for
> >     >>>>>>> internal UFL algorithms, right?
> >     >>>>>>>
> >     >>>>>>>>> I'm not very happy with this change.
> >     >>>>>>>>
> >     >>>>>>>> The bright side is that slow and correct is a better
> >     >>>>>>>> starting point than fast but wrong ;).
> >     >>>>>>>>
> >     >>>>>>>> An easy fix is to attach the preprocessed form to a Form
> >     >>>>>>>> object. This would work robustly if we can make forms
> >     >>>>>>>> immutable once they've been compiled. Is it possible to
> >     >>>>>>>> make a Python object immutable?
> >     >>>>>>>
> >     >>>>>>> We can probably overload the __setattr__ methods to prevent
> >     >>>>>>> a user from writing to these, but it might not be possible
> >     >>>>>>> to prevent a user from changing attributes on instances
> >     >>>>>>> owned by the Form. I guess this is similar to the difficulty
> >     >>>>>>> of preserving constness in C++, but I think it is even
> >     >>>>>>> harder in Python.
> >     >>>>>>
> >     >>>>>> What if we have the FFC jit compiler return the preprocessed
> >     >>>>>> form, and inside dolfin.Form simply do
> >     >>>>>>
> >     >>>>>>     class Form(cpp.Form):
> >     >>>>>>         def __init__(self, form, ...):
> >     >>>>>>             ...
> >     >>>>>>             (..., preprocessed_form) = jit(form, ...)
> >     >>>>>>             form = preprocessed_form
> >     >>>>>>             ...
> >     >>>>>>
> >     >>>>>> This way, form will have form_data, and the FFC jit function
> >     >>>>>> will know not to call ufl.preprocess.
> >     >>>>>
> >     >>>>> Here's another strange thing. In the JITObject class, we have
> >     >>>>> two functions: __hash__ and signature. As far as I understand,
> >     >>>>> the first is used to locate objects (generated code/modules)
> >     >>>>> in the Instant in-memory cache, while the second is used for
> >     >>>>> the on-disk cache.
> >     >>>>>
> >     >>>>> From some simple tests I did now, it looks like the __hash__
> >     >>>>> function does not give any significant speedup. The JIT
> >     >>>>> benchmark runs just as fast if I call signature from within
> >     >>>>> __hash__.
> >     >>>>>
> >     >>>>> Furthermore, the __hash__ function must also be broken since
> >     >>>>> it relies on calling id on the form.
> >     >>>>>
> >     >>>>> Ideally, we should get Instant to handle the caching, both
> >     >>>>> in-memory and on-disk, by providing two functions: __hash__
> >     >>>>> (fast, for the in-memory cache) and signature (slow, for the
> >     >>>>> on-disk cache).
> >     >>>>>
> >     >>>>> Since __hash__ cannot call id, it must be able to attach a
> >     >>>>> unique string to the form (perhaps based on an internal
> >     >>>>> counter in FFC). My suggestion would be to add this to UFL,
> >     >>>>> something like set_hash and hash (which would return None if
> >     >>>>> set_hash has not been called). If Martin does not like that,
> >     >>>>> we should be able to handle it on the DOLFIN side.
> >     >>>>>
> >     >>>>> So in conclusion: no in-memory cache in FFC (handled by
> >     >>>>> Instant) and FFC attaches a hash to incoming forms so that
> >     >>>>> Instant may recognize them later.
> >     >>>>
> >     >>>> The code that I disabled was caching preprocessed forms, so I
> >     >>>> don't see how this can be handled by Instant.
> >     >>>>
> >     >>>> Garth
> >     >>>>
> >     >>>>> Maybe even better: Instant checks whether an incoming object
> >     >>>>> has a set_hash function and, if so, calls it so that it can
> >     >>>>> recognize objects it sees a second time.
> >     >>>>>
> >     >>>>> I'm moving this discussion to the mailing list(s).
