ffc team mailing list archive
Message #04129
Re: [Ufl] [Bug 769811] [NEW] JIT cache problem with id(form)
On 26/04/11 12:22, Martin Sandve Alnæs wrote:
> On 26 April 2011 10:56, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
>
>
>
> On 26/04/11 09:03, Martin Sandve Alnæs wrote:
> > See other mail. I don't see that it solves anything, it doesn't seem
> > related to anything I've read about in this thread, and it has a
> > potential downside in hindering the garbage collector in Python. I may
> > be wrong, but nobody has answered my other questions in this thread yet.
> >
>
> As a precursor, the primary problem has nothing to do with Instant disk
> cache, etc. The Instant discussion is just confusing the original point.
>
> In summary, it would be helpful if DOLFIN could avoid calling
> ufl.preprocess every time a dolfin.Form object is created. DOLFIN relies
> on preprocessing to extract the form Arguments, from which the mesh is
> extracted (via form_data().original_arguments, since DOLFIN uses
> Arguments that are subclasses of both UFL and DOLFIN objects).
>
> The solution that Johan has implemented is to have FFC attach the
> form_data to a form. If a form has form_data attached, then we know that
> it has already been preprocessed. Martin won't like this because it's
> changing the form object.
>
>
> This sounds much like my original design. Trying to recall from my possibly
> rusty memory, I believe that calling myform.form_data() would
> construct form data only the first time and the preprocessed form could
> be retrieved from the returned form data. The form data was attached
> as myform._form_data. Thus you could always say
> preprocessed_form = myform.form_data().form
> and preprocessing would only happen once.
I think that the above would solve the issue. At the moment ufl.Form has
the member function:
    def form_data(self):
        "Return form metadata (None if form has not been preprocessed)"
        return self._form_data
If it did
    def form_data(self):
        if self._form_data is None:
            # compute form_data
        return self._form_data
it should make things straightforward. But doesn't this violate
immutability of the form, or is it ok since the mathematical form itself
is not being modified?
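For illustration, here is a minimal standalone sketch of the lazy-caching
pattern (not the actual UFL code; the _compute_form_data helper below is
just a stand-in for whatever the real preprocessing does). Only the
private _form_data cache is filled in on first access; nothing that enters
the form's repr/signature changes, so in that sense the mathematical form
stays untouched:

    class LazyForm(object):
        "Toy stand-in for ufl.Form illustrating lazy form_data caching."

        def __init__(self, integrals):
            self._integrals = tuple(integrals)  # the mathematical content
            self._form_data = None              # private cache, not in repr

        def __repr__(self):
            # The repr/signature depends only on the mathematical content,
            # so caching the form data does not change it.
            return "LazyForm(%r)" % (self._integrals,)

        def form_data(self):
            "Compute the form data on the first call, reuse it afterwards."
            if self._form_data is None:
                self._form_data = self._compute_form_data()  # expensive step
            return self._form_data

        def _compute_form_data(self):
            # Stand-in for the expensive preprocessing (renumbering,
            # argument extraction, etc.).
            return {"original_arguments": (), "preprocessed": True}

    f = LazyForm(["dot(grad(u), grad(v))*dx"])
    r0 = repr(f)
    fd1 = f.form_data()  # computed on the first call
    fd2 = f.form_data()  # returned from the cache, same object
    assert fd1 is fd2
    assert repr(f) == r0  # caching did not change the repr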
Garth
> This was redesigned
> after I left to have a separate preprocess function.
>
>
> It may be enough if UFL provided a function to return a list of
> form Arguments, if this is fast. Something like
>
> def extract_original_arguments(form):
>
>     # Replace arguments and coefficients with new renumbered objects
>     arguments, coefficients = extract_arguments_and_coefficients(form)
>     replace_map, arguments, coefficients \
>         = build_argument_replace_map(arguments, coefficients)
>     form = replace(form, replace_map)
>
>     # Build mapping to original arguments and coefficients, which is
>     # useful if the original arguments have data attached to them
>     inv_replace_map = {}
>     for v, w in replace_map.iteritems():
>         inv_replace_map[w] = v
>     original_arguments = [inv_replace_map[v] for v in arguments]
>
>     return original_arguments
>
> Garth
>
>
> I don't understand why this is needed. We:
> - must preprocess each form once
> - don't want to preprocess the same form twice
> - can obtain the original arguments after preprocessing
> This was supported a long time ago, so unless someone has
> removed functionality while I've been gone, what is the problem?
>
> I have a feeling that the source of many problems is the attempt
> to reuse forms and change mesh, functions, or elements.
> This is contrary to the design of UFL where expressions are immutable.
>
> Martin
>
>
> > Martin
> >
> > On 26 April 2011 09:20, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
> >
> > Martin: Any problem if we apply this patch to UFL?
> >
> > Garth
> >
> > On 25/04/11 22:50, Johan Hake wrote:
> > > This should be fixed now.
> > >
> > > I do not see why we introduced the memory cache when this solution
> > > was lying right in front of our eyes...
> > >
> > > Anyhow. Here is a patch for ufl to avoid circular dependency between
> > > a preprocessed form and the form_data.
> > >
> > > Johan
> > >
> > > On Monday April 25 2011 14:34:00 Anders Logg wrote:
> > >> Simple sounds good.
> > >>
> > >> --
> > >> Anders
> > >>
> > >> On Mon, Apr 25, 2011 at 02:29:50PM -0700, Johan Hake wrote:
> > >>> I am working on a simple solution, where we store
> everything in the
> > >>> original ufl form.
> > >>>
> > >>> I might have something soon.
> > >>>
> > >>> Johan
> > >>>
> > >>> On Monday April 25 2011 14:26:18 Garth N. Wells wrote:
> > >>>> On 25/04/11 22:08, Anders Logg wrote:
> > >>>>> On Mon, Apr 25, 2011 at 07:40:21PM -0000, Garth Wells wrote:
> > >>>>>> On 25/04/11 20:00, Johan Hake wrote:
> > >>>>>>> On Monday April 25 2011 11:26:36 Garth Wells wrote:
> > >>>>>>>> On 25/04/11 18:51, Anders Logg wrote:
> > >>>>>>>>> On Mon, Apr 25, 2011 at 05:11:41PM -0000, Garth Wells wrote:
> > >>>>>>>>>> On 25/04/11 17:53, Johan Hake wrote:
> > >>>>>>>>>>> On Monday April 25 2011 08:59:18 Garth Wells wrote:
> > >>>>>>>>>>>> On 25/04/11 16:47, Johan Hake wrote:
> > >>>>>>>>>>>>> Commenting out the cache is really not a fix. The problem
> > >>>>>>>>>>>>> is within dolfin. Isn't there another way to deal with this?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> It is a fix if the cache isn't needed.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Sure.
> > >>>>>>>>>>>
> > >>>>>>>>>>>>> First: how much of a penalty is there with a disabled
> > >>>>>>>>>>>>> memory cache? Maybe the problem isn't that bad?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I don't get the point of this cache. The way it is now, a
> > >>>>>>>>>>>> form is only preprocessed if it hasn't already been
> > >>>>>>>>>>>> preprocessed, which seems ok to me. The old code tried to
> > >>>>>>>>>>>> avoid some preprocessing, but it was highly dubious and I
> > >>>>>>>>>>>> doubt that it was effective.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I think the preprocessing stage actually does take some
> > >>>>>>>>>>> time. AFAIK the preprocessing stage essentially does two
> > >>>>>>>>>>> things. It creates a canonical version of the Form so that
> > >>>>>>>>>>> two Forms that are the same, but constructed at different
> > >>>>>>>>>>> times, are treated as equal wrt form generation. Then the
> > >>>>>>>>>>> DOLFIN-specific objects are extracted. I am not sure what
> > >>>>>>>>>>> takes the most time. We should probably profile it... But if
> > >>>>>>>>>>> it is the latter, we could consider putting another cache in
> > >>>>>>>>>>> place which is more robust wrt changing DOLFIN objects.
> > >>>>>>>>>>
> > >>>>>>>>>> It should be easy to avoid the overhead of preprocessing by
> > >>>>>>>>>> keeping the object in scope. If the object changes, the only
> > >>>>>>>>>> robust way to make sure that the form is the same as one in
> > >>>>>>>>>> the cache is to compare all the data. This requires
> > >>>>>>>>>> preprocessing the form, which then defeats the purpose of a
> > >>>>>>>>>> cache. It may be possible to add a lightweight preprocess to
> > >>>>>>>>>> UFL, but I don't think that it's worth the effort or extra
> > >>>>>>>>>> complication.
> > >>>>>>>
> > >>>>>>> I think a lightweight version might be the way to go. This is
> > >>>>>>> then stored in the memory cache. If we are able to strip such a
> > >>>>>>> form of all DOLFIN-specific things, we would also prevent huge
> > >>>>>>> memory leaks with the mesh being kept alive.
> > >>>>>>>
> > >>>>>>> Then we always grab DOLFIN-specific data from the passed form
> > >>>>>>> instead of grabbing it from the cache. Not sure how easy this
> > >>>>>>> will be to implement, but I think we need to explore it, as the
> > >>>>>>> DOLFIN-specific part of the form really has nothing to do with
> > >>>>>>> the generated Form.
> > >>>>>>>
> > >>>>>>> Martin:
> > >>>>>>> Why is it important to have the _count in the repr of the form?
> > >>>>>>> I guess that is used in UFL algorithms? Would it be possible to
> > >>>>>>> include a second repr function which did not include the count?
> > >>>>>>> This would then be used when the signature is checked. We could
> > >>>>>>> then use that repr to generate a form which is stored in the
> > >>>>>>> memory cache. This would then be stripped of any DOLFIN-specific
> > >>>>>>> objects. This should work, as the _count attribute has nothing
> > >>>>>>> to do with what code gets generated, but it is essential for
> > >>>>>>> internal UFL algorithms, right?
> > >>>>>>>
> > >>>>>>>>> I'm not very happy with this change.
> > >>>>>>>>
> > >>>>>>>> The bright side is that slow and correct is a better starting
> > >>>>>>>> point than fast but wrong ;).
> > >>>>>>>>
> > >>>>>>>> An easy fix is to attach the preprocessed form to a Form
> > >>>>>>>> object. This would work robustly if we can make forms immutable
> > >>>>>>>> once they've been compiled. Is it possible to make a Python
> > >>>>>>>> object immutable?
> > >>>>>>>
> > >>>>>>> We can probably overload the __setattr__ methods, which would
> > >>>>>>> prevent a user from writing to these, but it might not be
> > >>>>>>> possible to prevent a user from changing attributes on instances
> > >>>>>>> owned by the Form. I guess this is similar to the difficulties
> > >>>>>>> of preserving constness in C++, but I think it is even harder in
> > >>>>>>> Python.
> > >>>>>>
> > >>>>>> What if we have the FFC jit compiler return the preprocessed
> > >>>>>> form, and inside dolfin.Form simply do
> > >>>>>>
> > >>>>>>     class Form(cpp.Form):
> > >>>>>>         def __init__(self, form, ...):
> > >>>>>>             ...
> > >>>>>>             (..., preprocessed_form) = jit(form, ...)
> > >>>>>>             form = preprocessed_form
> > >>>>>>             ...
> > >>>>>>
> > >>>>>> This way, form will have form_data, and the FFC jit function will
> > >>>>>> know not to call ufl.preprocess.
> > >>>>>
> > >>>>> Here's another strange thing. In the JITObject class, we have two
> > >>>>> functions: __hash__ and signature. As far as I understand, the
> > >>>>> first is used to locate objects (generated code/modules) in the
> > >>>>> Instant in-memory cache, while the second is used for the on-disk
> > >>>>> cache.
> > >>>>>
> > >>>>> From some simple tests I did now, it looks like the __hash__
> > >>>>> function does not provide any significant speedup. The JIT
> > >>>>> benchmark runs just as fast if I call signature from within
> > >>>>> __hash__.
> > >>>>>
> > >>>>> Furthermore, the __hash__ function must also be broken since it
> > >>>>> relies on calling id on the form.
> > >>>>>
> > >>>>> Ideally, we should get Instant to handle the caching, both
> > >>>>> in-memory and on-disk, by providing two functions: __hash__ (fast,
> > >>>>> for the in-memory cache) and signature (slow, for the on-disk
> > >>>>> cache).
> > >>>>>
> > >>>>> Since __hash__ cannot call id, it must be able to attach a unique
> > >>>>> string to the form (perhaps based on an internal counter in FFC).
> > >>>>> My suggestion would be to add this to UFL, something like set_hash
> > >>>>> and hash (which would return None if set_hash has not been
> > >>>>> called). If Martin does not like that, we should be able to handle
> > >>>>> it on the DOLFIN side.
> > >>>>>
> > >>>>> So in conclusion: no in-memory cache in FFC (handled by Instant)
> > >>>>> and FFC attaches a hash to incoming forms so that Instant may
> > >>>>> recognize them later.
> > >>>>
> > >>>> The code that I disabled was caching preprocessed forms, so I
> > >>>> don't see how this can be handled by Instant.
> > >>>>
> > >>>> Garth
> > >>>>
> > >>>>> Maybe even better: Instant checks whether an incoming object has
> > >>>>> a set_hash function and, if so, calls it so it can recognize
> > >>>>> objects it sees a second time.
> > >>>>>
> > >>>>> I'm moving this discussion to the mailing list(s).
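
As a purely illustrative aside (not existing Instant, FFC, or UFL code):
the set_hash/hash protocol sketched in the quoted proposal above could be
used by an in-memory cache roughly like this, so that id() is never
needed; HashableForm and jit_with_memory_cache are hypothetical names:

    class HashableForm(object):
        "Tiny stub with the proposed set_hash/hash interface."

        def __init__(self):
            self._hash = None

        def set_hash(self, h):
            self._hash = h

        def hash(self):
            # Returns None until set_hash has been called.
            return self._hash

    _counter = [0]
    _memory_cache = {}

    def jit_with_memory_cache(form, compile_form):
        # Attach a unique hash string the first time this object is seen,
        # e.g. based on an internal counter, so id() is never needed.
        if form.hash() is None:
            _counter[0] += 1
            form.set_hash("form_%d" % _counter[0])
        key = form.hash()
        if key in _memory_cache:
            # Fast path: object seen before, reuse the compiled module.
            return _memory_cache[key]
        # Slow path: compile (the on-disk cache would use the signature).
        _memory_cache[key] = compile_form(form)
        return _memory_cache[key]

    f = HashableForm()
    m1 = jit_with_memory_cache(f, lambda form: ["compiled module"])
    m2 = jit_with_memory_cache(f, lambda form: ["compiled module"])
    assert m1 is m2  # the second call hits the in-memory cache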