← Back to team overview

ffc team mailing list archive

Re: [Ufl] [Bug 769811] [NEW] JIT cache problem with id(form)

 

It feels good that you trust me enough to handle it. ;-)

Will add it sometime this afternoon and then we can revisit the JIT
compiler caching.

--
Anders


On Tue, Apr 26, 2011 at 01:57:40PM +0200, Martin Sandve Alnæs wrote:
> Feel free to do it :)
>
> Martin
>
>
> On 26 April 2011 13:55, Anders Logg <logg@xxxxxxxxx> wrote:
>
>     So will you add it back? I can do it since it's my fault removing it,
>     but I assume you want to handle it yourself. ;-)
>
>
>
>     On Tue, Apr 26, 2011 at 01:46:59PM +0200, Martin Sandve Aln s wrote:
>     > Caching is ok precicely _because_ the form is immutable.
>     >
>     > Martin
>     >
>     > On 26 April 2011 13:43, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
>     >
>     >
>     >
>     >     On 26/04/11 12:22, Martin Sandve Aln s wrote:
>     >     > On 26 April 2011 10:56, Garth N. Wells <gnw20@xxxxxxxxx
>     >     > <mailto:gnw20@xxxxxxxxx>> wrote:
>     >     >
>     >     >
>     >     >
>     >     >     On 26/04/11 09:03, Martin Sandve Aln s wrote:
>     >     >     > See other mail. I don't see that it solves anything, it
>     doesn't
>     >     seem
>     >     >     > related to anything I've read about in this thread, and it
>     has a
>     >     >     > potential backside in hindering the garbage collector in
>     Python. I
>     >     may
>     >     >     > be wrong, but nobody has answered my other questions about
>     this
>     >     >     thread yet.
>     >     >     >
>     >     >
>     >     >     As a precursor, the primary problem has nothing to do with
>     Instant
>     >     disk
>     >     >     cache, etc. The Instant discussion is just confusing the
>     original
>     >     point.
>     >     >
>     >     >     In summary, is it helpful if DOLFIN can avoid calling
>     ufl.preprocess
>     >     >     every time a dolfin.Form object is created. DOLFIN relies on
>     >     >     preprocessing to extract the form Arguments, from which the
>     mesh is
>     >     >     extracted (via form_data().original_arguments, and since DOLFIN
>     uses
>     >     >     'Arguments' that are subclasses of UFL and DOLFIN objects).
>     >     >
>     >     >     The solution that Johan has implemented is to have FFC attach
>     the
>     >     >     form_data to a form. If a form has form_data attached, then we
>     know
>     >     that
>     >     >     it has already been preprocessed. Martin won't like this
>     because it's
>     >     >     changing the form object.
>     >     >
>     >     >
>     >     > This sounds much like my original design. Trying to recall from my
>     >     possibly
>     >     > rusty memory, I believe that calling myform.form_data() would
>     >     > construct form data only the first time and the preprocessed form
>     could
>     >     > be retrieved from the returned form data. The form data was
>     attached
>     >     > as myform._form_data. Thus you could always say
>     >     > preprocessed_form = myform.form_data().form
>     >     > and preprocessing would only happen once.
>     >
>     >     I think that the above would solve the issue. At the moment ufl.Form
>     has
>     >     the member function:
>     >
>     >       def form_data(self):
>     >            "Return form metadata (None if form has not been preprocessed)
>     "
>     >            return self._form_data
>     >
>     >     If it did
>     >
>     >       def form_data(self):
>     >            if self._form_data is None:
>     >                # compute form_data
>     >            return self._form_data
>     >
>     >     it should make things straightforward. But doesn't this violate
>     >     immutability of the form, or is it ok since the mathematical form
>     itself
>     >     is not being modified?
>     >
>     >     Garth
>     >
>     >
>     >     > This was redesigned
>     >     > after I left to have a separate preprocess function.
>     >     >
>     >     >
>     >     >     It may be enough if UFL would provide a function to return a
>     list of
>     >     >     form Arguments, if this is fast. Something like
>     >     >
>     >     >      def extract_original_arguments(form):
>     >     >
>     >     >          # Replace arguments and coefficients with new renumbered
>     objects
>     >     >          arguments, coefficients =
>     extract_arguments_and_coefficients
>     >     (form)
>     >     >          replace_map, arguments, coefficients \
>     >     >                = build_argument_replace_map(arguments,
>     coefficients)
>     >     >          form = replace(form, replace_map)
>     >     >
>     >     >          # Build mapping to original arguments and coefficients,
>     which is
>     >     >          # useful if the original arguments have data attached to
>     them
>     >     >          inv_replace_map = {}
>     >     >          for v, w in replace_map.iteritems():
>     >     >              inv_replace_map[w] = v
>     >     >          original_arguments = [inv_replace_map[v] for v in
>     arguments]
>     >     >
>     >     >          return original_arguments
>     >     >
>     >     >     Garth
>     >     >
>     >     >
>     >     > I don't understand why this is needed. We:
>     >     > - must preprocess each form once
>     >     > - don't want to preprocess the same form twice
>     >     > - can obtain the original arguments after preprocessing
>     >     > This was supported a long time ago, so unless someone has
>     >     > removed functionality while I've been gone, what is the problem?
>     >     >
>     >     > I have a feeling that the source of many problems is the attempt
>     >     > to reuse forms and change mesh, functions, or elements.
>     >     > This is contrary to the design of UFL where expressions are
>     immutable.
>     >     >
>     >     > Martin
>     >     >
>     >     >
>     >     >     > Martin
>     >     >     >
>     >     >     > On 26 April 2011 09:20, Garth N. Wells <gnw20@xxxxxxxxx
>     >     >     <mailto:gnw20@xxxxxxxxx>
>     >     >     > <mailto:gnw20@xxxxxxxxx <mailto:gnw20@xxxxxxxxx>>> wrote:
>     >     >     >
>     >     >     >     Martin: Any problem if we apply this patch to UFL?
>     >     >     >
>     >     >     >     Garth
>     >     >     >
>     >     >     >     On 25/04/11 22:50, Johan Hake wrote:
>     >     >     >     > This should be fixed now.
>     >     >     >     >
>     >     >     >     > I do not see why we introduced the memory cache when
>     this
>     >     >     solution
>     >     >     >     was laying
>     >     >     >     > right in front our eyes...
>     >     >     >     >
>     >     >     >     > Anyhow. Here is a patch for ufl to avoid circular
>     dependency
>     >     >     between a
>     >     >     >     > preprocessed form and the form_data.
>     >     >     >     >
>     >     >     >     > Johan
>     >     >     >     >
>     >     >     >     > On Monday April 25 2011 14:34:00 Anders Logg wrote:
>     >     >     >     >> Simple sounds good.
>     >     >     >     >>
>     >     >     >     >>
>     >     >     >     >> On Mon, Apr 25, 2011 at 02:29:50PM -0700, Johan Hake
>     wrote:
>     >     >     >     >>> I am working on a simple solution, where we store
>     >     >     everything in the
>     >     >     >     >>> original ufl form.
>     >     >     >     >>>
>     >     >     >     >>> I might have something soon.
>     >     >     >     >>>
>     >     >     >     >>> Johan
>     >     >     >     >>>
>     >     >     >     >>> On Monday April 25 2011 14:26:18 Garth N. Wells
>     wrote:
>     >     >     >     >>>> On 25/04/11 22:08, Anders Logg wrote:
>     >     >     >     >>>>> On Mon, Apr 25, 2011 at 07:40:21PM -0000, Garth
>     Wells
>     >     wrote:
>     >     >     >     >>>>>> On 25/04/11 20:00, Johan Hake wrote:
>     >     >     >     >>>>>>> On Monday April 25 2011 11:26:36 Garth Wells
>     wrote:
>     >     >     >     >>>>>>>> On 25/04/11 18:51, Anders Logg wrote:
>     >     >     >     >>>>>>>>> On Mon, Apr 25, 2011 at 05:11:41PM -0000, Garth
>     >     >     Wells wrote:
>     >     >     >     >>>>>>>>>> On 25/04/11 17:53, Johan Hake wrote:
>     >     >     >     >>>>>>>>>>> On Monday April 25 2011 08:59:18 Garth Wells
>     wrote:
>     >     >     >     >>>>>>>>>>>> On 25/04/11 16:47, Johan Hake wrote:
>     >     >     >     >>>>>>>>>>>>> Commenting out the cache is really not a
>     fix. The
>     >     >     >     problem is
>     >     >     >     >>>>>>>>>>>>> within dolfin. Isn't there another way to
>     deal
>     >     >     with this?
>     >     >     >     >>>>>>>>>>>>
>     >     >     >     >>>>>>>>>>>> It is a fix if the cache isn't needed.
>     >     >     >     >>>>>>>>>>>
>     >     >     >     >>>>>>>>>>> Sure.
>     >     >     >     >>>>>>>>>>>
>     >     >     >     >>>>>>>>>>>>> First: How much penalty are there with a
>     >     >     disabled memory
>     >     >     >     >>>>>>>>>>>>> cache. Maybe the problem isn't that bad?
>     >     >     >     >>>>>>>>>>>>
>     >     >     >     >>>>>>>>>>>> I don't get the point of this cache. The way
>     it
>     >     >     is now,
>     >     >     >     a form
>     >     >     >     >>>>>>>>>>>> is only preprocessed if it hasn't already
>     been
>     >     >     >     preprocessed,
>     >     >     >     >>>>>>>>>>>> which seems ok to me. The old code tried to
>     avoid
>     >     >     some
>     >     >     >     >>>>>>>>>>>> preprocessing, but it was highly dubious and
>     I
>     >     doubt
>     >     >     >     that it
>     >     >     >     >>>>>>>>>>>> was effective.
>     >     >     >     >>>>>>>>>>>
>     >     >     >     >>>>>>>>>>> I think the preprocessing stage actually do
>     take
>     >     >     some time.
>     >     >     >     >>>>>>>>>>> AFAIK the preproces stage essentially do two
>     >     >     things. It
>     >     >     >     >>>>>>>>>>> creates a canonical version of the Form so
>     two
>     >     Forms
>     >     >     >     that are
>     >     >     >     >>>>>>>>>>> the same, but constructed at different times
>     are
>     >     >     beeing
>     >     >     >     >>>>>>>>>>> treated equal wrt form generation. Then are
>     DOLFIN
>     >     >     specific
>     >     >     >     >>>>>>>>>>> guys extracted. I am not sure what takes the
>     most
>     >     >     time. We
>     >     >     >     >>>>>>>>>>> should probably profiel it... But if it is
>     the
>     >     >     latter we
>     >     >     >     could
>     >     >     >     >>>>>>>>>>> consider putting another cache in place which
>     is
>     >     >     more robust
>     >     >     >     >>>>>>>>>>> wrt changing DOLFIN objects.
>     >     >     >     >>>>>>>>>>
>     >     >     >     >>>>>>>>>> It should be easy to avoid the overhead of
>     >     >     preprocessing by
>     >     >     >     >>>>>>>>>> keeping the object in scope. If the object
>     changes,
>     >     >     the only
>     >     >     >     >>>>>>>>>> robust way to make sure that the form is the
>     same
>     >     >     as one
>     >     >     >     in the
>     >     >     >     >>>>>>>>>> cache is to compare all the data. This
>     requires
>     >     >     preprocessing
>     >     >     >     >>>>>>>>>> the form, which then defeats the purpose of a
>     cache.
>     >     It
>     >     >     >     may be
>     >     >     >     >>>>>>>>>> possible to add a lightweight preprocess to
>     UFL,
>     >     >     but I don't
>     >     >     >     >>>>>>>>>> think that it's worth the effort or extra
>     >     complication.
>     >     >     >     >>>>>>>
>     >     >     >     >>>>>>> I think a light weight version might be the way
>     to go.
>     >     >     This
>     >     >     >     is then
>     >     >     >     >>>>>>> stored in memory cache. If we are able to strip
>     such a
>     >     >     form
>     >     >     >     for all
>     >     >     >     >>>>>>> DOLFIN specific things we would also prevent huge
>     >     memory
>     >     >     >     leaks with
>     >     >     >     >>>>>>> mesh beeing kept.
>     >     >     >     >>>>>>>
>     >     >     >     >>>>>>> Then we always grab DOLFIN specific data from the
>     >     >     passed form
>     >     >     >     >>>>>>> instead of grabbing from the cache. Not sure how
>     easy
>     >     this
>     >     >     >     will be
>     >     >     >     >>>>>>> to implement, but I think we need to explore it,
>     as
>     >     >     the DOLFIN
>     >     >     >     >>>>>>> specific part of the form really has nothing to
>     do
>     >     >     with the
>     >     >     >     >>>>>>> generated Form.
>     >     >     >     >>>>>>>
>     >     >     >     >>>>>>> Martin:
>     >     >     >     >>>>>>> Why is it important to have the _count in the
>     repr of
>     >     the
>     >     >     >     form? I
>     >     >     >     >>>>>>> guess that is used in ufl algorithms? Would it be
>     >     >     possible to
>     >     >     >     >>>>>>> include a second repr function, which did not
>     include
>     >     >     the count?
>     >     >     >     >>>>>>> This would then be used when the signature is
>     checked
>     >     >     for. We
>     >     >     >     >>>>>>> could then use that repr to generate a form which
>     is
>     >     >     stored
>     >     >     >     in the
>     >     >     >     >>>>>>> memory cache. This would then be tripped for any
>     >     >     DOLFIN specific
>     >     >     >     >>>>>>> objects. This should work as the _count attribute
>     has
>     >     >     nothing to
>     >     >     >     >>>>>>> do with what code gets generated, but it is
>     essential
>     >     for
>     >     >     >     internal
>     >     >     >     >>>>>>> UFL algorithms, right?
>     >     >     >     >>>>>>>
>     >     >     >     >>>>>>>>> I'm not very happy with this change.
>     >     >     >     >>>>>>>>
>     >     >     >     >>>>>>>> The bright side is that slow and correct is a
>     better
>     >     >     starting
>     >     >     >     >>>>>>>> point than fast but wrong ;).
>     >     >     >     >>>>>>>>
>     >     >     >     >>>>>>>> An easy fix is to attach the preprocessed form
>     to a
>     >     Form
>     >     >     >     object.
>     >     >     >     >>>>>>>> This would work robustly if we can make forms
>     >     >     immutable once
>     >     >     >     >>>>>>>> they've been compiled. Is it possible to make a
>     >     >     Python object
>     >     >     >     >>>>>>>> immutable?
>     >     >     >     >>>>>>>
>     >     >     >     >>>>>>> We can probably overload all setattribtue methods
>     which
>     >     >     >     prohibits a
>     >     >     >     >>>>>>> user to write to these but it might not be
>     possible to
>     >     >     >     prohibit a
>     >     >     >     >>>>>>> user to change attributes on instances owned by
>     the
>     >     >     Form. I
>     >     >     >     guess
>     >     >     >     >>>>>>> this is similare to the difficulties of
>     preserving
>     >     >     constness in
>     >     >     >     >>>>>>> C++, but I think it is even harder in Python.
>     >     >     >     >>>>>>
>     >     >     >     >>>>>> What if we have the FFC jit compiler return the
>     >     >     preprocessed
>     >     >     >     form,
>     >     >     >     >>>>>> and inside dolfin.Form simply do
>     >     >     >     >>>>>>
>     >     >     >     >>>>>>     class Form(cpp.Form):
>     >     >     >     >>>>>>         def __init__(self, form, . . .. )
>     >     >     >     >>>>>>         ....
>     >     >     >     >>>>>>
>     >     >     >     >>>>>>         (...., preprocessed_form) = jit(form, . .
>     . . )
>     >     >     >     >>>>>>
>     >     >     >     >>>>>>         form = preprocessed_form
>     >     >     >     >>>>>>
>     >     >     >     >>>>>>         .....
>     >     >     >     >>>>>>
>     >     >     >     >>>>>> This way, form will have form_data, and the FFC
>     jit
>     >     >     function will
>     >     >     >     >>>>>> know not to call ufl.preprocess.
>     >     >     >     >>>>>
>     >     >     >     >>>>> Here's another strange thing. In the JITObject
>     class, we
>     >     >     have two
>     >     >     >     >>>>> functions: __hash__ and signature. As far as I
>     >     >     understand, the
>     >     >     >     first
>     >     >     >     >>>>> is used to located objects (generated code/modules)
>     in
>     >     >     the Instant
>     >     >     >     >>>>> in-memory cache, while the second is used for the
>     >     >     on-disk cache.
>     >     >     >     >>>>>
>     >     >     >     >>>>> >From some simple tests I did now, it looks like
>     the
>     >     >     __hash__
>     >     >     >     function
>     >     >     >     >>>>>>
>     >     >     >     >>>>> does not need to any significant speedup. The JIT
>     >     benchmark
>     >     >     >     runs just
>     >     >     >     >>>>> as fast if I call signature from within __hash__.
>     >     >     >     >>>>>
>     >     >     >     >>>>> Furthermore, the __hash__ function must also be
>     broken
>     >     >     since it
>     >     >     >     >>>>> relies on calling id on the form.
>     >     >     >     >>>>>
>     >     >     >     >>>>> Ideally, we should get Instant to handle the
>     caching,
>     >     both
>     >     >     >     in-memory
>     >     >     >     >>>>> and on-disk, by providing two functions __hash__
>     (fast,
>     >     for
>     >     >     >     in-memory
>     >     >     >     >>>>> cache) and signature (slow, for on-disk cache).
>     >     >     >     >>>>>
>     >     >     >     >>>>> Since __hash__ cannot call id, it must be able to
>     attach
>     >     >     a unique
>     >     >     >     >>>>> string to the form (perhaps based on an internal
>     counter
>     >     >     in FFC).
>     >     >     >     >>>>> My suggestion would be to add this to UFL,
>     something
>     >     >     like set_hash
>     >     >     >     >>>>> and hash (which would return None if set_hash has
>     not
>     >     been
>     >     >     >     called).
>     >     >     >     >>>>> If Martin does not like that, we should be able to
>     handle
>     >     it
>     >     >     >     on the
>     >     >     >     >>>>> DOLFIN side.
>     >     >     >     >>>>>
>     >     >     >     >>>>> So in conclusion: no in-memory cache in FFC
>     (handled by
>     >     >     >     Instant) and
>     >     >     >     >>>>> FFC attaches a hash to incoming forms so that
>     Instant may
>     >     >     >     recognize
>     >     >     >     >>>>> them later.
>     >     >     >     >>>>
>     >     >     >     >>>> The code that I disabled was caching preprocessed
>     forms,
>     >     so I
>     >     >     >     don't see
>     >     >     >     >>>> how this can be handled by Instant.
>     >     >     >     >>>>
>     >     >     >     >>>> Garth
>     >     >     >     >>>>
>     >     >     >     >>>>> Maybe even better: Instant checks whether an
>     incoming
>     >     >     object has a
>     >     >     >     >>>>> set_hash function and if so calls it so it can
>     recognize
>     >     >     >     objects it
>     >     >     >     >>>>> sees a second time.
>     >     >     >     >>>>>
>     >     >     >     >>>>> I'm moving this discussion to the mailing list(s).
>     >     >     >     >>>>
>     >     >     >     >>>> _______________________________________________
>     >     >     >     >>>> Mailing list: https://launchpad.net/~ufl
>     >     >     >     >>>> Post to     : ufl@xxxxxxxxxxxxxxxxxxx
>     >     >     <mailto:ufl@xxxxxxxxxxxxxxxxxxx>
>     >     >     >     <mailto:ufl@xxxxxxxxxxxxxxxxxxx <mailto:
>     ufl@xxxxxxxxxxxxxxxxxxx
>     >     >>
>     >     >     >     >>>> Unsubscribe : https://launchpad.net/~ufl
>     >     >     >     >>>> More help   : https://help.launchpad.net/ListHelp
>     >     >     >
>     >     >     >     _______________________________________________
>     >     >     >     Mailing list: https://launchpad.net/~ufl
>     >     >     >     Post to     : ufl@xxxxxxxxxxxxxxxxxxx
>     >     >     <mailto:ufl@xxxxxxxxxxxxxxxxxxx> <mailto:
>     ufl@xxxxxxxxxxxxxxxxxxx
>     >     >     <mailto:ufl@xxxxxxxxxxxxxxxxxxx>>
>     >     >     >     Unsubscribe : https://launchpad.net/~ufl
>     >     >     >     More help   : https://help.launchpad.net/ListHelp
>     >     >     >
>     >     >     >
>     >     >
>     >     >
>     >
>     >
>     >
>
>     > _______________________________________________
>     > Mailing list: https://launchpad.net/~ufl
>     > Post to     : ufl@xxxxxxxxxxxxxxxxxxx
>     > Unsubscribe : https://launchpad.net/~ufl
>     > More help   : https://help.launchpad.net/ListHelp
>
>
>



Follow ups

References