
ffc team mailing list archive

Re: [Ufl] [Bug 769811] [NEW] JIT cache problem with id(form)

 


On 26/04/11 09:03, Martin Sandve Alnæs wrote:
> See other mail. I don't see that it solves anything; it doesn't seem
> related to anything I've read about in this thread, and it has a
> potential downside in hindering the garbage collector in Python. I may
> be wrong, but nobody has answered my other questions in this thread yet.
>

As a precursor: the primary problem has nothing to do with the Instant
disk cache, etc. The Instant discussion is just confusing the original point.

In summary, the question is whether it would help if DOLFIN could avoid
calling ufl.preprocess every time a dolfin.Form object is created. DOLFIN
relies on preprocessing to extract the form Arguments, from which the
mesh is extracted (via form_data().original_arguments, since DOLFIN uses
'Arguments' that subclass both UFL and DOLFIN objects).

The solution that Johan has implemented is to have FFC attach the
form_data to a form. If a form has form_data attached, then we know that
it has already been preprocessed. Martin won't like this because it's
changing the form object.
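A minimal sketch of this caching pattern, with stand-in classes (Form, FormData and preprocess below are placeholders for illustration, not the real UFL API):

```python
# Illustrative only: Form/FormData/preprocess are stand-ins, not UFL's API.

class FormData:
    """Placeholder for the result of preprocessing a form."""
    def __init__(self, original_arguments):
        self.original_arguments = original_arguments

class Form:
    def __init__(self, arguments):
        self.arguments = arguments
        self._form_data = None  # attached by the compiler on first use

def preprocess(form):
    # stands in for the expensive ufl.preprocess step
    return FormData(list(form.arguments))

def form_data(form):
    # preprocess only once; reuse the attached data if present
    if form._form_data is None:
        form._form_data = preprocess(form)
    return form._form_data

f = Form(["u", "v"])
d1 = form_data(f)
d2 = form_data(f)
assert d1 is d2  # the second call reuses the attached form data
```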

It may be enough for UFL to provide a function that returns the list of
form Arguments, if this is fast. Something like:

  def extract_original_arguments(form):

      # Replace arguments and coefficients with new renumbered objects
      arguments, coefficients = extract_arguments_and_coefficients(form)
      replace_map, arguments, coefficients \
            = build_argument_replace_map(arguments, coefficients)
      form = replace(form, replace_map)

      # Build a mapping back to the original arguments and coefficients,
      # which is useful if the original arguments have data attached to them
      inv_replace_map = {w: v for v, w in replace_map.items()}
      original_arguments = [inv_replace_map[v] for v in arguments]

      return original_arguments

Garth


> Martin
> 
> On 26 April 2011 09:20, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
> 
>     Martin: Any problem if we apply this patch to UFL?
> 
>     Garth
> 
>     On 25/04/11 22:50, Johan Hake wrote:
>     > This should be fixed now.
>     >
>     > I do not see why we introduced the memory cache when this solution
>     > was lying right in front of our eyes...
>     >
>     > Anyhow. Here is a patch for ufl to avoid a circular dependency between
>     > a preprocessed form and the form_data.
>     >
>     > Johan
>     >
>     > On Monday April 25 2011 14:34:00 Anders Logg wrote:
>     >> Simple sounds good.
>     >>
>     >> --
>     >> Anders
>     >>
>     >> On Mon, Apr 25, 2011 at 02:29:50PM -0700, Johan Hake wrote:
>     >>> I am working on a simple solution, where we store everything in the
>     >>> original ufl form.
>     >>>
>     >>> I might have something soon.
>     >>>
>     >>> Johan
>     >>>
>     >>> On Monday April 25 2011 14:26:18 Garth N. Wells wrote:
>     >>>> On 25/04/11 22:08, Anders Logg wrote:
>     >>>>> On Mon, Apr 25, 2011 at 07:40:21PM -0000, Garth Wells wrote:
>     >>>>>> On 25/04/11 20:00, Johan Hake wrote:
>     >>>>>>> On Monday April 25 2011 11:26:36 Garth Wells wrote:
>     >>>>>>>> On 25/04/11 18:51, Anders Logg wrote:
>     >>>>>>>>> On Mon, Apr 25, 2011 at 05:11:41PM -0000, Garth Wells wrote:
>     >>>>>>>>>> On 25/04/11 17:53, Johan Hake wrote:
>     >>>>>>>>>>> On Monday April 25 2011 08:59:18 Garth Wells wrote:
>     >>>>>>>>>>>> On 25/04/11 16:47, Johan Hake wrote:
>     >>>>>>>>>>>>> Commenting out the cache is really not a fix. The problem is
>     >>>>>>>>>>>>> within dolfin. Isn't there another way to deal with this?
>     >>>>>>>>>>>>
>     >>>>>>>>>>>> It is a fix if the cache isn't needed.
>     >>>>>>>>>>>
>     >>>>>>>>>>> Sure.
>     >>>>>>>>>>>
>     >>>>>>>>>>>>> First: how much of a penalty is there with the memory
>     >>>>>>>>>>>>> cache disabled? Maybe the problem isn't that bad?
>     >>>>>>>>>>>>
>     >>>>>>>>>>>> I don't get the point of this cache. The way it is now, a form
>     >>>>>>>>>>>> is only preprocessed if it hasn't already been preprocessed,
>     >>>>>>>>>>>> which seems ok to me. The old code tried to avoid some
>     >>>>>>>>>>>> preprocessing, but it was highly dubious and I doubt that it
>     >>>>>>>>>>>> was effective.
>     >>>>>>>>>>>
>     >>>>>>>>>>> I think the preprocessing stage actually does take some time.
>     >>>>>>>>>>> AFAIK the preprocessing stage essentially does two things: it
>     >>>>>>>>>>> creates a canonical version of the Form, so that two Forms that
>     >>>>>>>>>>> are the same but constructed at different times are treated as
>     >>>>>>>>>>> equal wrt form generation; then the DOLFIN-specific objects are
>     >>>>>>>>>>> extracted. I am not sure which takes the most time. We should
>     >>>>>>>>>>> probably profile it... But if it is the latter we could
>     >>>>>>>>>>> consider putting another cache in place which is more robust
>     >>>>>>>>>>> wrt changing DOLFIN objects.
>     >>>>>>>>>>
>     >>>>>>>>>> It should be easy to avoid the overhead of preprocessing by
>     >>>>>>>>>> keeping the object in scope. If the object changes, the only
>     >>>>>>>>>> robust way to make sure that the form is the same as one in the
>     >>>>>>>>>> cache is to compare all the data. This requires preprocessing
>     >>>>>>>>>> the form, which then defeats the purpose of a cache. It may be
>     >>>>>>>>>> possible to add a lightweight preprocess to UFL, but I don't
>     >>>>>>>>>> think that it's worth the effort or extra complication.
>     >>>>>>>
>     >>>>>>> I think a lightweight version might be the way to go. This is then
>     >>>>>>> stored in the memory cache. If we are able to strip such a form of
>     >>>>>>> all DOLFIN-specific things we would also prevent huge memory leaks
>     >>>>>>> from the mesh being kept alive.
>     >>>>>>>
>     >>>>>>> Then we always grab DOLFIN-specific data from the passed form
>     >>>>>>> instead of grabbing it from the cache. Not sure how easy this will
>     >>>>>>> be to implement, but I think we need to explore it, as the
>     >>>>>>> DOLFIN-specific part of the form really has nothing to do with the
>     >>>>>>> generated Form.
>     >>>>>>>
>     >>>>>>> Martin:
>     >>>>>>> Why is it important to have the _count in the repr of the form? I
>     >>>>>>> guess it is used in UFL algorithms? Would it be possible to
>     >>>>>>> include a second repr function which does not include the count?
>     >>>>>>> This would then be used when the signature is checked. We could
>     >>>>>>> then use that repr to generate a form which is stored in the
>     >>>>>>> memory cache. This form would then be stripped of any
>     >>>>>>> DOLFIN-specific objects. This should work, as the _count attribute
>     >>>>>>> has nothing to do with what code gets generated, but it is
>     >>>>>>> essential for internal UFL algorithms, right?
>     >>>>>>>
>     >>>>>>>>> I'm not very happy with this change.
>     >>>>>>>>
>     >>>>>>>> The bright side is that slow and correct is a better starting
>     >>>>>>>> point than fast but wrong ;).
>     >>>>>>>>
>     >>>>>>>> An easy fix is to attach the preprocessed form to a Form object.
>     >>>>>>>> This would work robustly if we can make forms immutable once
>     >>>>>>>> they've been compiled. Is it possible to make a Python object
>     >>>>>>>> immutable?
>     >>>>>>>
>     >>>>>>> We can probably overload __setattr__ to prevent a user from
>     >>>>>>> writing to these attributes, but it might not be possible to
>     >>>>>>> prevent a user from changing attributes on instances owned by the
>     >>>>>>> Form. I guess this is similar to the difficulty of preserving
>     >>>>>>> constness in C++, but I think it is even harder in Python.
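For what it's worth, the __setattr__ idea sketched in this thread could look roughly like the following (illustrative only; FrozenForm is a made-up name, not a DOLFIN class). As noted above, it is only shallow: owned objects stay mutable.

```python
# Sketch: shallow immutability via __setattr__. This blocks rebinding
# attributes on the object itself, but not mutation of objects it owns.

class FrozenForm:
    def __init__(self, integrals):
        # bypass our own guard while constructing
        object.__setattr__(self, "integrals", integrals)
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name, value):
        if getattr(self, "_frozen", False):
            raise AttributeError("FrozenForm is immutable")
        object.__setattr__(self, name, value)

f = FrozenForm(["dx"])
try:
    f.integrals = []          # rebinding the attribute is blocked
except AttributeError:
    pass
f.integrals.append("ds")      # ...but the owned list can still mutate
```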
>     >>>>>>
>     >>>>>> What if we have the FFC jit compiler return the preprocessed form,
>     >>>>>> and inside dolfin.Form simply do
>     >>>>>>
>     >>>>>>     class Form(cpp.Form):
>     >>>>>>         def __init__(self, form, ...):
>     >>>>>>             ...
>     >>>>>>
>     >>>>>>             (..., preprocessed_form) = jit(form, ...)
>     >>>>>>
>     >>>>>>             form = preprocessed_form
>     >>>>>>
>     >>>>>>             ...
>     >>>>>>
>     >>>>>> This way, form will have form_data, and the FFC jit function will
>     >>>>>> know not to call ufl.preprocess.
>     >>>>>
>     >>>>> Here's another strange thing. In the JITObject class, we have two
>     >>>>> functions: __hash__ and signature. As far as I understand, the first
>     >>>>> is used to locate objects (generated code/modules) in the Instant
>     >>>>> in-memory cache, while the second is used for the on-disk cache.
>     >>>>>
>     >>>>> From some simple tests I did just now, it looks like the __hash__
>     >>>>> function does not provide any significant speedup. The JIT benchmark
>     >>>>> runs just as fast if I call signature from within __hash__.
>     >>>>>
>     >>>>> Furthermore, the __hash__ function must also be broken since it
>     >>>>> relies on calling id on the form.
>     >>>>>
>     >>>>> Ideally, we should get Instant to handle the caching, both
>     >>>>> in-memory and on-disk, by providing two functions: __hash__ (fast,
>     >>>>> for the in-memory cache) and signature (slow, for the on-disk cache).
>     >>>>>
>     >>>>> Since __hash__ cannot call id, it must be able to attach a unique
>     >>>>> string to the form (perhaps based on an internal counter in FFC).
>     >>>>> My suggestion would be to add this to UFL, something like set_hash
>     >>>>> and hash (which would return None if set_hash has not been called).
>     >>>>> If Martin does not like that, we should be able to handle it on the
>     >>>>> DOLFIN side.
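A sketch of what such a set_hash/hash pair could look like (hypothetical names throughout; this is not the actual UFL interface):

```python
# Hypothetical sketch of the set_hash/hash proposal: the compiler attaches
# a counter-based key on first sight, so a fast in-memory cache can
# recognise the form later without calling id().
import itertools

_counter = itertools.count()

class Form:
    def __init__(self):
        self._hash_str = None

    def set_hash(self, h):
        self._hash_str = h

    def hash(self):
        return self._hash_str  # None until set_hash has been called

def jit_key(form):
    if form.hash() is None:
        form.set_hash("form_%d" % next(_counter))
    return form.hash()

f = Form()
assert jit_key(f) == jit_key(f)  # stable key on repeated JIT calls
```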
>     >>>>>
>     >>>>> So in conclusion: no in-memory cache in FFC (handled by Instant),
>     >>>>> and FFC attaches a hash to incoming forms so that Instant may
>     >>>>> recognize them later.
>     >>>>
>     >>>> The code that I disabled was caching preprocessed forms, so I don't
>     >>>> see how this can be handled by Instant.
>     >>>>
>     >>>> Garth
>     >>>>
>     >>>>> Maybe even better: Instant checks whether an incoming object has a
>     >>>>> set_hash function and, if so, calls it so it can recognize objects
>     >>>>> it sees a second time.
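That duck-typing idea could look roughly like this on the caching side (all names here are hypothetical; this is not the real Instant API):

```python
# Sketch: a cache that attaches its own key to objects exposing
# set_hash/hash, so repeat objects are recognised without id().
_memory_cache = {}
_key_counter = [0]

def cached_compile(obj, compile_fn):
    key = None
    if hasattr(obj, "hash") and hasattr(obj, "set_hash"):
        key = obj.hash()
        if key is None:                  # first time we see this object
            key = "obj_%d" % _key_counter[0]
            _key_counter[0] += 1
            obj.set_hash(key)
        if key in _memory_cache:
            return _memory_cache[key]
    module = compile_fn(obj)             # the expensive compile step
    if key is not None:
        _memory_cache[key] = module
    return module

class Obj:
    def __init__(self):
        self._h = None
    def hash(self):
        return self._h
    def set_hash(self, h):
        self._h = h

calls = []
o = Obj()
m1 = cached_compile(o, lambda x: calls.append(x) or "module")
m2 = cached_compile(o, lambda x: calls.append(x) or "module")
assert m1 == m2 and len(calls) == 1  # compiled once, then cached
```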
>     >>>>>
>     >>>>> I'm moving this discussion to the mailing list(s).
>     >>>>
>     >>>> _______________________________________________
>     >>>> Mailing list: https://launchpad.net/~ufl
>     >>>> Post to     : ufl@xxxxxxxxxxxxxxxxxxx
>     >>>> Unsubscribe : https://launchpad.net/~ufl
>     >>>> More help   : https://help.launchpad.net/ListHelp
> 
> 
> 


