
ffc team mailing list archive

Re: [Ufl] [Bug 769811] [NEW] JIT cache problem with id(form)

 


On 26/04/11 12:25, Martin Sandve Alnæs wrote:
> On 26 April 2011 11:09, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
> 
> 
> 
>     On 26/04/11 09:56, Garth N. Wells wrote:
>     >
>     >
>     > On 26/04/11 09:03, Martin Sandve Alnæs wrote:
>     >> See other mail. I don't see that it solves anything, it doesn't seem
>     >> related to anything I've read about in this thread, and it has a
>     >> potential downside in hindering the garbage collector in Python. I may
>     >> be wrong, but nobody has answered my other questions about this
>     >> thread yet.
>     >>
>     >
>     > As a precursor, the primary problem has nothing to do with the Instant
>     > disk cache, etc. The Instant discussion is just confusing the original
>     > point.
>     >
>     > In summary, it would be helpful if DOLFIN could avoid calling
>     > ufl.preprocess every time a dolfin.Form object is created. DOLFIN
>     > relies on preprocessing to extract the form Arguments, from which the
>     > mesh is extracted (via form_data().original_arguments, since DOLFIN
>     > uses 'Arguments' that are subclasses of both UFL and DOLFIN objects).
>     >
>     > The solution that Johan has implemented is to have FFC attach the
>     > form_data to a form. If a form has form_data attached, then we know
>     > that it has already been preprocessed. Martin won't like this because
>     > it's changing the form object.
>     >
>     > It may be enough if UFL provided a function to return a list of the
>     > form Arguments, if this is fast. Something like
>     >
>     >   def extract_original_arguments(form):
>     >
>     >       # Replace arguments and coefficients with new renumbered objects
>     >       arguments, coefficients = extract_arguments_and_coefficients(form)
>     >       replace_map, arguments, coefficients \
>     >             = build_argument_replace_map(arguments, coefficients)
>     >       form = replace(form, replace_map)
>     >
>     >       # Build mapping to original arguments and coefficients, which is
>     >       # useful if the original arguments have data attached to them
>     >       inv_replace_map = {}
>     >       for v, w in replace_map.iteritems():
>     >           inv_replace_map[w] = v
>     >       original_arguments = [inv_replace_map[v] for v in arguments]
>     >
>     >       return original_arguments
>     >
> 
>     In addition, I think that we're letting DOLFIN-specific issues creep
>     into FFC and UFL. It would be simpler if FFC simply did
> 
>        if form.form_data() is not None:
>            preprocessed_form = form
>        else:
>            preprocessed_form = preprocess(form, common_cell=common_cell)
> 
> 
> Is this ufl.preprocess or ffc.preprocess?
>  

ufl.preprocess. There is no ffc.preprocess as far as I know.

> 
>     and DOLFIN is made responsible for preprocessing a form (or not
>     preprocessing) before sending it to the FFC JIT compiler, particularly
>     since deciding to preprocess or not can depend on what DOLFIN-specific
>     data (e.g. meshes) has been attached to the form.
> 
> 
> Here I'm lost. Why is preprocessing dependent on dolfin anything?
> What kind of preprocessing are we talking about here?
> 

Because DOLFIN subclasses ufl.Coefficient and ufl.Argument, e.g. from the
DOLFIN code:

  class Argument(ufl.Argument):
      def __init__(self, V, index=None):
          if not isinstance(V, FunctionSpaceBase):
              raise TypeError, "Illegal argument for creation of Argument, not a FunctionSpace: " + str(V)
          ufl.Argument.__init__(self, V.ufl_element(), index)
          self._V = V

      def function_space(self):
          "Return the FunctionSpace"
          return self._V

DOLFIN relies on UFL to extract the Arguments from a form, and then
DOLFIN calls the above member function 'function_space' to extract the
DOLFIN function space.
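
For concreteness, here is a minimal sketch of that extraction path. It
assumes extract_arguments_and_coefficients can be imported from
ufl.algorithms (it is the helper used unqualified in the snippet earlier in
this thread), and the function name below is only for illustration:

  from ufl.algorithms import extract_arguments_and_coefficients

  def extract_dolfin_function_spaces(form):
      "Return the DOLFIN FunctionSpaces attached to the form's Arguments."
      # The Arguments come back as the DOLFIN subclasses shown above,
      # so each one still carries the FunctionSpace it was created from.
      arguments, coefficients = extract_arguments_and_coefficients(form)
      return [v.function_space() for v in arguments]

The mesh could then be taken from any of the returned spaces, e.g. something
like spaces[0].mesh() on the DOLFIN side.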

Garth



> Martin
> 
>  
> 
>     Garth
> 
> 
> 
> 
> 
>     > Garth
>     >
>     >
>     >> Martin
>     >>
>     >> On 26 April 2011 09:20, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
>     >>
>     >>     Martin: Any problem if we apply this patch to UFL?
>     >>
>     >>     Garth
>     >>
>     >>     On 25/04/11 22:50, Johan Hake wrote:
>     >>     > This should be fixed now.
>     >>     >
>     >>     > I do not see why we introduced the memory cache when this
>     >>     > solution was lying right in front of our eyes...
>     >>     >
>     >>     > Anyhow. Here is a patch for ufl to avoid circular dependency
>     >>     > between a preprocessed form and the form_data.
>     >>     >
>     >>     > Johan
>     >>     >
>     >>     > On Monday April 25 2011 14:34:00 Anders Logg wrote:
>     >>     >> Simple sounds good.
>     >>     >>
>     >>     >> --
>     >>     >> Anders
>     >>     >>
>     >>     >> On Mon, Apr 25, 2011 at 02:29:50PM -0700, Johan Hake wrote:
>     >>     >>> I am working on a simple solution, where we store everything
>     >>     >>> in the original ufl form.
>     >>     >>>
>     >>     >>> I might have something soon.
>     >>     >>>
>     >>     >>> Johan
>     >>     >>>
>     >>     >>> On Monday April 25 2011 14:26:18 Garth N. Wells wrote:
>     >>     >>>> On 25/04/11 22:08, Anders Logg wrote:
>     >>     >>>>> On Mon, Apr 25, 2011 at 07:40:21PM -0000, Garth Wells wrote:
>     >>     >>>>>> On 25/04/11 20:00, Johan Hake wrote:
>     >>     >>>>>>> On Monday April 25 2011 11:26:36 Garth Wells wrote:
>     >>     >>>>>>>> On 25/04/11 18:51, Anders Logg wrote:
>     >>     >>>>>>>>> On Mon, Apr 25, 2011 at 05:11:41PM -0000, Garth Wells wrote:
>     >>     >>>>>>>>>> On 25/04/11 17:53, Johan Hake wrote:
>     >>     >>>>>>>>>>> On Monday April 25 2011 08:59:18 Garth Wells wrote:
>     >>     >>>>>>>>>>>> On 25/04/11 16:47, Johan Hake wrote:
>     >>     >>>>>>>>>>>>> Commenting out the cache is really not a fix. The
>     >>     >>>>>>>>>>>>> problem is within dolfin. Isn't there another way
>     >>     >>>>>>>>>>>>> to deal with this?
>     >>     >>>>>>>>>>>>
>     >>     >>>>>>>>>>>> It is a fix if the cache isn't needed.
>     >>     >>>>>>>>>>>
>     >>     >>>>>>>>>>> Sure.
>     >>     >>>>>>>>>>>
>     >>     >>>>>>>>>>>>> First: How much of a penalty is there with a
>     >>     >>>>>>>>>>>>> disabled memory cache? Maybe the problem isn't that
>     >>     >>>>>>>>>>>>> bad?
>     >>     >>>>>>>>>>>>
>     >>     >>>>>>>>>>>> I don't get the point of this cache. The way it is
>     >>     >>>>>>>>>>>> now, a form is only preprocessed if it hasn't
>     >>     >>>>>>>>>>>> already been preprocessed, which seems ok to me. The
>     >>     >>>>>>>>>>>> old code tried to avoid some preprocessing, but it
>     >>     >>>>>>>>>>>> was highly dubious and I doubt that it was effective.
>     >>     >>>>>>>>>>>
>     >>     >>>>>>>>>>> I think the preprocessing stage actually does take
>     >>     >>>>>>>>>>> some time. AFAIK the preprocess stage essentially does
>     >>     >>>>>>>>>>> two things. It creates a canonical version of the Form
>     >>     >>>>>>>>>>> so that two Forms that are the same, but constructed
>     >>     >>>>>>>>>>> at different times, are treated as equal wrt form
>     >>     >>>>>>>>>>> generation. Then DOLFIN-specific objects are
>     >>     >>>>>>>>>>> extracted. I am not sure what takes the most time. We
>     >>     >>>>>>>>>>> should probably profile it... But if it is the latter
>     >>     >>>>>>>>>>> we could consider putting another cache in place which
>     >>     >>>>>>>>>>> is more robust wrt changing DOLFIN objects.
>     >>     >>>>>>>>>>
>     >>     >>>>>>>>>> It should be easy to avoid the overhead of
>     >>     >>>>>>>>>> preprocessing by keeping the object in scope. If the
>     >>     >>>>>>>>>> object changes, the only robust way to make sure that
>     >>     >>>>>>>>>> the form is the same as one in the cache is to compare
>     >>     >>>>>>>>>> all the data. This requires preprocessing the form,
>     >>     >>>>>>>>>> which then defeats the purpose of a cache. It may be
>     >>     >>>>>>>>>> possible to add a lightweight preprocess to UFL, but I
>     >>     >>>>>>>>>> don't think that it's worth the effort or extra
>     >>     >>>>>>>>>> complication.
>     >>     >>>>>>>
>     >>     >>>>>>> I think a lightweight version might be the way to go.
>     >>     >>>>>>> This is then stored in the memory cache. If we are able
>     >>     >>>>>>> to strip such a form of all DOLFIN-specific things we
>     >>     >>>>>>> would also prevent huge memory leaks with the mesh being
>     >>     >>>>>>> kept.
>     >>     >>>>>>>
>     >>     >>>>>>> Then we always grab DOLFIN-specific data from the passed
>     >>     >>>>>>> form instead of grabbing from the cache. Not sure how easy
>     >>     >>>>>>> this will be to implement, but I think we need to explore
>     >>     >>>>>>> it, as the DOLFIN-specific part of the form really has
>     >>     >>>>>>> nothing to do with the generated Form.
>     >>     >>>>>>>
>     >>     >>>>>>> Martin:
>     >>     >>>>>>> Why is it important to have the _count in the repr of the
>     >>     >>>>>>> form? I guess that is used in ufl algorithms? Would it be
>     >>     >>>>>>> possible to include a second repr function, which did not
>     >>     >>>>>>> include the count? This would then be used when the
>     >>     >>>>>>> signature is checked. We could then use that repr to
>     >>     >>>>>>> generate a form which is stored in the memory cache. This
>     >>     >>>>>>> would then be stripped of any DOLFIN-specific objects.
>     >>     >>>>>>> This should work as the _count attribute has nothing to
>     >>     >>>>>>> do with what code gets generated, but it is essential for
>     >>     >>>>>>> internal UFL algorithms, right?
>     >>     >>>>>>>
>     >>     >>>>>>>>> I'm not very happy with this change.
>     >>     >>>>>>>>
>     >>     >>>>>>>> The bright side is that slow and correct is a better
>     >>     >>>>>>>> starting point than fast but wrong ;).
>     >>     >>>>>>>>
>     >>     >>>>>>>> An easy fix is to attach the preprocessed form to a Form
>     >>     >>>>>>>> object. This would work robustly if we can make forms
>     >>     >>>>>>>> immutable once they've been compiled. Is it possible to
>     >>     >>>>>>>> make a Python object immutable?
>     >>     >>>>>>>
>     >>     >>>>>>> We can probably overload all __setattr__ methods, which
>     >>     >>>>>>> prohibits a user from writing to these, but it might not
>     >>     >>>>>>> be possible to prohibit a user from changing attributes
>     >>     >>>>>>> on instances owned by the Form. I guess this is similar
>     >>     >>>>>>> to the difficulties of preserving constness in C++, but I
>     >>     >>>>>>> think it is even harder in Python.
>     >>     >>>>>>
>     >>     >>>>>> What if we have the FFC jit compiler return the
>     >>     >>>>>> preprocessed form, and inside dolfin.Form simply do
>     >>     >>>>>>
>     >>     >>>>>>     class Form(cpp.Form):
>     >>     >>>>>>         def __init__(self, form, . . .. )
>     >>     >>>>>>         ....
>     >>     >>>>>>
>     >>     >>>>>>         (...., preprocessed_form) = jit(form, . . . . )
>     >>     >>>>>>
>     >>     >>>>>>         form = preprocessed_form
>     >>     >>>>>>
>     >>     >>>>>>         .....
>     >>     >>>>>>
>     >>     >>>>>> This way, form will have form_data, and the FFC jit
>     >>     >>>>>> function will know not to call ufl.preprocess.
>     >>     >>>>>
>     >>     >>>>> Here's another strange thing. In the JITObject class, we
>     >>     >>>>> have two functions: __hash__ and signature. As far as I
>     >>     >>>>> understand, the first is used to locate objects (generated
>     >>     >>>>> code/modules) in the Instant in-memory cache, while the
>     >>     >>>>> second is used for the on-disk cache.
>     >>     >>>>>
>     >>     >>>>> From some simple tests I did now, it looks like the
>     >>     >>>>> __hash__ function does not give any significant speedup.
>     >>     >>>>> The JIT benchmark runs just as fast if I call signature
>     >>     >>>>> from within __hash__.
>     >>     >>>>>
>     >>     >>>>> Furthermore, the __hash__ function must also be broken
>     >>     >>>>> since it relies on calling id on the form.
>     >>     >>>>>
>     >>     >>>>> Ideally, we should get Instant to handle the caching, both
>     >>     >>>>> in-memory and on-disk, by providing two functions __hash__
>     >>     >>>>> (fast, for the in-memory cache) and signature (slow, for
>     >>     >>>>> the on-disk cache).
>     >>     >>>>>
>     >>     >>>>> Since __hash__ cannot call id, it must be able to attach a
>     >>     >>>>> unique string to the form (perhaps based on an internal
>     >>     >>>>> counter in FFC). My suggestion would be to add this to UFL,
>     >>     >>>>> something like set_hash and hash (which would return None
>     >>     >>>>> if set_hash has not been called). If Martin does not like
>     >>     >>>>> that, we should be able to handle it on the DOLFIN side.
>     >>     >>>>>
>     >>     >>>>> So in conclusion: no in-memory cache in FFC (handled by
>     >>     >>>>> Instant), and FFC attaches a hash to incoming forms so that
>     >>     >>>>> Instant may recognize them later.
>     >>     >>>>
>     >>     >>>> The code that I disabled was caching preprocessed forms, so
>     >>     >>>> I don't see how this can be handled by Instant.
>     >>     >>>>
>     >>     >>>> Garth
>     >>     >>>>
>     >>     >>>>> Maybe even better: Instant checks whether an incoming
>     >>     >>>>> object has a set_hash function and if so calls it, so it
>     >>     >>>>> can recognize objects it sees a second time.
>     >>     >>>>>
>     >>     >>>>> I'm moving this discussion to the mailing list(s).
>     >>     >>>>
> 
> 



