dolfin team mailing list archive

Re: [Bug 612579] Re: Huge performance problem in Python interface

 

On Thu, Aug 05, 2010 at 10:26:15PM -0000, Johan Hake wrote:
> On Thursday August 5 2010 10:08:15 Anders Logg wrote:
> > On Thu, Aug 05, 2010 at 02:27:51AM -0000, Johan Hake wrote:
> > > On Monday August 2 2010 15:00:47 Johan Hake wrote:
> > > > On Monday August 2 2010 12:05:38 Garth Wells wrote:
> > > > > On Mon, 2010-08-02 at 18:28 +0000, Anders Logg wrote:
> > > > > > On Mon, Aug 02, 2010 at 04:15:54PM -0000, Johan Hake wrote:
> > > > > > > It looks like there is something fishy with the caching. I can
> > > > > > > have a look at it.
> > > > > > >
> > > > > > > Johan
> > > > > >
> > > > > > There have been some regressions in the speed of caching, probably
> > > > > > as a result of the FFC rewrite earlier this year. See
> > > > > > fem-jit-python here:
> > > > > >
> > > > > >    http://www.fenics.org/bench/
> > > > > >
> > > > > > I haven't bothered to examine it in detail since I thought it was
> > > > > > "good enough" but apparently not.
> > > >
> > > > Yes the problem is probably not in DOLFIN, but who knows. Looking at
> > > > the code that is provided I saw that commenting out the creation of
> > > > the DOLFIN Form within the assemble routine and instead wrapping the
> > > > Form before calling assemble made the difference that is reported. So
> > > > I thought I would have a look at that first.
> > > >
> > > > > It probably is "good enough" in practice. There may be some issues
> > > > > following the fix of some memory leaks in Instant earlier this year.
> > > >
> > > > Yes I hope I do not have to go that far. But we'll see.
> > >
> > > Looks like this works fine. The module is read from memory.
> > >
> > > With some profiling it looks like most time is spent in
> > >
> > >   ufl.preprocess (with a lot of different ufl algorithms called)
> > >   instant.check_swig_version (with some file io)
> > >
> > > which are all called during ffc.jit.
> > >
> > > I guess these are functions we need to run for each jit call? We might be
> > > able to cache the swig check. However we need to do the preprocessing as
> > > it is here that we figure out the signature of the form (if I am not
> > > mistaken...).
> >
> > Maybe we shouldn't need to call preprocess (and I think we didn't do
> > this at some point in the past). Instead we can have 3 levels of caching,
> > with a full build as the last resort:
> >
> > 1. Check the id() of the incoming form and directly return the
> > compiled module from memory cache (should be super fast)
> >
> > 2. Preprocess and check the signature of the incoming form and return
> > the compiled module from memory cache (can take some time)
> >
> > 3. If not in the memory cache, check disk cache (will take more time)
> >
> > 4. Otherwise build the module
> >
> > Maybe we have lost step (1) along the way.
>
> Yes I think so. Memory cache is used but _after_ the preprocessing.
>
> The problem with this is that jit returns the form_data. This can only be
> accessed through a preprocessed form. I am not sure why you need to return the
> form_data?

Because it is used by DOLFIN to extract things like function spaces
from the form (by accessing the meta data in form_data). This is done
in form.py.
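
A minimal sketch of the layered lookup quoted above, written so that the
fast path still hands back the form data. Everything here is a hypothetical
stand-in: preprocess(), build_module(), the dictionary used as a "disk"
cache and the cache layout are assumptions for illustration, not the FFC or
Instant API.

```python
# Hypothetical stand-ins only; none of these names come from FFC or Instant.

_id_cache = {}         # level 1: id(form) -> (module, form_data)
_signature_cache = {}  # level 2: signature -> module
_disk_cache = {}       # level 3: plain dict standing in for an on-disk cache
calls = {"preprocess": 0, "build": 0}


def preprocess(form):
    """Pretend-expensive step that produces the form data and signature."""
    calls["preprocess"] += 1
    return {"signature": "sig:" + repr(form), "form": form}


def build_module(signature):
    """Pretend-expensive code generation and compilation."""
    calls["build"] += 1
    return "module for " + signature


def jit(form):
    # Level 1: the same object was seen before, so skip preprocessing.
    hit = _id_cache.get(id(form))
    if hit is not None:
        return hit

    # Level 2: preprocess to obtain the signature, then look in memory.
    form_data = preprocess(form)
    signature = form_data["signature"]
    module = _signature_cache.get(signature)

    # Level 3: fall back to the slower disk cache.
    if module is None:
        module = _disk_cache.get(signature)

    # Level 4: nothing cached anywhere, so build the module.
    if module is None:
        module = build_module(signature)
        _disk_cache[signature] = module

    _signature_cache[signature] = module
    # Store the form data next to the module so level 1 can return both.
    _id_cache[id(form)] = (module, form_data)
    return module, form_data
```

Calling jit() twice on the same object preprocesses once; a second, equal
but distinct form is preprocessed again and then found by signature without
a rebuild. Note that this sketch keeps every form alive through form_data,
which is exactly the memory concern raised below.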

> We could also cache the preprocessed form. Then we should be safe :) But the
> final result might be a bit convoluted?
>
> Also, as Garth pointed out previously, this memory caching can result in
> memory leaks. For that I suggest we check the reference count of the original
> forms, and if a form is no longer referenced anywhere except the cache, we can
> just remove it.

Sounds good. I can revisit the caching at some point and try to speed
it up but it's not a high priority for me at the moment. (But will be
if I keep getting complaints about slow jit compilation.)

--
Anders

-- 
Huge performance problem in Python interface
https://bugs.launchpad.net/bugs/612579
You received this bug notification because you are a member of DOLFIN
Team, which is subscribed to DOLFIN.

Status in DOLFIN: Confirmed

Bug description:
In my Python code I need to evaluate a nonlinear functional many times. This was rather slow, and after a profile run I noticed that 90% of the time was spent in the __init__ routine of form.py to compile the form. As far as I can tell from the code, this should be necessary only once.

I have attached a simple example that illustrates the effect. In my test, the second code is roughly 40 times faster.




