ffc team mailing list archive

Re: [Branch ~ffc-core/ffc/main] Rev 1684: Change code generation for evaluate_basis and

 

On Mon, Sep 12, 2011 at 11:24:22PM +0200, Marie E. Rognes wrote:
> On 09/12/11 21:56, Kristian Ølgaard wrote:
> >On 12 September 2011 21:36, Marie E. Rognes <meg@xxxxxxxxx> wrote:
> >>On 09/12/11 20:00, Marie E. Rognes wrote:
> >>>On 09/12/11 19:54, Garth N. Wells wrote:
> >>>>On 12 September 2011 18:49, Marie E. Rognes <meg@xxxxxxxxx> wrote:
> >>>>>On 09/12/11 19:40, Garth N. Wells wrote:
> >>>>>>Which compiler options did you use when evaluating the speed up?
> >>>>>>
> >>>>>Tested Extrapolation.h with vanilla dolfin (which is dominated by
> >>>>>evaluate_basis calls). No additional compiler options set.
> >>>>>
> >>>>>What are the default compiler options?
> >>>>>
> >>>>'-g' for plain JIT, which is dead slow.  You should test with at least:
> >>>>
> >>>>     parameters["form_compiler"]["cpp_optimize"] = True
> >>>>
> >>>>in the Python code. This will use '-O2'.
> >Isn't this limited in a way? Would it be a problem to let users do:
> >
> >parameters["form_compiler"]["cpp_optimize"] = '-O2 -funroll-loops'
> >parameters["form_compiler"]["cpp_optimize"] = '-O3'
> >
> >and then perhaps let
> >
> >parameters["form_compiler"]["cpp_optimize"] = True
> >
> >default to '-O2' as we do now?
> >Just a thought.
> >
> >>>Ok, thanks -- I'll take a closer look.
> >>>
> >>Take a look at the attached results in old_evaluate_basis.txt (results with
> >>"old" FFC),
> >>and new_evaluate_basis.txt (results with "new" FFC) from running the
> >>attached
> >>test_evaluate_basis.py.
> >>
> >>Acceptable?
> >Looks good, and the generated code is much nicer now. :)
> >It could have been fun to see the impact of the '-O2 -funroll-loops'
> >option on the old code, but then you'll have to switch to C++. Anyway,
> >I'm quite sure that the old code will never perform as well as the new
> >code even with this option.
> >
> >As you have probably found out, the generated code was simply a mirror
> >of what is going on in FIAT (translated to C++).
>
> Yep.
>
> >Perhaps there are more places where we can simplify the generated code?
> >
>
> Probably, did you have anything particular in mind?
>
> One thing we could do to reduce code size
> would be to move the evaluation of the modal(?) basis functions
> outside of the switch and just do the vector-vector product inside.
>
> Also, I think it would significantly speed up evaluate_basis_all,
> if we just did the evaluation of the modal basis functions once,
> and then the vector-vector product 'local_dimension'-times.
>
> Actually, I plan on doing that unless anyone protests vehemently.
> The reduction in generated code from the one should more or less
> counteract the increase in generated code from the other.

Another thing to try would be to use BLAS for the vector-vector
products (call ddot) or, even better, to write the whole evaluation as
one big matrix-vector product (call dgemv).
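
A minimal sketch of the restructuring discussed above, written in Python
rather than the generated C++. The modal basis is evaluated once, and every
basis function value is then one row of a coefficient matrix times that
vector. The 1D quadratic Lagrange element, the COEFFS table and the function
names' bodies are illustrative assumptions, not what FFC actually generates.

```python
# Hypothetical example: quadratic Lagrange basis on [-1, 1] with nodes
# -1, 0, 1, expanded in the Legendre (modal) basis P0, P1, P2.
# Row i holds the expansion coefficients of basis function i.
COEFFS = [
    [1.0 / 6.0, -0.5, 1.0 / 3.0],
    [2.0 / 3.0, 0.0, -2.0 / 3.0],
    [1.0 / 6.0, 0.5, 1.0 / 3.0],
]


def legendre_values(x, n):
    """Return [P_0(x), ..., P_{n-1}(x)] using the three-term recurrence."""
    values = [1.0, x][:n]
    for k in range(1, n - 1):
        values.append(((2 * k + 1) * x * values[k] - k * values[k - 1]) / (k + 1))
    return values


def evaluate_basis(i, x):
    """One basis function: evaluate the modal basis, then one dot product."""
    modal = legendre_values(x, len(COEFFS[i]))
    return sum(c * v for c, v in zip(COEFFS[i], modal))


def evaluate_basis_all(x):
    """All basis functions: evaluate the modal basis once, then do one
    dot product per row, i.e. a single matrix-vector product."""
    modal = legendre_values(x, len(COEFFS[0]))
    return [sum(c * v for c, v in zip(row, modal)) for row in COEFFS]
```

In C++ the loop over rows in evaluate_basis_all is the operation that a
single dgemv call would perform, and each row alone corresponds to a ddot.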

--
Anders


> >Another thing in relation to improving the evaluate_basis* functions
> >that I have thought about is if it's really necessary to support
> >derivatives of arbitrary order. If we only generate code for the first
> >derivative by default (and support arbitrary derivatives by a command
> >line argument) the code will be a lot simpler (easier on C++ compiler)
> >and much faster irrespective of which gcc optimisation is being used.
> >
>
> Sounds neat to me.
>

