dolfin team mailing list archive

Re: profiling an assembly

 

On Mon, May 19, 2008 at 08:42:15AM -0500, Matthew Knepley wrote:
> On Mon, May 19, 2008 at 8:34 AM, Anders Logg <logg@xxxxxxxxx> wrote:
> > On Mon, May 19, 2008 at 08:30:20AM -0500, Matthew Knepley wrote:
> >> On Mon, May 19, 2008 at 8:20 AM, Anders Logg <logg@xxxxxxxxx> wrote:
> >> > It looks to me like the storage needed is indeed n^2*num_cells. I'm
> >> > not fluent in Fortran, but that's how I interpret this line:
> >> >
> >> >  atw(idxatw(el,li,lj)) = atw(idxatw(el,li,lj)) + Atmp(li,lj)
> >> >
> >> > This looks expensive (in terms of memory), but maybe not that
> >> > expensive?
> >>
> >> I think I should make the aggregation point again. The above line executes
> >> a function call for insertion of every value. This is a lot of
> >> overhead,
> >
> > No, I think the above code would be very much faster than PETSc, but
> > use more memory. The way I interpret it, atw is an array and idxatw is
> > a *dense* rank 3 tensor so there's no searching, only lookup.
> 
> Okay, I see what the notation is now. That would mean a lot of storage, probably
> more than the matrix itself.
> 
> >> not only
> >> for the call, but setting up loop bounds etc. That is why MatSetValues takes
> >> logical blocks, exactly what you get from FEM. I believe this could be the
> >> difference between our timing results.
> >
> > No, the above code is not what we use in DOLFIN. We use MatSetValues
> > with blocks. The above code is femLego Fortran code.
> 
> Okay, if it is blocked then we should have the same numbers.
> 
>   Matt

Anyway, it would be good to set up a common benchmark to compare
numbers. We might still be doing something stupid in our calls to
PETSc.
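
To make the storage question concrete, here is a minimal Python sketch of
the lookup scheme as I read the Fortran line quoted above. It is not
femLego or DOLFIN code: the 1D mesh, the constant element matrix, and the
helper names build_index_map and assemble are assumptions made up for
illustration. Only atw, idxatw and Atmp are taken from the quoted line.

```python
# Sketch of assembly through a precomputed dense index map, mimicking
#   atw(idxatw(el,li,lj)) = atw(idxatw(el,li,lj)) + Atmp(li,lj)
# The mesh, element matrix and helper names are illustrative assumptions.

def build_index_map(cells):
    """Return (idxatw, nnz), where idxatw[el][li][lj] is a slot in atw."""
    slot = {}  # (global row, global col) -> position in the value array
    idxatw = []
    for dofs in cells:
        block = []
        for gi in dofs:
            # A new (row, col) pair gets the next free slot; a pair already
            # seen from a neighbouring cell reuses its existing slot.
            block.append([slot.setdefault((gi, gj), len(slot)) for gj in dofs])
        idxatw.append(block)
    return idxatw, len(slot)


def assemble(cells, idxatw, nnz, element_matrix):
    """Add every element matrix into atw by pure lookup, no searching."""
    atw = [0.0] * nnz
    for el, dofs in enumerate(cells):
        Atmp = element_matrix(el)
        n = len(dofs)
        for li in range(n):
            for lj in range(n):
                atw[idxatw[el][li][lj]] += Atmp[li][lj]
    return atw


num_cells = 4
cells = [(c, c + 1) for c in range(num_cells)]  # 1D mesh, n = 2 dofs per cell
idxatw, nnz = build_index_map(cells)
atw = assemble(cells, idxatw, nnz, lambda el: [[1.0, -1.0], [-1.0, 1.0]])

# The index map holds n^2 * num_cells integers; the matrix holds nnz values.
index_entries = sum(len(block) * len(block[0]) for block in idxatw)
print(index_entries, nnz)  # prints "16 13"
```

Even in this tiny case the index map (16 entries) is larger than the
matrix it points into (13 nonzeros), which matches Matt's remark about
the storage. How the ratio behaves for real meshes and higher-order
elements is exactly what a common benchmark would have to show.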

-- 
Anders