
dolfin team mailing list archive

Re: profiling an assembly

 

On Mon, May 19, 2008 at 8:34 AM, Anders Logg <logg@xxxxxxxxx> wrote:
> On Mon, May 19, 2008 at 08:30:20AM -0500, Matthew Knepley wrote:
>> On Mon, May 19, 2008 at 8:20 AM, Anders Logg <logg@xxxxxxxxx> wrote:
>> > It looks to me like the storage needed is indeed n^2*num_cells. I'm
>> > not fluent in Fortran, but that's how I interpret this line:
>> >
>> >  atw(idxatw(el,li,lj)) = atw(idxatw(el,li,lj)) + Atmp(li,lj)
>> >
>> > This looks expensive (in terms of memory), but maybe not that
>> > expensive?
>>
>> I think I should make the aggregation point again. The above line executes
>> a function call for insertion of every value. This is a lot of
>> overhead,
>
> No, I think the above code would be very much faster than PETSc, but
> use more memory. The way I interpret it, atw is an array and idxatw is
> a *dense* rank 3 tensor so there's no searching, only lookup.

Okay, I see what the notation is now. That would mean a lot of storage
(n^2*num_cells index entries), probably more than the matrix itself.
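
To make the storage estimate concrete, here is a minimal Python sketch of
the dense-index idea as I read it. It is not femLego or DOLFIN code: the 1D
chain mesh, the function names, and the 2x2 element matrix are all
assumptions chosen for illustration. The rank 3 table idx[el][li][lj] plays
the role of idxatw and maps each local entry straight to a slot in the value
array, so assembly needs no searching, only lookup.

```python
# Toy 1D mesh: num_cells linear elements, n = 2 local dofs per cell.
# All names are hypothetical; this only illustrates the lookup-table idea.

def build_dense_index(num_cells):
    # Cell-to-dof map for a 1D chain: cell el touches dofs el and el + 1.
    dofs = [[el, el + 1] for el in range(num_cells)]

    # Sparsity pattern: every (row, col) pair coupled by at least one cell.
    pattern = sorted({(r, c) for cell in dofs for r in cell for c in cell})
    slot = {rc: k for k, rc in enumerate(pattern)}

    # Dense rank 3 index: idx[el][li][lj] is the slot in the value array.
    idx = [[[slot[(cell[li], cell[lj])] for lj in range(2)]
            for li in range(2)] for cell in dofs]
    return pattern, idx


def assemble(idx, element_matrix, nnz):
    # Pure lookup: one indexed add per local entry, no search.
    values = [0.0] * nnz
    n = len(element_matrix)
    for el_idx in idx:
        for li in range(n):
            for lj in range(n):
                values[el_idx[li][lj]] += element_matrix[li][lj]
    return values


num_cells = 1000
pattern, idx = build_dense_index(num_cells)
index_entries = num_cells * 2 * 2   # n^2 * num_cells
nnz = len(pattern)                  # stored nonzeros of the matrix
values = assemble(idx, [[1.0, -1.0], [-1.0, 1.0]], nnz)
print(index_entries, nnz)
```

For this toy chain the index table holds 4000 entries against 3001 stored
nonzeros, so the index alone is already larger than the matrix values. The
ratio will depend on the mesh and the element, so treat this as one data
point rather than a general result.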

>> not only
>> for the call, but setting up loop bounds etc. That is why MatSetValues takes
>> logical blocks, exactly what you get from FEM. I believe this could be the
>> difference between our timing results.
>
> No, the above code is not what we use in DOLFIN. We use MatSetValues
> with blocks. The above code is femLego Fortran code.

Okay, if it is blocked then we should have the same numbers.
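
For reference, the aggregation point can be sketched in a few lines of
Python. ToyMatrix below is a hypothetical dict-backed stand-in, not the
PETSc API; add_value and add_block are made-up names standing for "one call
per entry" and "one call per logical block" respectively. The sketch only
counts calls and says nothing about actual timings.

```python
class ToyMatrix:
    """Hypothetical dict-backed sparse matrix that counts insertion calls."""

    def __init__(self):
        self.data = {}
        self.calls = 0

    def add_value(self, row, col, value):
        # Entrywise insertion: one call per matrix entry.
        self.calls += 1
        self.data[(row, col)] = self.data.get((row, col), 0.0) + value

    def add_block(self, rows, cols, block):
        # Blocked insertion: one call per element matrix,
        # with the loop over entries inside the call.
        self.calls += 1
        for i, r in enumerate(rows):
            for j, c in enumerate(cols):
                self.data[(r, c)] = self.data.get((r, c), 0.0) + block[i][j]


def assemble_entrywise(cells, element_matrix):
    A = ToyMatrix()
    for dofs in cells:
        for i, r in enumerate(dofs):
            for j, c in enumerate(dofs):
                A.add_value(r, c, element_matrix[i][j])
    return A


def assemble_blocked(cells, element_matrix):
    A = ToyMatrix()
    for dofs in cells:
        A.add_block(dofs, dofs, element_matrix)
    return A


# Toy 1D chain of 1000 cells with 2 dofs each (an assumption for the demo).
cells = [[el, el + 1] for el in range(1000)]
Ae = [[1.0, -1.0], [-1.0, 1.0]]
entrywise = assemble_entrywise(cells, Ae)
blocked = assemble_blocked(cells, Ae)
print(entrywise.calls, blocked.calls)
```

Both variants produce the same matrix, but the entrywise one makes
n^2*num_cells calls (4000 here) while the blocked one makes num_cells calls
(1000 here). Whether that call overhead matters in practice depends on what
each call costs in the real library.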

  Matt

> --
> Anders
> _______________________________________________
> DOLFIN-dev mailing list
> DOLFIN-dev@xxxxxxxxxx
> http://www.fenics.org/mailman/listinfo/dolfin-dev
>



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

