dolfin team mailing list archive
Message #08803
Re: Assembly benchmark
On Mon, Jul 21, 2008 at 4:01 PM, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
>
>
> Matthew Knepley wrote:
>>
>> On Mon, Jul 21, 2008 at 3:50 PM, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
>>>
>>> Anders Logg wrote:
>>>>
>>>> On Mon, Jul 21, 2008 at 01:48:23PM +0100, Garth N. Wells wrote:
>>>>>
>>>>> Anders Logg wrote:
>>>>>>
>>>>>> I have updated the assembly benchmark to include also MTL4, see
>>>>>>
>>>>>> bench/fem/assembly/
>>>>>>
>>>>>> Here are the current results:
>>>>>>
>>>>>> Assembly benchmark | Elasticity3D  PoissonP1  PoissonP2  PoissonP3  THStokes2D  NSEMomentum3D  StabStokes2D
>>>>>> -----------------------------------------------------------------------------------------------------------
>>>>>> uBLAS              | 9.0789        0.45645    3.8042     8.0736     14.937      9.2507         3.8455
>>>>>> PETSc              | 7.7758        0.42798    3.5483     7.3898     13.945      8.1632         3.258
>>>>>> Epetra             | 8.9516        0.45448    3.7976     8.0679     15.404      9.2341         3.8332
>>>>>> MTL4               | 8.9729        0.45554    3.7966     8.0759     14.94       9.2568         3.8658
>>>>>> Assembly           | 7.474         0.43673    3.7341     8.3793     14.633      7.6695         3.3878
>>>>>>
>>>
>>> I specified a maximum of 30 nonzeroes per row in MTL4Matrix, and the
>>> results change quite a bit:
>>>
>>> Assembly benchmark | Elasticity3D  PoissonP1  PoissonP2  PoissonP3  THStokes2D  NSEMomentum3D  StabStokes2D
>>> -----------------------------------------------------------------------------------------------------------
>>> uBLAS              | 7.1881        0.32748    2.7633     5.8311     10.968      7.0735         2.8184
>>> PETSc              | 5.7868        0.30673    2.5489     5.2344     9.8896      6.069          2.3661
>>> MTL4               | 2.8641        0.18339    1.6628     2.6811     2.8519      3.4843         0.85029
>>> Assembly           | 5.5564        0.30896    2.6858     5.9675     10.622      5.7144         2.4519
>>>
>>>
>>> MTL4 is a lot faster in all cases.
>>
>> Now I don't believe the numbers. If you preallocate, we do not do any
>> extra processing outside of sorting the column indices (which every
>> format must do for efficient operations). Thus, how would you save any
>> time? If these are all in seconds, I will run a 2D Poisson here and
>> tell you what I get. It would help to specify sizes with this
>> benchmark :)
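
A minimal sketch of what reporting sizes alongside timings could look like, so that numbers from different machines can be compared. The names `BenchResult` and `time_assembly` are hypothetical and are not taken from bench/fem/assembly/; the callable stands in for whatever assembly loop is being measured.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical record tying a timing to the problem size it was measured on.
struct BenchResult
{
  std::size_t rows;        // number of matrix rows (degrees of freedom)
  std::size_t repetitions; // how many times the assembly was repeated
  double seconds;          // average wall-clock time per repetition
};

// Run `assemble` `repetitions` times and return the average time per run
// together with the problem size it was measured on.
BenchResult time_assembly(std::size_t rows, std::size_t repetitions,
                          const std::function<void()>& assemble)
{
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < repetitions; ++i)
    assemble();
  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  const double average = repetitions > 0 ? elapsed.count() / repetitions : 0.0;
  return {rows, repetitions, average};
}
```

Printing `rows` and `repetitions` next to `seconds` would make it clear whether a figure is the time for one assembly or for a whole loop of them.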
>>
>
> Take a look at bench/fem/assembly/ for the details.
That has to be extremely wrong. I get about 1/100th of the assembly time.
Someone has screwed up the loop.
Matt
> Garth
>
>> Matt
>>
>>> Garth
>>>
>>>
>>>
>>>>> How was the MTL4 matrix initialised? I don't know if it does anything
>>>>> with the sparsity pattern yet. I've been initialising MTL4 matrices by
>>>>> hand so far with a guess as to the max number of nonzeroes per row.
>>>>> Without setting this, the performance is nearly identical to uBLAS. When
>>>>> it is set, I observe at least a factor of two speedup.
>>>>>
>>>>> Garth
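
A minimal sketch of why a per-row nonzero hint can matter: if each row reserves its storage up front, inserting entries during assembly never has to reallocate. `SparseRow` is a hypothetical illustration and does not reflect how MTL4, uBLAS or PETSc store rows internally.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sparse row: (column, value) pairs kept sorted by column.
struct SparseRow
{
  std::vector<std::pair<std::size_t, double>> entries;

  // Reserve room for the expected number of nonzeros up front.
  explicit SparseRow(std::size_t max_nonzeros) { entries.reserve(max_nonzeros); }

  // Add `value` to column `col`, inserting a new entry in sorted position
  // if the column is not present yet.
  void add(std::size_t col, double value)
  {
    auto it = std::lower_bound(
        entries.begin(), entries.end(), col,
        [](const std::pair<std::size_t, double>& e, std::size_t c)
        { return e.first < c; });
    if (it != entries.end() && it->first == col)
      it->second += value;
    else
      entries.insert(it, std::make_pair(col, value));
  }
};
```

As long as a row receives no more distinct columns than the hint, `entries` keeps its original allocation; a hint that is too small still works but falls back to reallocation.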
>>>>
>>>> The same way as all other backends, which is by a precomputed
>>>> sparsity pattern. It looks like this is currently ignored in the
>>>> MTL4Matrix implementation:
>>>>
>>>> void MTL4Matrix::init(const GenericSparsityPattern& sparsity_pattern)
>>>> {
>>>>   init(sparsity_pattern.size(0), sparsity_pattern.size(1));
>>>> }
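
A minimal sketch of how a precomputed pattern could supply the per-row nonzero bound instead of a hand-picked guess such as 30. The `Pattern` type and both functions are hypothetical stand-ins; they are not the DOLFIN `GenericSparsityPattern` interface, and no MTL4 call is shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a sparsity pattern: one sorted set of column
// indices per row. This is NOT the DOLFIN GenericSparsityPattern interface.
using Pattern = std::vector<std::set<std::size_t>>;

// Largest number of nonzero columns found in any row of the pattern.
std::size_t max_nonzeros_per_row(const Pattern& pattern)
{
  std::size_t max_nz = 0;
  for (const auto& row : pattern)
    max_nz = std::max(max_nz, row.size());
  return max_nz;
}

// Example pattern: a tridiagonal matrix with n rows, as produced by
// linear elements on a 1D mesh.
Pattern tridiagonal_pattern(std::size_t n)
{
  Pattern pattern(n);
  for (std::size_t i = 0; i < n; ++i)
  {
    if (i > 0) pattern[i].insert(i - 1);
    pattern[i].insert(i);
    if (i + 1 < n) pattern[i].insert(i + 1);
  }
  return pattern;
}
```

The value returned by `max_nonzeros_per_row` is the kind of number that was set by hand in the timings above; passing it on to the backend is left out here because that call depends on the matrix library.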
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> _______________________________________________
>>>> DOLFIN-dev mailing list
>>>> DOLFIN-dev@xxxxxxxxxx
>>>> http://www.fenics.org/mailman/listinfo/dolfin-dev
>>>
>>>
>>
>>
>>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener