
dolfin team mailing list archive

Re: Assembly benchmark

 

On Mon, Jul 21, 2008 at 3:50 PM, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
>
>
> Anders Logg wrote:
>> On Mon, Jul 21, 2008 at 01:48:23PM +0100, Garth N. Wells wrote:
>>>
>>> Anders Logg wrote:
>>>> I have updated the assembly benchmark to include also MTL4, see
>>>>
>>>>    bench/fem/assembly/
>>>>
>>>> Here are the current results:
>>>>
>>>> Assembly benchmark  |  Elasticity3D  PoissonP1  PoissonP2  PoissonP3  THStokes2D  NSEMomentum3D  StabStokes2D
>>>> -------------------------------------------------------------------------------------------------------------
>>>> uBLAS               |        9.0789    0.45645     3.8042     8.0736  14.937         9.2507        3.8455
>>>> PETSc               |        7.7758    0.42798     3.5483     7.3898  13.945         8.1632         3.258
>>>> Epetra              |        8.9516    0.45448     3.7976     8.0679  15.404         9.2341        3.8332
>>>> MTL4                |        8.9729    0.45554     3.7966     8.0759  14.94          9.2568        3.8658
>>>> Assembly            |         7.474    0.43673     3.7341     8.3793  14.633         7.6695        3.3878
>>>>
>
>
> I specified a maximum of 30 nonzeros per row in MTL4Matrix, and the
> results change quite a bit:
>
>  Assembly benchmark  |  Elasticity3D  PoissonP1  PoissonP2  PoissonP3  THStokes2D  NSEMomentum3D  StabStokes2D
> -------------------------------------------------------------------------------------------------------------
>  uBLAS               |        7.1881    0.32748     2.7633     5.8311  10.968         7.0735        2.8184
>  PETSc               |        5.7868    0.30673     2.5489     5.2344   9.8896         6.069        2.3661
>  MTL4                |        2.8641    0.18339     1.6628     2.6811   2.8519        3.4843       0.85029
>  Assembly            |        5.5564    0.30896     2.6858     5.9675  10.622         5.7144        2.4519
>
>
> MTL4 is a lot faster in all cases.

Now I don't believe the numbers. If you preallocate, we do not do any
extra processing beyond sorting the column indices (which every format
must do for efficient operations). So how would you save any time? If
these are all in seconds, I will run a 2D Poisson here and tell you what
I get. It would help to specify the problem sizes with this benchmark :)

  Matt

> Garth
>
>
>
>>> How was the MTL4 matrix initialised? I don't know if it does anything
>>> with the sparsity pattern yet. I've been initialising MTL4 matrices by
>>> hand so far with a guess as to the max number of nonzeros per row.
>>> Without setting this, the performance is nearly identical to uBLAS. When
>>> it is set, I observe at least a factor-two speed-up.
>>>
>>> Garth
>>
>> The same way as all other backends, which is by a precomputed
>> sparsity pattern. It looks like this is currently ignored in the
>> MTL4Matrix implementation:
>>
>> void MTL4Matrix::init(const GenericSparsityPattern& sparsity_pattern)
>> {
>>   init(sparsity_pattern.size(0), sparsity_pattern.size(1));
>> }
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> DOLFIN-dev mailing list
>> DOLFIN-dev@xxxxxxxxxx
>> http://www.fenics.org/mailman/listinfo/dolfin-dev
>
>
> _______________________________________________
> DOLFIN-dev mailing list
> DOLFIN-dev@xxxxxxxxxx
> http://www.fenics.org/mailman/listinfo/dolfin-dev
>



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

