dolfin team mailing list archive

Re: MTL4 backend: Significant performance results

 

kent-and@xxxxxxxxx wrote:
> Sounds amazing!
> 
> I'd like to see that code, although I cannot promise you
> much of a response during my holiday, which starts tomorrow.
> 
> Have you compared the matrix-vector product with the corresponding
> products using uBlas or PETSc?

Will do.

More good news: After some discussion about insertion operations on the
MTL list, Peter Gottschling (the lead developer of MTL4) implemented
some optimizations based on some other benchmarks I ran. For insertion
into very sparse matrices (like Poisson) I got a further 30% speedup.

I could put the MTL4 experimental stuff in the sandbox. Does that sound
good? I'm going on vacation too.

/Dag

> 
> Kent
> 
> 
>> Hello!
>>
>> In light of the long and interesting discussion we had a while ago about
>> assembler performance I decided to try to squeeze more out of the uBlas
>> backend. This was not very successful.
>>
>> However, I've been following the development of MTL4
>> (http://www.osl.iu.edu/research/mtl/mtl4/) with a keen eye on the
>> interesting insertion scheme they provide. I implemented a backend --
>> without sparsity pattern computation -- for the dolfin assembler and here
>> are some first benchmark results:
>>
>> Incompressible Navier-Stokes on 50x50x50 unit cube
>>
>> MTL --------------------------------------------------------
>> assembly time: 8.510000
>> reassembly time: 6.750000
>> vector assembly time: 6.070000
>>
>> memory: 230 mb
>>
>> UBLAS ------------------------------------------------------
>> assembly time: 23.030000
>> reassembly time: 12.140000
>> vector assembly time: 6.030000
>>
>> memory: 642 mb
>>
>> Poisson on 2000x2000 unit square
>>
>> MTL --------------------------------------------------------
>> assembly time: 9.520000
>> reassembly time: 6.650000
>> vector assembly time: 4.730000
>> linear solve: 0.000000
>>
>> memory: 452 mb
>>
>> UBLAS ------------------------------------------------------
>> assembly time: 15.400000
>> reassembly time: 7.520000
>> vector assembly time: 5.020000
>>
>> memory: 1169 mb
>>
>> Conclusions? MTL is more than twice as fast and allocates less than half
>> the memory (since there is no sparsity pattern computation) across a set
>> of forms I've tested.
>>
>> The code is not perfectly done yet, but I'd still be happy to share it
>> with whoever wants to mess around with it.
>>
>> Cheers!
>>
>> /Dag
>>
>> _______________________________________________
>> DOLFIN-dev mailing list
>> DOLFIN-dev@xxxxxxxxxx
>> http://www.fenics.org/mailman/listinfo/dolfin-dev
>>
> 
> 
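
The "more than twice as fast" and "less than half the memory" conclusions in the quoted message can be checked against the two benchmarks it lists. The following is only a minimal sketch of that arithmetic: the numbers are copied from the quoted output, while the dictionary layout and the `ratios` helper are illustrative names, not part of DOLFIN or MTL4. It also assumes that the second "assembly time" line in the Poisson/MTL block (4.73) is the vector assembly time, by analogy with the uBlas block.

```python
# Timings in seconds and memory in MB, copied from the quoted benchmark output.
# Assumption: the 4.73 s entry in the Poisson/MTL block is the vector assembly
# time (the original label appears to be displaced).
results = {
    "Navier-Stokes 50x50x50": {
        "MTL": {"assembly": 8.51, "reassembly": 6.75,
                "vector assembly": 6.07, "memory": 230},
        "UBLAS": {"assembly": 23.03, "reassembly": 12.14,
                  "vector assembly": 6.03, "memory": 642},
    },
    "Poisson 2000x2000": {
        "MTL": {"assembly": 9.52, "reassembly": 6.65,
                "vector assembly": 4.73, "memory": 452},
        "UBLAS": {"assembly": 15.40, "reassembly": 7.52,
                  "vector assembly": 5.02, "memory": 1169},
    },
}


def ratios(problem):
    """Return UBLAS/MTL ratios per metric; values above 1 favour MTL."""
    mtl = results[problem]["MTL"]
    ublas = results[problem]["UBLAS"]
    return {key: ublas[key] / mtl[key] for key in mtl}


for problem in results:
    print(problem)
    for key, value in ratios(problem).items():
        print(f"  {key}: {value:.2f}x")
```

On these two cases alone, the memory ratio is above 2 for both problems, while the first-assembly ratio is above 2 for Navier-Stokes but below 2 for Poisson. The "more than twice as fast" figure therefore presumably reflects the wider set of forms mentioned in the message rather than every individual timing shown here.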
