3. It actually matters that the default mode of FFC removes any products with zeros in the tensor product. BLAS does not know about zeros. This seems to be a severe limitation. For PDE systems with Lagrange elements (the typical case I guess), there will be lots of zeros. From previous discussions about FFC/Ferari, the conclusion was that skipping the zero elements was the dominant optimization for computing the element matrix.
This is exactly why I've been talking about inferring block structure inside the compiler and building a block 3x3 local stiffness matrix (each block might be different). For the vector Laplacian, it's block diagonal, etc. You can then compute the set of distinct blocks with level 3 BLAS (on each element) and still get to throw away all the coarse-grained zeros.
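
To make the block idea concrete, here is a rough NumPy sketch, not anything FFC or Ferari actually generates. It assumes P1 Lagrange elements on the reference tetrahedron, three vector components, and component-blocked ordering of the local degrees of freedom; the function names are made up for illustration. The one distinct block is formed with a dense matrix product (which NumPy normally hands to a BLAS gemm-type routine), and the zero off-diagonal blocks are never computed. The dense version is only there for comparison.

```python
import numpy as np


def scalar_block(G, vol):
    """Scalar Laplacian element matrix for P1: vol * G G^T.

    G holds one (constant) basis-function gradient per row. The product is a
    dense matrix-matrix multiply, i.e. a level 3 BLAS style operation.
    """
    return vol * (G @ G.T)


def vector_laplacian_blocked(G, vol, ncomp=3):
    """Block-diagonal assembly: compute the one distinct block, copy it
    into the diagonal blocks, and never touch the zero off-diagonal blocks."""
    n = G.shape[0]
    K = scalar_block(G, vol)
    A = np.zeros((ncomp * n, ncomp * n))
    for a in range(ncomp):
        A[a * n:(a + 1) * n, a * n:(a + 1) * n] = K
    return A


def vector_laplacian_dense(G, vol, ncomp=3):
    """Reference version that multiplies through every entry, zeros included
    (the Kronecker delta couples component a with component b)."""
    n = G.shape[0]
    delta = np.eye(ncomp)
    A4 = vol * np.einsum("ab,ic,jc->aibj", delta, G, G)
    return A4.reshape(ncomp * n, ncomp * n)


# P1 basis gradients on the reference tetrahedron (assumed element).
G_REF = np.array([[-1.0, -1.0, -1.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
VOL_REF = 1.0 / 6.0

A_blocked = vector_laplacian_blocked(G_REF, VOL_REF)
A_dense = vector_laplacian_dense(G_REF, VOL_REF)
```

Here two thirds of the 12x12 local matrix is coarse-grained zeros (six of the nine 4x4 blocks), and the blocked version skips all of it while still doing its real work as a dense product. When the blocks differ, the same pattern would apply with one product per distinct nonzero block.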
Rob