ffc team mailing list archive

Re: [DOLFIN-dev] define a "diagonal" form

 

Alessio Quaglino wrote:
Ok, I've tried this example (my code uses a more complicated mixed
formulation, though):

name = "Xs"
element2 = VectorElement("Discontinuous Lagrange", "tetrahedron", 0, 6)
u = TrialFunction(element2)
psi = TestFunction(element2)
a = dot(u, psi) * dx

and the generated tensor is:

    // Compute element tensor
    A[0] = 0.166666666666666*G0_;
    A[1] = 0;
    A[2] = 0;
    A[3] = 0;
    A[4] = 0;
    A[5] = 0;
    A[6] = 0;
    A[7] = 0.166666666666666*G0_;
    A[8] = 0;
    A[9] = 0;
    A[10] = 0;
    A[11] = 0;
    A[12] = 0;
    A[13] = 0;
    A[14] = 0.166666666666666*G0_;
    A[15] = 0;
    A[16] = 0;
    A[17] = 0;
    A[18] = 0;
    A[19] = 0;
    A[20] = 0;
    A[21] = 0.166666666666666*G0_;
    A[22] = 0;
    A[23] = 0;
    A[24] = 0;
    A[25] = 0;
    A[26] = 0;
    A[27] = 0;
    A[28] = 0.166666666666666*G0_;
    A[29] = 0;
    A[30] = 0;
    A[31] = 0;
    A[32] = 0;
    A[33] = 0;
    A[34] = 0;
    A[35] = 0.166666666666666*G0_;

Hence, I guess, those entries are never computed but are still considered
while assembling the matrix (at least to check that they are zero). In this
case it would be faster to assemble a "diagonal vector" directly, but I
think this is a minor improvement.

What happens is that the zeros are not computed but they are inserted into the sparse matrix.
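
To make the layout of the generated tensor concrete, here is a minimal Python sketch (not FFC output). It assumes the 6x6 element tensor is stored row-major in a flat array, which matches the indices above, and it uses a stand-in argument G0 for the geometry factor G0_. The function names are hypothetical.

```python
# Sketch only: mimics the generated tabulate_tensor() for the "Xs" form.
# Assumption: the 6x6 element tensor is stored row-major in a flat array.
N = 6  # local dimension: 6 components, degree-0 discontinuous element


def tabulate_tensor(G0):
    """Fill the flat element tensor; only diagonal entries are nonzero."""
    A = [0.0] * (N * N)
    for i in range(N):
        A[i * N + i] = 0.166666666666666 * G0
    return A


def nonzero_positions(A):
    """Return the flat indices of the entries that are not zero."""
    return [k for k, value in enumerate(A) if value != 0.0]


A = tabulate_tensor(G0=1.0)
diag_positions = nonzero_positions(A)  # flat indices i*N + i
inserted_full = len(A)                 # entries handed over per cell
inserted_diag = len(diag_positions)    # entries a diagonal-only insert needs
```

Per cell, the full block hands 36 values to the matrix although only 6 of them are nonzero, which is the overhead discussed above.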

Also, the tensor is assembled cellwise, and I don't know how often
tabulate_tensor() is called from outside (by the assembler).

It is called once per cell.
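
A rough Python sketch of such a cell loop follows. It is not DOLFIN's assembler: the dictionary standing in for the sparse matrix, the dof numbering, and all function names are assumptions made for illustration. It assumes that every entry of the local tensor, zeros included, is added to the global matrix.

```python
# Sketch only: a toy assembler that calls tabulate_tensor once per cell.
# Assumptions: a dict stands in for the sparse matrix, explicit zeros are
# stored, and each cell owns its own 6 dofs (discontinuous, degree 0).
LOCAL_DIM = 6


def tabulate_tensor(cell):
    """Hypothetical local kernel: diagonal 6x6 tensor, row-major, flat."""
    A = [0.0] * (LOCAL_DIM * LOCAL_DIM)
    for i in range(LOCAL_DIM):
        A[i * LOCAL_DIM + i] = 0.166666666666666
    return A


def cell_dofs(cell):
    """Hypothetical dof map: cell c owns global dofs 6c .. 6c+5."""
    return [cell * LOCAL_DIM + k for k in range(LOCAL_DIM)]


def assemble(num_cells):
    matrix = {}  # (row, col) -> value
    calls = 0
    for cell in range(num_cells):
        A = tabulate_tensor(cell)  # one call per cell
        calls += 1
        dofs = cell_dofs(cell)
        for i, row in enumerate(dofs):
            for j, col in enumerate(dofs):
                key = (row, col)
                matrix[key] = matrix.get(key, 0.0) + A[i * LOCAL_DIM + j]
    return matrix, calls


matrix, calls = assemble(num_cells=4)
stored = len(matrix)                                  # zeros included
nonzero = sum(1 for v in matrix.values() if v != 0.0)
```

In this toy setting, 4 cells give 4 kernel calls and 144 stored entries, of which only 24 are nonzero.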

Finally, my mixed formulation uses this "Xs" form to build the biggest
block of the matrix, however in the 324 generated components I couldn't
find the same pattern as in Xs.

Alessio

PS I'm asking all this because I find it weird that the LU factorization of
the tensor is 4-5 times faster than its assembly.

The assembly should be fairly fast, but try profiling the code (valgrind --tool=callgrind and then kcachegrind) and see what happens. Try to locate the bottleneck and perhaps you can find something to improve.

/Anders


I think this is already handled. The code generated by FFC is optimized
so that things that are known to be zero a priori are never computed.
Try the simplest possible example you can think of and look at the
generated code (the function tabulate_tensor) and see if this is
correct. (And let us know what you find.)
/Anders


Alessio Quaglino wrote:
I'm wondering if it's possible to tell FFC that it doesn't have to test
all the basis functions against each other, but only against themselves
once, so that I get only the diagonal terms of the matrix. I guess this
would speed up the assembly *a lot* in the case of piecewise elements
having support on only one tetrahedron, or in the case when you need only
the diagonal elements. Am I right, or is this special case already handled?
Can I use a special notation to achieve this? Thanks.

Alessio

_______________________________________________
FFC-dev mailing list
FFC-dev@xxxxxxxxxx
http://www.fenics.org/mailman/listinfo/ffc-dev




_______________________________________________
DOLFIN-dev mailing list
DOLFIN-dev@xxxxxxxxxx
http://www.fenics.org/mailman/listinfo/dolfin-dev