dolfin team mailing list archive

Re: [FFC-dev] define a "diagonal" form

 

> Alessio Quaglino wrote:
>> Ok, I've tried this example (my code uses a more complicated mixed
>> formulation, though):
>>
>> name = "Xs"
>> element2 = VectorElement("Discontinuous Lagrange", "tetrahedron", 0, 6)
>> u = TrialFunction(element2)
>> psi = TestFunction(element2)
>> a = dot(u, psi) * dx
>>
>> and the generated tensor is:
>>
>>     // Compute element tensor
>>     A[0] = 0.166666666666666*G0_;
>>     A[1] = 0;
>>     A[2] = 0;
>>     A[3] = 0;
>>     A[4] = 0;
>>     A[5] = 0;
>>     A[6] = 0;
>>     A[7] = 0.166666666666666*G0_;
>>     A[8] = 0;
>>     A[9] = 0;
>>     A[10] = 0;
>>     A[11] = 0;
>>     A[12] = 0;
>>     A[13] = 0;
>>     A[14] = 0.166666666666666*G0_;
>>     A[15] = 0;
>>     A[16] = 0;
>>     A[17] = 0;
>>     A[18] = 0;
>>     A[19] = 0;
>>     A[20] = 0;
>>     A[21] = 0.166666666666666*G0_;
>>     A[22] = 0;
>>     A[23] = 0;
>>     A[24] = 0;
>>     A[25] = 0;
>>     A[26] = 0;
>>     A[27] = 0;
>>     A[28] = 0.166666666666666*G0_;
>>     A[29] = 0;
>>     A[30] = 0;
>>     A[31] = 0;
>>     A[32] = 0;
>>     A[33] = 0;
>>     A[34] = 0;
>>     A[35] = 0.166666666666666*G0_;
>>
>> Hence, I guess, those elements are never computed but are still considered
>> while assembling the matrix (at least to check that they are zero), whereas
>> in this case it would be faster to assemble a "diagonal vector" directly.
>> But I think this is a minor improvement.
>
> What happens is that the zeros are not computed but they are inserted
> into the sparse matrix.
>
>> Also, the tensor is assembled cellwise, and I don't know how often
>> tabulate_tensor() is called from outside (by the assembler).
>
> It is called once per cell.
>
>> Finally, my mixed formulation uses this "Xs" form to build the biggest
>> block of the matrix, however in the 324 generated components I couldn't
>> find the same pattern as in Xs.
>>
>> Alessio
>>
>> PS I'm asking all this because I find it weird that the LU factorization of
>> the tensor is 4-5 times faster than its assembly.
>
> The assembly should be fairly fast, but try profiling the code (valgrind
> --tool=callgrind and then kcachegrind) and see what happens. Try to
> locate the bottleneck and perhaps you can find something to improve.
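
To make the quoted point about zeros concrete, here is a minimal sketch in plain Python. It is not FFC or DOLFIN code: the function names, the dictionary-based "sparse matrix", and the two-cell mesh are all hypothetical, and G0 simply stands in for the geometry factor G0_ from the generated code. It mimics the 6x6 element tensor shown above, calls the tabulation once per cell, and compares inserting the full local block with accumulating only its diagonal.

```python
# Hypothetical sketch: none of these names are FFC or DOLFIN API.

def tabulate_tensor(G0):
    """Mimic the generated 6x6 element tensor (row-major, 36 entries)."""
    n = 6
    A = [0.0] * (n * n)
    for i in range(n):
        # Only A[0], A[7], A[14], ... are nonzero, as in the generated code.
        A[i * n + i] = G0 / 6.0
    return A

def add_dense_block(matrix, dofs, A):
    """Insert every entry of the local block, zeros included."""
    n = len(dofs)
    for i in range(n):
        for j in range(n):
            key = (dofs[i], dofs[j])
            matrix[key] = matrix.get(key, 0.0) + A[i * n + j]

def add_diagonal(diag, dofs, A):
    """Accumulate only the diagonal of the local block into a vector."""
    n = len(dofs)
    for i in range(n):
        diag[dofs[i]] = diag.get(dofs[i], 0.0) + A[i * n + i]

# Two made-up cells; discontinuous elements share no dofs between cells.
cells = [(list(range(0, 6)), 1.0), (list(range(6, 12)), 2.0)]

matrix = {}  # (row, col) -> value, a stand-in for a sparse matrix
diag = {}    # row -> value, the "diagonal vector" alternative

for dofs, G0 in cells:          # one tabulate_tensor call per cell
    A = tabulate_tensor(G0)
    add_dense_block(matrix, dofs, A)
    add_diagonal(diag, dofs, A)

stored = len(matrix)
nonzero = sum(1 for v in matrix.values() if v != 0.0)
print(stored, nonzero, len(diag))
```

In this toy setting the dense insertion stores 36 entries per cell although only 6 are nonzero, while the diagonal variant stores 6. Whether that difference matters in a real assembly is exactly what the suggested profiling would show.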

It might be totally unrelated, but I've noticed that if I define a class
derived from uBlasLUSolver inside my solver .cpp file, just copying and
pasting the same UMFPACK functions, it becomes 10-15 times slower (under
both Windows and Linux) than the original class. I have likewise defined a
class derived from uBlasSparseMatrix (because I needed to add some
methods). Can this cause a slowdown of the code?

Alessio


> /Anders
>
>>
>>> I think this is already handled. The code generated by FFC is optimized
>>> so that things that are known to be zero a priori are never computed.
>>> Try the simplest possible example you can think of and look at the
>>> generated code (the function tabulate_tensor) and see if this is
>>> correct. (And let us know what you find.)
>>> /Anders
>>>
>>>
>>> Alessio Quaglino wrote:
>>>> I'm wondering if it's possible to tell FFC that it doesn't have to test
>>>> all the basis functions against each other, but only each one against
>>>> itself, so that I get only the diagonal terms of the matrix. I guess this
>>>> would speed up the assembly *a lot* in the case of piecewise elements
>>>> having support on only one tetrahedron, or in the case when you need only
>>>> the diagonal elements. Am I right, or is this special case already handled?
>>>> Can I use a special notation to achieve this? Thanks.
>>>>
>>>> Alessio
>>>>
>>>> _______________________________________________
>>>> FFC-dev mailing list
>>>> FFC-dev@xxxxxxxxxx
>>>> http://www.fenics.org/mailman/listinfo/ffc-dev
>>
>>
>>
>>
>> _______________________________________________
>> DOLFIN-dev mailing list
>> DOLFIN-dev@xxxxxxxxxx
>> http://www.fenics.org/mailman/listinfo/dolfin-dev
>



