dolfin team mailing list archive
Message #05536
Re: [FFC-dev] define a "diagonal" form
> Alessio Quaglino wrote:
>> Ok, I've tried this example (my code uses a more complicated mixed
>> formulation, though):
>>
>> name = "Xs"
>> element2 = VectorElement("Discontinuous Lagrange", "tetrahedron", 0, 6)
>> u = TrialFunction(element2)
>> psi = TestFunction(element2)
>> a = dot(u, psi) * dx
>>
>> and the generated tensor is:
>>
>> // Compute element tensor
>> A[0] = 0.166666666666666*G0_;
>> A[1] = 0;
>> A[2] = 0;
>> A[3] = 0;
>> A[4] = 0;
>> A[5] = 0;
>> A[6] = 0;
>> A[7] = 0.166666666666666*G0_;
>> A[8] = 0;
>> A[9] = 0;
>> A[10] = 0;
>> A[11] = 0;
>> A[12] = 0;
>> A[13] = 0;
>> A[14] = 0.166666666666666*G0_;
>> A[15] = 0;
>> A[16] = 0;
>> A[17] = 0;
>> A[18] = 0;
>> A[19] = 0;
>> A[20] = 0;
>> A[21] = 0.166666666666666*G0_;
>> A[22] = 0;
>> A[23] = 0;
>> A[24] = 0;
>> A[25] = 0;
>> A[26] = 0;
>> A[27] = 0;
>> A[28] = 0.166666666666666*G0_;
>> A[29] = 0;
>> A[30] = 0;
>> A[31] = 0;
>> A[32] = 0;
>> A[33] = 0;
>> A[34] = 0;
>> A[35] = 0.166666666666666*G0_;
>>
>> Hence, I guess, those elements are never computed but are still
>> considered while assembling the matrix (at least to check that they are
>> zero). In this case it would be faster to assemble a "diagonal vector"
>> directly, but I think this is a minor improvement.
>
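The diagonal pattern in the generated tensor above can be reproduced by hand. The following is only a minimal sketch, not FFC output: it assumes that G0_ is the Jacobian determinant of the cell map (passed here as the hypothetical parameter det_j) and that each of the 6 vector components has a single constant basis function per cell, so that dot(u, psi)*dx reduces to the reference tetrahedron volume 1/6 on the diagonal. The function name element_tensor is made up for illustration.

```python
def element_tensor(det_j, n=6):
    """Return a flattened n x n element tensor, row-major like A[] above."""
    ref_volume = 1.0 / 6.0  # volume of the reference tetrahedron
    A = [0.0] * (n * n)
    for i in range(n):
        # Only the diagonal entry (i, i) is nonzero for degree-0 basis functions.
        A[i * n + i] = ref_volume * det_j
    return A


A = element_tensor(det_j=1.0)
nonzero = [k for k, v in enumerate(A) if v != 0.0]
print(nonzero)  # [0, 7, 14, 21, 28, 35]
```

The nonzero indices match the entries A[0], A[7], ..., A[35] in the quoted code.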
> What happens is that the zeros are not computed but they are inserted
> into the sparse matrix.
>
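To illustrate the cost of inserting the zeros, here is a rough sketch that uses a plain dictionary as a stand-in for a sparse matrix. It is not DOLFIN's assembler; the function assemble, the skip_zeros flag, and the dof numbering are assumptions made for the example. For discontinuous degree-0 elements the dofs are not shared between cells, so every cell contributes its own 6 x 6 block.

```python
def assemble(num_cells, n=6, skip_zeros=False):
    """Insert one n x n diagonal element tensor per cell into a dict-based matrix."""
    matrix = {}  # (row, col) -> value
    for cell in range(num_cells):
        # Hypothetical numbering: each cell owns n consecutive dofs.
        dofs = [cell * n + i for i in range(n)]
        for i, row in enumerate(dofs):
            for j, col in enumerate(dofs):
                value = 1.0 / 6.0 if i == j else 0.0
                if skip_zeros and value == 0.0:
                    continue  # do not store known zeros
                matrix[(row, col)] = matrix.get((row, col), 0.0) + value
    return matrix


full = assemble(num_cells=10)
diag = assemble(num_cells=10, skip_zeros=True)
print(len(full), len(diag))  # 360 60
```

In this toy setting, five out of every six stored entries are explicit zeros when the full block is inserted.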
>> Also, the tensor is assembled cellwise, and I don't know how often
>> tabulate_tensor() is called from outside (by the assembler).
>
> It is called once per cell.
>
>> Finally, my mixed formulation uses this "Xs" form to build the biggest
>> block of the matrix, however in the 324 generated components I couldn't
>> find the same pattern as in Xs.
>>
>> Alessio
>>
>> PS I'm asking all this because I find it weird that the LU factorization
>> of the tensor is 4-5 times faster than its assembly.
>
> The assembly should be fairly fast, but try profiling the code (valgrind
> --tool=callgrind and then kcachegrind) and see what happens. Try to
> locate the bottleneck and perhaps you can find something to improve.
It might be totally unrelated, but I've found that if I define, inside my
solver .cpp file, a class derived from uBlasLUSolver by just copying and
pasting the same UMFPACK functions, it becomes 10-15 times slower (under
both Windows and Linux) than the original class. I have likewise defined a
class derived from uBlasSparseMatrix (because I needed to add some
methods). Can this cause a slowdown of the code?
Alessio
> /Anders
>
>>
>>> I think this is already handled. The code generated by FFC is optimized
>>> so that things that are known to be zero a priori are never computed.
>>> Try the simplest possible example you can think of and look at the
>>> generated code (the function tabulate_tensor) and see if this is
>>> correct. (And let us know what you find.)
>>> /Anders
>>>
>>>
>>> Alessio Quaglino wrote:
>>>> I'm wondering if it's possible to tell FFC that it doesn't have to
>>>> test all the basis functions against each other, but only against
>>>> themselves once, so that I get only the diagonal terms of the matrix.
>>>> I guess this would speed up the assembly *a lot* in the case of
>>>> piecewise elements having support on only one tetrahedron, or in the
>>>> case when you need only diagonal elements. Am I right, or is this
>>>> special case already handled? Can I use a special notation to achieve
>>>> this? Thanks.
>>>>
>>>> Alessio
>>>>
>>>> _______________________________________________
>>>> FFC-dev mailing list
>>>> FFC-dev@xxxxxxxxxx
>>>> http://www.fenics.org/mailman/listinfo/ffc-dev
>>
>> _______________________________________________
>> DOLFIN-dev mailing list
>> DOLFIN-dev@xxxxxxxxxx
>> http://www.fenics.org/mailman/listinfo/dolfin-dev
>