yade-dev team mailing list archive
Message #00302
Re: Note on optimized compilation / optimized coding / profiling results
Václav S wrote:
>> The damping dispatcher takes 89% of cpu time to do
>> something else, namely :
>> - LocateMultivirtualFunctor : 50%
>>
> Good to know.
>
Let me clarify: I'm not sure the dispatching mechanism is slow in
itself. What takes a lot of time in the dispatcher is, for instance,
BodyMacroParameters::getBaseClassIndex() (around 50%), or the find()
function I mentioned before (10%). I'm attaching a file with the list of
function costs in LocateMultivirtualFunctor.
Note that getBaseClassIndex() is 10 times slower than getClassIndex(); I
wonder why...
Bruno
> Another way would be to use boost::ublas::compressed_matrix for the 2D
> case, if a 2D array of some 10000 elements is too much (OK, if we ever
> have 1000 classes, that would mean 1e6 entries, which is not nice). It
> should be reasonably fast as well. No 3D functors, though ;-).
>
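A minimal sketch of the dense 2D lookup table mentioned above, assuming each class already has a small integer index (as getClassIndex() suggests). The names DispatchTable2D, PairFunctor and sumFunctor are hypothetical and not taken from the Yade sources; the functor is reduced to a plain function pointer to keep the example short.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical functor type: a plain function pointer stands in for a
// real functor object.
using PairFunctor = int (*)(int, int);

// Dense numClasses x numClasses table stored row-major in one vector.
// With 100 classes this is the "some 10000 elements" case.
class DispatchTable2D {
public:
    explicit DispatchTable2D(std::size_t numClasses)
        : n_(numClasses), table_(numClasses * numClasses, nullptr) {}

    // Register the functor for the class pair (i, j).
    void add(std::size_t i, std::size_t j, PairFunctor f) {
        table_[i * n_ + j] = f;
    }

    // Constant-time lookup; nullptr means "no functor registered".
    // No bounds checking: the caller must pass valid class indices.
    PairFunctor get(std::size_t i, std::size_t j) const {
        return table_[i * n_ + j];
    }

private:
    std::size_t n_;
    std::vector<PairFunctor> table_;
};

// Example functor used only for illustration.
inline int sumFunctor(int a, int b) { return a + b; }
```

The lookup is one multiplication, one addition and one load. The price is memory that grows with the square of the number of classes, which is the concern that motivates a sparse matrix in the quoted text.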
> Will play with that if I find some time. That sounds like a big speed
> improvement (there are quite a few dispatchers in the simulation; if they
> take 30% of the time, that means 15% for dispatching... wow).
>
> I thought originally this stuff was part of Loki and was well optimized.
>
>
>> A simple example :
>>
>> shared_ptr<PhysicalAction>& PhysicalActionVectorVector::find(
>>     unsigned int id, int actionIndex)
>> {
>>     // This is very rarely executed, only at the beginning.
>>     // Somebody is accessing out of bounds; make sure he will find what
>>     // he needs: a reset PhysicalAction of his type.
>>     if (current_size <= id)
>>     {
>>         ....
>>     }
>>     usedIds[id] = true;
>>     return physicalActions[id][actionIndex];
>> }
>>
>> 1. There is this test at the beginning, which is useless all the time
>> except at iteration 1.
>> 2. There is this usedIds[id] assignment (same remark again: it is only
>> modified at iteration 1).
>> 3. Then it returns a reference to a shared pointer (and shared_ptr
>> operations are slower than raw pointer operations).
>>
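The three points above can be sketched as follows: do the sizing once, before the simulation loop, and keep the per-access path free of the bounds test and the flag write. This is only an illustration under assumptions. ActionTable, Action, prepare() and at() are hypothetical names, not the real PhysicalActionVectorVector interface.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for PhysicalAction (hypothetical, for illustration only).
struct Action {
    double value = 0.0;
};

class ActionTable {
public:
    // One-off sizing, meant to be called before the simulation loop.
    // Each slot gets its own Action object.
    void prepare(std::size_t numBodies, std::size_t numActionTypes) {
        actions_.clear();
        actions_.resize(numBodies);
        for (auto& row : actions_) {
            row.reserve(numActionTypes);
            for (std::size_t k = 0; k < numActionTypes; ++k) {
                row.push_back(std::make_shared<Action>());
            }
        }
    }

    // Hot path: no bounds test and no flag write. Passing valid
    // indices is the caller's responsibility.
    Action& at(std::size_t id, std::size_t actionIndex) {
        return *actions_[id][actionIndex];
    }

private:
    std::vector<std::vector<std::shared_ptr<Action>>> actions_;
};
```

Whether this is actually faster in Yade would have to be confirmed by profiling; the sketch only shows where the per-call work goes.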
>>
> You're right, I thought that 1. and 2. were the business of ::prepare,
> which is called from PhysicalActionContainerInitializer. It should be
> the user's responsibility to use valid indices, right?
>
> UsedIds, that's a different story; it says what index contains a
> non-null action so that we can iterate over non-null actions. But this
> flag could be put into the action instance itself. Then
> physicalActions[id][actionIndex] could be used directly.
>
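A minimal sketch of keeping the "used" flag inside the action instance rather than in a separate usedIds array. FlaggedAction, addTo() and countUsed() are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical action type that carries its own "used" flag.
struct FlaggedAction {
    double value = 0.0;
    bool used = false;
};

// Accumulate into an action and mark it as used in the same place.
inline void addTo(FlaggedAction& a, double dv) {
    a.value += dv;
    a.used = true;
}

// Visit only the actions that were touched; here we just count them.
inline std::size_t countUsed(const std::vector<FlaggedAction>& actions) {
    std::size_t n = 0;
    for (const FlaggedAction& a : actions) {
        if (a.used) {
            ++n;
        }
    }
    return n;
}
```

With the flag stored next to the data, the container can be indexed directly and no second array has to be updated on each access.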
>> Of course, this "find" function is called a great many times. The
>> result: 3.81% of total CPU time just for it, while we could simply use
>> physicalActions[id][actionIndex] instead of find() and reduce that time
>> a lot. But of course physicalActions is private...
>>
>>
> Just change that in the header and make it public, no problem. If the
> "user" screws stuff up, it is his problem, not mine. I think Olivier liked
> having a lot of stuff private, but then we have overheads for
> accessors. Sadly, C++ has no way to say: this member is public for
> reading and private for writing, which could help in many cases.
>
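One common approximation of "public for reading, private for writing" is a const accessor defined in the class body. It is implicitly inline, so an optimizing compiler can usually remove the call overhead. A minimal sketch, with the hypothetical names ActionStore, values() and push():

```cpp
#include <vector>

class ActionStore {
public:
    // Read access for everybody. Defined in the class body, hence
    // implicitly inline; returns a const reference, so callers cannot
    // modify the data through it.
    const std::vector<double>& values() const { return values_; }

    // Write access goes through the class.
    void push(double v) { values_.push_back(v); }

private:
    std::vector<double> values_;
};
```

Callers can write store.values()[i] to read, but an assignment through values() does not compile.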
>> Janek : I know that containers will be changed anyway, and that is why I
>> think this is the right moment to cry about speed. :)
>> Users don't care about Godwin's law, they need SPEED!!!
>>
>>
> Cosurgi, should I fiddle with PhysicalActionContainer or is it going to
> be changed anyway?
>
>> Yes, I thought of that too, but I was not expert enough to be sure.
>> Can't the compiler check that nothing will be affected in some situations?
>>
>>
> It can in some, but I don't know in which ones. Putting const everywhere
> also checks that there are no side-effects of the method on its
> instance, for example.
>
> Another thing is that many methods should be inlined, but grepping
> through the headers reveals that only a very small subset is: for the
> most part, the OpenGL wrapper. Function inlining is turned on by the -O3
> flag to g++, so we probably don't gain anything here. But the gcc manpage
> says for -finline-limit: "This option is particularly useful for
> programs that use inlining heavily such as those based on recursive
> templates with C++", which is just our case.
>
> And by the way, I just found out that assert(...) is not compiled out in
> optimized scons builds, since we don't define NDEBUG. Will be fixed
> in the next commit. No big deal ;-)
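A small sketch of what defining NDEBUG changes. The macro is defined in the source here only to keep the example in one file; a build would normally pass -DNDEBUG on the compiler command line. countedCheck() and guardedDivide() are hypothetical helpers.

```cpp
#define NDEBUG  // stands in for -DNDEBUG on the command line
#include <cassert>

// Counts how many times an assertion condition was really evaluated.
static int checksEvaluated = 0;

inline bool countedCheck(bool ok) {
    ++checksEvaluated;
    return ok;
}

// With NDEBUG defined, the assert below expands to nothing, so
// countedCheck() is never called and checksEvaluated stays at 0.
inline int guardedDivide(int a, int b) {
    assert(countedCheck(b != 0));
    return a / b;
}

// Re-enable assertions for any code that follows in this file;
// <cassert> may be included again after changing NDEBUG.
#undef NDEBUG
#include <cassert>
```

Without NDEBUG, every call to guardedDivide() would pay for the check, which is the overhead the quoted remark refers to.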
> _______________________________________________
> yade-dev mailing list
> yade-dev@xxxxxxxxxxxxxxxx
> https://lists.berlios.de/mailman/listinfo/yade-dev
>
>
--
_______________
Chareyre Bruno
Maitre de conference
Institut National Polytechnique de Grenoble
Laboratoire 3S (Soils Solids Structures) - bureau E145
BP 53 - 38041, Grenoble cedex 9 - France
Tél : 33 4 56 52 86 21
Fax : 33 4 76 82 70 43
________________