← Back to team overview

dhis2-devs team mailing list archive

Re: Problems with detailed metadata export..

 

Sorry Jason.  A quick look through isnt shedding light on me :-(

Is this a test environment or did you run that against a running server
with other users?  There doesn't seem to be any tight loop like we saw with
the paging bug, so I guess what we have here is not so much as a logical
error as some sort of gross inefficiency somewhere.
One thing I notice is that almost all RUNNABLE threads are somehow engaged
with operations relating to user service.  getCurrentUser,
getUserCredentials etc.  Given that this is a static snapshot in time of
where exactly each thread is at, it seems odd that so many are caught up
chasing users instead of doing real work.  This might be a red herring or a
pointer to that gross inefficiency - it might be that if the user
credentials are being retrieved and tested each time the user is touching a
metadata object (and it it is maybe touching many of them with this
metadata export) that that is leading to excessive activity.  Do you have a
vast number of users in the system by any chance?

On 22 January 2015 at 08:09, Jason Pickering <jason.p.pickering@xxxxxxxxx>
wrote:

> OK, is this better?  Stack attached.
>
> Tasks: 178 total,   1 running, 177 sleeping,   0 stopped,   0 zombie
> Cpu(s): 99.1%us,  0.1%sy,  0.0%ni,  0.4%id,  0.0%wa,  0.0%hi,  0.0%si,
>  0.3%st
> Mem:  16433100k total, 16330904k used,   102196k free,    71828k buffers
> Swap:   262140k total,    25108k used,   237032k free, 11411252k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 28091 dhis      20   0 12.0g 3.6g 5916 S  796 23.0 328:48.52 java
>
>
> Command used was
>
> curl --verbose "
> http://XXXX/dhis/api/dataElements.json?links=false&paging=false"; -u
> XXXX:XXXX
>
> Regards,
> Jason
>
>
> On Wed, Jan 21, 2015 at 11:05 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx>
> wrote:
>
>> Jason the cpu is not really "through the roof" here.  Can you try another
>> dump (while the cpu is through the roof)?
>>
>> The interesting threads to look at are the http-bio-8080 threads, which
>> are the tomcat worker threads.  If the server is not being accessed by
>> dozens of users most of them should be sitting in a WAITING state.  So then
>> it is interesting to look at those which are marked as RUNNABLE.  These are
>> the ones which are trying as hard as they can to get the attention of the
>> scheduler in order to chew cpu cycles.
>>
>> So if your process has become cpu bound then sometimes its possible to
>> figure out which thread it is and what it is trying to do.  This attached
>> stack dump doesn't reveal much unfortunately, but then again the cpu (from
>> top) is showing 62% idle.  So try and get the cpu burning up again and grab
>> 2 or 3 jstack samples.  If it is cpu bound then it might show.
>>
>>
>>
>> On 21 January 2015 at 19:28, Jason Pickering <jason.p.pickering@xxxxxxxxx
>> > wrote:
>>
>>> Not really sure what is going on, but here is the stack dump. Maybe it
>>> will help.
>>>
>>> And some more info below.
>>>
>>>
>>> top - 21:26:40 up 477 days, 16:20,  2 users,  load average: 2.99, 2.05,
>>> 1.03
>>> Tasks: 171 total,   1 running, 170 sleeping,   0 stopped,   0 zombie
>>> Cpu(s): 38.3%us,  0.0%sy,  0.0%ni, 61.6%id,  0.0%wa,  0.0%hi,  0.0%si,
>>>  0.1%st
>>> Mem:  16433100k total, 15416512k used,  1016588k free,    86276k buffers
>>> Swap:   262140k total,    18872k used,   243268k free, 12412748k cached
>>>
>>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>> 28091 dhis      20   0 11.8g 2.1g  13m S  307 13.2  29:39.99 java
>>>
>>>
>>> Regards,
>>> Jason
>>>
>>>
>>> On Wed, Jan 21, 2015 at 7:42 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx>
>>> wrote:
>>>
>>>> If it is running for 15 minutes then almost certainly it is stuck in
>>>> some kind of loop.  Find the pid of the running jvm and also the owner of
>>>> the java process.  Then run:
>>>>
>>>> sudo -u <process owner> jstack <pid>  > jstack.txt
>>>>
>>>> That might give an idea of what the threads are up to ...
>>>>
>>>> On 21 January 2015 at 18:26, Morten Olav Hansen <mortenoh@xxxxxxxxx>
>>>> wrote:
>>>>
>>>>> 18k is not really that much.. is it giving you any kind of errors?
>>>>> given the SL demo db... we can easily export much more objects that that..
>>>>>
>>>>> --
>>>>> Morten
>>>>>
>>>>> On Thu, Jan 22, 2015 at 12:42 AM, Bob Jolliffe <bobjolliffe@xxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>>> Can you try and get a jstack dump while it is busy chomping cpu.
>>>>>> That will probably indicate where in the code it is spinning.  There was a
>>>>>> problem earlier with a weakness in the paging code but should have been
>>>>>> fixed by that build.
>>>>>>
>>>>>> On 21 January 2015 at 15:00, Jason Pickering <
>>>>>> jason.p.pickering@xxxxxxxxx> wrote:
>>>>>>
>>>>>>>
>>>>>>> In one DHIS2 instance, we have a rather large number of data
>>>>>>> elements, more than 18,000. During the detailed metadata export, a request
>>>>>>> is made to "api/dataElements.json?paging=false&links=false". This throws
>>>>>>> the CPU of the server through the roof, with the operation taking many
>>>>>>> minutes (has been running for 15 minutes now with no response).There are no
>>>>>>> errors in the logs or the browser. Build revision is not known, but  it is
>>>>>>> 2.17 from from 2015-01-17 02:03.
>>>>>>>
>>>>>>> Any idea what might be going on here?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Jason
>>>>>>>
>>>>>>> --
>>>>>>> Jason P. Pickering
>>>>>>> email: jason.p.pickering@xxxxxxxxx
>>>>>>> tel:+46764147049
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>>>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Jason P. Pickering
>>> email: jason.p.pickering@xxxxxxxxx
>>> tel:+46764147049
>>>
>>
>>
>
>
> --
> Jason P. Pickering
> email: jason.p.pickering@xxxxxxxxx
> tel:+46764147049
>

Follow ups

References