dhis2-users team mailing list archive

Re: DHIS version 2.8 released

To: Lars Helge Øverland <larshelge@xxxxxxxxx>
From: Edward Ari Bichetero <ebichete@xxxxxxxxx>
Date: Mon, 7 May 2012 02:51:42 -0700 (PDT)
Cc: "dhis2-users@xxxxxxxxxxxxxxxxxxx" <dhis2-users@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <CAD_DPKxQy2PubUukd1B3jEiObqPEWFZPvmj+JruFj-gVj_rTFQ@mail.gmail.com>
Reply-to: Edward Ari Bichetero <ebichete@xxxxxxxxx>

Thanks for the reply. I'll check our Java setup, retry the import and report back.

However, there is still a larger issue. Taking 16 minutes as a representative timing and assuming linear scaling (unlikely, but for the sake of discussion), importing the full dataset ("District Outpatient monthly report"), of which the previous data is just a small part, would take at least 10 hours of solid computation. I'm not sure our servers would survive that kind of abuse.

And then there are "Inpatients Monthly", "Weekly Disease Outbreak" and about 4 smaller reports (each roughly an order of magnitude smaller). That would come to about 35 hours of computation, with a longer "wall clock" time. Are any of the other import methods faster? Do we have to resort to generating SQL statements instead?

- Edward -
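
The linear-scaling estimate in the message above can be reproduced with a few lines of Python. This is only a sketch: it assumes the 16 minutes for 50 000 records reported in this thread carries over unchanged to other files and servers, which Edward himself calls unlikely. The function names are made up for illustration, and the record counts it prints are what the quoted hours imply under that assumption, not figures stated in the thread.

```python
# Reference timing reported in this thread: 50 000 CSV records in 16 minutes.
REFERENCE_RECORDS = 50_000
REFERENCE_MINUTES = 16


def estimated_hours(records):
    """Hours needed to import `records` rows if time grows linearly with row count."""
    return records * REFERENCE_MINUTES / REFERENCE_RECORDS / 60


def records_for_hours(hours):
    """Row count that `hours` of import time corresponds to under the same assumption."""
    return hours * 60 * REFERENCE_RECORDS / REFERENCE_MINUTES


if __name__ == "__main__":
    # Row counts implied by the 10-hour and 35-hour estimates.
    print(records_for_hours(10))
    print(records_for_hours(35))
    # Expected time for the 46 800-record file at the reference rate.
    print(estimated_hours(46_800))
```

At the reference rate, the 46 800-record file should finish in about a quarter of an hour, which is why an import still running after an hour points to a configuration problem rather than to the size of the file.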
________________________________
From: Lars Helge Øverland <larshelge@xxxxxxxxx>
To: Edward Ari Bichetero <ebichete@xxxxxxxxx> 
Cc: "dhis2-users@xxxxxxxxxxxxxxxxxxx" <dhis2-users@xxxxxxxxxxxxxxxxxxx> 
Sent: Monday, May 7, 2012 10:57 AM
Subject: Re: [Dhis2-users] DHIS version 2.8 released

Hello,

I did a test here with a CSV file containing 50 000 records and it took 16 minutes.

I suspect this has to do with your Java configuration. Have you, for example, set the JAVA_OPTS environment variable to allocate memory to Java? A reasonable value would be:

-Xms500m -Xmx1000m -XX:PermSize=250m -XX:MaxPermSize=500m

Lars
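
A quick way to check what Lars is asking about is to look at the heap flags in JAVA_OPTS. The sketch below is an illustration, not part of DHIS 2: the helper name parse_heap_flags is made up, it only understands the -Xms and -Xmx flags, and it reads the environment of the shell it is run from, which may differ from the environment the servlet container is actually started with.

```python
import os
import re

# Matches -Xms / -Xmx followed by a size such as 500m, 1000m or 2g.
_HEAP_FLAG = re.compile(r"-(Xms|Xmx)(\d+)([kKmMgG]?)")
_UNIT_BYTES = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_heap_flags(java_opts):
    """Return a dict such as {'Xms': bytes, 'Xmx': bytes} for the heap flags found."""
    found = {}
    for name, number, unit in _HEAP_FLAG.findall(java_opts):
        found[name] = int(number) * _UNIT_BYTES[unit.lower()]
    return found


if __name__ == "__main__":
    heap = parse_heap_flags(os.environ.get("JAVA_OPTS", ""))
    if "Xmx" not in heap:
        print("JAVA_OPTS sets no -Xmx; the JVM will fall back to its default maximum heap.")
    else:
        print("Maximum heap: %d MB" % (heap["Xmx"] // 1024 ** 2))
```

With the value suggested above, the script would report a maximum heap of 1000 MB.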

On Sat, May 5, 2012 at 9:11 AM, Edward Ari Bichetero <ebichete@xxxxxxxxx> wrote:

>The import file contains 46800 records (individual CSV lines).
>
>
>- Edward -
>
>
>________________________________
>From: Lars Helge Øverland <larshelge@xxxxxxxxx>
>To: Edward Ari Bichetero <ebichete@xxxxxxxxx>
>Cc: "dhis2-users@xxxxxxxxxxxxxxxxxxx" <dhis2-users@xxxxxxxxxxxxxxxxxxx>
>Sent: Friday, May 4, 2012 6:07 PM
>Subject: Re: [Dhis2-users] DHIS version 2.8 released
>
>
>
>
>
>
>On Fri, May 4, 2012 at 4:37 PM, Edward Ari Bichetero <ebichete@xxxxxxxxx> wrote:
>
>>Congratulations on the new release; it is nice to see the new features, especially the CSV import functionality.
>>However ...
>>
>>
>>I've just had a go at importing a small set of historical data (Monthly outpatient attendance by district) going back six years (2005-2011). This is just two data elements (Outpatient attendance, Outpatient reattendance) with four combo categories (Male, Female, Below 5 yrs old, 5 yrs and older) for each of our 112 districts. The CSV data file is about 4.6 megabytes in size.
>>
>>I gave up on watching the import process after an hour. At that point it had been using 90% of our test server's memory (4GB) and burning 100% of one CPU core almost the entire time. This is just one of the smallest datasets we would be looking to import. It appears that the CSV import in its current state is not able to cope with reasonably large data. Or am I getting this wrong? Do you have any ideas or workarounds?
>>
>>
>
>Hi, approximately how many records are in your import file?
>
>Lars
>
