
dhis2-users team mailing list archive

Re: DHIS version 2.8 released

 

Hi again.

Thanks for making us aware of this. I have now corrected a flaw which
caused slow handling of periods for each import record.

After applying the fix, my test with 50k records is down to 20 sec. Note
that I cleared out existing records from my database first. New records
are faster since we can use multiple SQL insert statements rather than
single updates.
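
To illustrate the difference between batched inserts and single updates, here is a minimal sketch in Python using the standard sqlite3 module. It is not the DHIS 2 import code: the `datavalue` table, its columns, the function names and the batch size are all assumptions made for the example.

```python
import sqlite3


def make_db():
    # Hypothetical table layout, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE datavalue ("
        " element TEXT, period TEXT, orgunit TEXT, value INTEGER,"
        " PRIMARY KEY (element, period, orgunit))"
    )
    return conn


def insert_batched(conn, rows, batch_size=1000):
    """New records: hand many rows to the database per call."""
    for start in range(0, len(rows), batch_size):
        conn.executemany(
            "INSERT INTO datavalue VALUES (?, ?, ?, ?)",
            rows[start:start + batch_size],
        )
    conn.commit()


def update_one_by_one(conn, rows):
    """Existing records: one UPDATE statement per record."""
    for element, period, orgunit, value in rows:
        conn.execute(
            "UPDATE datavalue SET value = ?"
            " WHERE element = ? AND period = ? AND orgunit = ?",
            (value, element, period, orgunit),
        )
    conn.commit()


# 112 districts x 12 months of made-up values.
rows = [
    ("OPD", f"2005{m:02d}", f"district{d}", d + m)
    for d in range(1, 113)
    for m in range(1, 13)
]
conn = make_db()
insert_batched(conn, rows)
count_after_insert = conn.execute("SELECT COUNT(*) FROM datavalue").fetchone()[0]

# Re-importing the same keys has to go through the slower update path.
update_one_by_one(conn, [(e, p, o, v + 1) for e, p, o, v in rows])
first_value = conn.execute(
    "SELECT value FROM datavalue WHERE period = '200501' AND orgunit = 'district1'"
).fetchone()[0]
```

Timing the two functions on a larger row set should show the gap, though the actual numbers depend on the database and driver.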

I have also added a new import option in the user interface called
"existing records check". It controls whether the system checks for
existing data value records during import. If you turn it off, you are
responsible for making sure your database does not already contain any of
the records in the import file. This might be useful for database
bootstraps and other situations where you are certain no such records
exist. With that setting turned off (skip) my test is down to 12 sec.
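
As a rough sketch of what such an option trades off, the Python example below (standard sqlite3 module) shows an import routine with a `check_existing` flag. The function name, the flag name and the `datavalue` table are hypothetical and do not reflect how DHIS 2 implements the option.

```python
import sqlite3


def import_records(conn, rows, check_existing=True):
    """Import (element, period, orgunit, value) rows.

    Returns a tuple (inserted, updated).
    """
    if not check_existing:
        # The caller guarantees that none of the rows exist yet, so no
        # lookups are needed. A duplicate key raises sqlite3.IntegrityError.
        conn.executemany("INSERT INTO datavalue VALUES (?, ?, ?, ?)", rows)
        conn.commit()
        return len(rows), 0

    inserted = updated = 0
    for element, period, orgunit, value in rows:
        key = (element, period, orgunit)
        # One extra lookup per record: this is the cost the option removes.
        exists = conn.execute(
            "SELECT 1 FROM datavalue"
            " WHERE element = ? AND period = ? AND orgunit = ?",
            key,
        ).fetchone()
        if exists:
            conn.execute(
                "UPDATE datavalue SET value = ?"
                " WHERE element = ? AND period = ? AND orgunit = ?",
                (value, *key),
            )
            updated += 1
        else:
            conn.execute("INSERT INTO datavalue VALUES (?, ?, ?, ?)", (*key, value))
            inserted += 1
    conn.commit()
    return inserted, updated


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datavalue ("
    " element TEXT, period TEXT, orgunit TEXT, value INTEGER,"
    " PRIMARY KEY (element, period, orgunit))"
)
rows = [("OPD", "201201", "district1", 10), ("OPD", "201201", "district2", 20)]

# Empty database: safe to skip the check.
first = import_records(conn, rows, check_existing=False)
# Same file again: the check is needed, and every record becomes an update.
second = import_records(conn, rows, check_existing=True)
```

In this sketch, skipping the check on a database that already holds the records fails with a constraint error, which is why the setting only suits cases where you know the target is clean.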

I have backported these fixes to 2.8. You can download the latest build
(r6830) here:

http://dhis2.org/download/releases/2.8/dhis.war

Let me know how it goes.

regards, Lars



On Mon, May 7, 2012 at 11:51 AM, Edward Ari Bichetero <ebichete@xxxxxxxxx> wrote:

> Thanks for the reply. I'll check our Java setup, retry the import and
> report back.
>
> However there is still a larger issue. Taking 16 minutes as a
> representative timing and assuming linear scaling (unlikely but for the
> sake of discussion), importing the full dataset ("District Outpatient
> monthly report"), of which the previous data is just a small part, would
> take at least 10 hours of solid computation. I'm not sure our servers would
> survive that kind of abuse.
>
> And then there is "Inpatients Monthly", "Weekly Disease Outbreak" and
> about 4 smaller (approx. an order of magnitude) reports. This would be
> about 35 hours of computation, with a longer "wall clock" time. Are any of
> the other import methods any faster ? Do we have to resort to generating
> SQL statements instead ?
>
>
> - Edward -
> ________________________________
>
>
> On Sat, May 5, 2012 at 9:11 AM, Edward Ari Bichetero <ebichete@xxxxxxxxx>
> wrote:
>
> The import file contains 46800 records (individual CSV lines).
> >
> >
> >- Edward -
> >
> >
> >________________________________
> >From: Lars Helge Øverland <larshelge@xxxxxxxxx>
> >To: Edward Ari Bichetero <ebichete@xxxxxxxxx>
> >Cc: "dhis2-users@xxxxxxxxxxxxxxxxxxx" <dhis2-users@xxxxxxxxxxxxxxxxxxx>
> >Sent: Friday, May 4, 2012 6:07 PM
> >Subject: Re: [Dhis2-users] DHIS version 2.8 released
> >
> >
> >
> >
> >
> >
> >On Fri, May 4, 2012 at 4:37 PM, Edward Ari Bichetero <ebichete@xxxxxxxxx>
> wrote:
> >
> >Congratulations on the new release, it is nice to see the new features
> especially the CSV import functionality.
> >>However ...
> >>
> >>
> >>I've just had a go at importing a small set of historical data (Monthly
> outpatient attendance by district) going back six years (2005-2011). This
> is just two data elements (Outpatient attendance, Outpatient reattendance)
> with four combo categories (Male, Female, Below 5 yrs old, 5 yrs and older)
> for each of our 112 districts. The CSV data file is about 4.6 megabytes in
> size.
> >>
> >>I gave up on watching the import process after an hour. At that point it
> had been using 90% of our test server's memory (4GB) and burning 100% of one
> cpu/core almost the entire time. This is just one of the smallest datasets
> we would be looking to import. It appears that the CSV import in its
> current state is not able to cope with reasonably large data. Or am I
> getting this wrong ? Do you have any ideas/workarounds ?
> >>
> >>
> >
> >Hi, approximately how many records are in your import file?
> >
> >Lars
> >
>
