
dhis2-users team mailing list archive

Re: long period of analytics

 

Dear Abdul Karim

It seems that you are stuck. That is really a large number of category
option combos, and quite unusual, of course. We are using a few very large
trackers, but we keep category option combos below 500. We have 10 million
entities, and analytics takes only 32 minutes.

Jason and Uwe have already expressed their opinions, and I am fully in line
with these two experts.
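
To put the numbers quoted below in perspective, here is a rough Python sketch of the arithmetic behind Uwe's suggestion and Jason's remark. The figures are taken from this thread; the function and variable names are made up for illustration only.

```python
def combined_combo_count(category_combos: int, attribute_combos: int) -> int:
    """Number of combos that would exist if both sets were merged into one."""
    return category_combos * attribute_combos


# Figures from Uwe's example in this thread.
category_combos = 52_800
attribute_combos = 64_800

# Keeping the two sets separate means the counts add up instead of multiplying.
merged = combined_combo_count(category_combos, attribute_combos)
split = category_combos + attribute_combos

# Figures from Abdul Karim's installation in this thread.
combo_rows = 437_689   # rows in _categoryoptioncomboname
data_values = 614_956  # rows of aggregate data values
combos_per_data_value = combo_rows / data_values

print(f"merged into one set: {merged:,} combos")
print(f"split into two sets: {split:,} combos")
print(f"combos per data value: {combos_per_data_value:.2f}")
```

The merged figure is the "over 3 billion" Uwe mentions, and the last ratio is what Jason means by the combos being on the same order of magnitude as the data values.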

My first suggestion is to drop and recreate _categoryoptioncomboname.

Check the memory allocation and usage of PostgreSQL and let me know. Usually
the reason for the delay lies on the Postgres end. I had a similar
situation earlier. Can you send me the Tomcat memory use as well?
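
For the Tomcat side, the "Server memory" line in the server information quoted further down already gives a first impression. Below is a minimal sketch of how one might read it. It assumes the values are in megabytes (the thread does not state the unit), and the helper name is hypothetical.

```python
import re


def parse_jvm_memory(line: str) -> dict:
    """Extract total, free and max JVM memory from a 'Server memory' line."""
    pattern = (
        r"Mem Total in JVM:\s*(\d+)\s+"
        r"Free in JVM:\s*(\d+)\s+"
        r"Max Limit:\s*(\d+)"
    )
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("unrecognised server memory line")
    total, free, limit = (int(group) for group in match.groups())
    # Used heap is whatever the JVM has allocated minus what is still free.
    return {"total": total, "free": free, "max": limit, "used": total - free}


# Line as posted in the original message (unit assumed to be MB).
line = "Server memory: Mem Total in JVM: 10237 Free in JVM: 8507 Max Limit: 14222"
mem = parse_jvm_memory(line)
used_share_of_max = mem["used"] / mem["max"]

print(f"used: {mem['used']} of max {mem['max']} ({used_share_of_max:.0%})")
```

With the posted values, 1730 of the 14222 units were in use at that moment. This is a single snapshot, not the usage during the analytics run, so it is still worth watching the figure while analytics is running.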

Regards

Hannan

On Wed, Oct 12, 2016 at 9:20 PM, Jason Pickering <
jason.p.pickering@xxxxxxxxx> wrote:

> Yeah, I am not really sure what the problem could be, but the number of
> category option combos seems extremely large to me. It's on the same order
> of magnitude as the data values. I cannot imagine that you would ever
> actually use all of them. I think it's a bit of a weakness, perhaps, in
> the way these are generated for what seems like extremely highly
> disaggregated data. It may make more sense to only generate the category
> option combo once it is used, rather than pre-generating many which will
> never be used. This is sort of the approach with periods. Periods are never
> generated until they are actually needed; otherwise, we might end up with
> many which are never actually used.
>
> Are you sure that all of those cat option combos are indeed real? It would
> seem to result from many different levels of attribution, but perhaps it's
> real? If so, it's really a bit of the same problem which others have faced,
> with data which is very highly disaggregated, but with many of the
> disaggregates actually never used.
>
> Not sure of a good way around this really, other than confirming that none of
> the category option combos are somehow orphaned.
>
> Regards,
> Jason
>
>
> On Wed, Oct 12, 2016 at 4:03 PM, Abdul karim Jaafar <
> abdulkarim.jaafar87@xxxxxxxxx> wrote:
>
>> Dears,
>> Actually, we have a lot of data sets that need different category
>> combinations (attributes), because we need to use filters in reporting (Data
>> Visualizer, Pivot Table) across multiple data elements, so relying on event
>> forms (programs) is not useful for projects like POLIO and AIRI that we work on.
>>
>> We use them for programs that need tracking, like the health information
>> system and community health programs,
>> so we have too many categoryoptioncombos.
>>
>> Also, in one program we have more than 1300 data elements, 500 of them in
>> multiple stages, which makes this program huge. I faced a problem because of
>> this (more than 1600 columns in PostgreSQL) and I fixed it, but analytics has
>> taken more time since then. However, the analytics duration problem started
>> before that.
>>
>> Actually I don't know what to do!
>>
>> Regards,
>> Abdul karim,
>>
>> -----Original Message-----
>> From: Dhis2-users [mailto:dhis2-users-bounces+abdulkarim.jaafar87=
>> gmail.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Uwe Wahser
>> Sent: Wednesday, October 12, 2016 4:06 PM
>> To: Knut Staring <knutst@xxxxxxxxx>; Jason Pickering <
>> jason.p.pickering@xxxxxxxxx>
>> Cc: dhis2-users@xxxxxxxxxxxxxxxxxxx
>> Subject: Re: [Dhis2-users] long period of analytics
>>
>> That's a good point, Jason: a few months ago we tried under a similar
>> hardware setting to run the system with around 350,000
>> categoryOptionCombinations.
>> Although it was possible to create all of the combinations (keeping the
>> process alive for 5 days), DHIS2 started to behave funny in a lot of ways
>> (analytics worked, though).
>>
>> Better try to split your categories into categoryOptionCombinations and
>> attributeOptionCombinations. Categories where you might add new options go
>> into categoryOptionCombinations; categories which will certainly not change
>> (e.g. age, gender) go into attributeOptionCombinations. That way you split the
>> possible combinations into two chunks. In our example we have 52,800
>> categoryOptionCombinations and 64,800 attributeOptionCombinations (which
>> runs without a problem), where we'd theoretically have over 3 billion
>> combinations if they were combined.
>>
>> For our next data cube we might try using anonymous events for these
>> data, as Jason suggested, which might get rid of the problem of pre-defined
>> cat-combinations.
>>
>> Regards, Uwe
>>
>> > Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote on 12 October 2016
>> > at 15:38:
>> >
>> >
>> > Just to confirm,
>> >
>> > _Categoryoptioncomboname: 437689 rows
>> >
>> > You would seem to have an extraordinary number of category combos. Is
>> > this correct?
>> >
>> > On Wed, Oct 12, 2016, 14:35 Knut Staring <knutst@xxxxxxxxx> wrote:
>> >
>> > > Maybe also test an upgrade?
>> > >
>> > > On 12 Oct 2016 14:22, "Uwe Wahser" <uwe@xxxxxxxxx> wrote:
>> > >
>> > > Hi Abdul,
>> > >
>> > > I saw that you posted this already a few months ago - without
>> > > success, as it seems. So let me just take a lucky guess, of what
>> > > else you could try (that's what I'd try):
>> > >
>> > > - in catalina.out, can you identify a specific phase of the
>> > > analytics run where most of the time is spent, or is everything
>> > > taking longer? Maybe with that detailed info the DEV team can help you
>> > > further.
>> > > - can you try to list the available database indexes in your
>> > > installation and compare with a fresh install of DHIS2.22 if any are
>> > > missing?
>> > > - can you try to rebuild/repair all the indexes in your installation
>> > > (never tried this in PostgreSQL myself, though)
>> > >
>> > > Don't know if this makes sense, but better than nothing :-)
>> > >
>> > > Regards, Uwe
>> > >
>> > >
>> > > > Abdul karim Jaafar <abdulkarim.jaafar87@xxxxxxxxx> wrote on
>> > > > 12 October 2016 at 12:22:
>> > > >
>> > > >
>> > > > Dear all,
>> > > >
>> > > > I’ve been facing a problem with the analytics run time for more than two
>> > > > months, and I haven’t found any solution for it.
>> > > >
>> > > > I tuned Postgres, and every week I run maintenance from
>> > > > Data Administration; there are also no problems in data integrity.
>> > > >
>> > > > But the analytics run time has gradually increased to more than 20
>> > > > hours.
>> > > >
>> > > > I don’t know what is increasing the time needed for the analytics
>> > > > table update.
>> > > >
>> > > > We have:
>> > > >
>> > > > Tracked entity instance: 97358
>> > > > Tracked entity data value: 1872790
>> > > > Tracked entity attribute data value: 1217314
>> > > > Data value: 614956
>> > > >
>> > > > I also have these row counts in the tables below:
>> > > >
>> > > > _Categoryoptioncomboname: 437689 rows
>> > > > Categorycombos_optioncombos: 669471 rows
>> > > >
>> > > > We host on our own server; kindly see the information below.
>> > > >
>> > > > DHIS2, database and hosting information:
>> > > >
>> > > > Version: 2.22
>> > > > Build revision: 22090
>> > > > Environment variable: DHIS2_HOME
>> > > > External configuration directory: /home/dhis/config
>> > > > File store provider: filesystem
>> > > > Database type: PostgreSQL
>> > > > Database name: dhis2
>> > > > Database user: dhis
>> > > > Java opts: -Xmx16000m -Xms8000m
>> > > > Java home: /usr/lib/jvm/java-8-oracle/jre
>> > > > Java temporary directory: /home/dhis/tomcat-dhis/temp
>> > > > Java version: 1.8.0_101
>> > > > Java vendor: Oracle Corporation
>> > > > OS name: Linux
>> > > > OS architecture: amd64
>> > > > OS version: 4.4.0-34-generic
>> > > > Server memory: Mem Total in JVM: 10237 Free in JVM: 8507 Max Limit: 14222
>> > > > CPU cores: 12
>> > > >
>> > > > Users information:
>> > > >
>> > > > There are: 63 users
>> > > > User roles: 13
>> > > >
>> > > > I think this volume of data is not large enough to affect the analytics.
>> > > >
>> > > > Any solutions, please?
>> > > >
>> > > > Best regards,
>> > > >
>> > > > _______________________________________________
>> > > > Mailing list: https://launchpad.net/~dhis2-users
>> > > > Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>> > > > Unsubscribe : https://launchpad.net/~dhis2-users
>> > > > More help   : https://help.launchpad.net/ListHelp
>> > >
>> > >
>>
>>
>>
>
>
> --
> Jason P. Pickering
> email: jason.p.pickering@xxxxxxxxx
> tel:+46764147049
>
>
>


-- 
Muhammad Abdul Hannan Khan
Secretary
HISP Bangladesh

T +880-2- 8816459, 8816412 ext 118
F +88 02 8813 875
M+88 01819 239 241
M+88 01534 312 066
E hannank@xxxxxxxxx
S hannan.khan.dhaka
B hannan-tech.blogspot.com
L https://bd.linkedin.com/in/hannankhan
