← Back to team overview

dhis2-devs team mailing list archive

Re: [Dhis2-users] 25 hours in completing Analytic

 

Neeraj,

Thanks for that, it is useful info.

The instance has a very skewed balance between data elements/catcombos and
indicators, but what else is new - outside of South Africa, most countries
have the same: they collect a huge number or data elements but turn very
few of them into indicators (SA collect very very few data elements that
are NOT part of any indicator). But that is a separate discussion.

I would NOT expect the number of data elements & catcombos + the number of
orgunits to have a significant impact on analytics performance, but that
needs to be verified.

I WOULD expect the number of indicators to have a significant impact, so if
you doubled or quadrupled the number of indicators I would expect a
significant impact on analytics time

Using your and my numbers as a baseline, I would envision the following
tests to be revealing:

Baseline system 1: SSD, 4-core CPU: 100 mill values, 400 DEs, 250
indicators, 30,000 OUs, few catcombos -> 1 hour for analytics processing

Baseline system 2: SSD, 12-core CPU: 500 mill values, 3,500 DEs, 15,000
OUs, 25 catcombos -> ~10 hours for analytics processing

Test 1: Using system 1, reduce OUs to 15,000 - run analytics (hypothesis:
no significant difference)
Test 2: Using system 1, reduce indicators to 125 - run analytics, then to 0
- run analytics (hypothesis: analytics reduced to 45 min)
Test 3: Using system 1, increase datavalues from 100mill to 500 mill by
introducing additional attributecombos (easy) - run analytics (hypothesis:
~15 hours for analytics processing, with 4-cores instead of 12-cores)
Test 4: Using system 1, increase datavalues to 1 billion - run analytics
(hypothesis: 35 hours for analytics processing).

Neeraj, I don't know if you have a similar sandbox server available for
testing, but if you do:

Test 5: Using system 2, increase number of indicators to 600 (just export
the 200 you have, modify the names and uids a bit, and re-import - for the
purpose of this test, it does not matter that many of the indicators have
identical formulas). Hypothesis: Analytics time up at least 20%
Test 6: using system 2 (with 200 indicators), shift half of the 500 mill
data values forward or backward in time resulting in twice the number of
analytics tables at half the size. (hypothesis: slight reduction in
processing time)
Test 7: Using system 2 (with 200 indicators), increase datavalues to 1
billion using a new attributecombo - run analytics (hypothesis: analytics
time up to 25 hours again).

As far as I've seen, the bulk of analytics processing time is taken up by
indexing - which I would expect to follow a moderately quadratic curve in
terms of analytics table sizes.

Best regards
Calle

On 24 October 2016 at 10:39, Neeraj Gupta <neeraj.hisp@xxxxxxxxx> wrote:

> Calle,
>
> We have around 500 million record in database with 3666 data elements
> having 26 category combinations and 201 indicators and there are 14398
> organisation units.
>
> Thanks,
> Neeraj
>
> On Mon, Oct 24, 2016 at 1:50 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
> wrote:
>
>> Neeraj,
>>
>> You never stated the number of records you have in the datavalue table -
>> what is it?
>>
>> In the same context: anybody have a rough idea of how many datavalue
>> records there are in the global DATIM database - which I think currently
>> might be the largest DHIS2 instance around?
>>
>> Given our own recent work on performance + what Neeraj has reported, I've
>> been thinking of creating one test instance with let us say 500 mill
>> datavalue records and another with let us say 1 billion, then use them to
>> identify key bottlenecks in various processes AND use them to ensure that
>> DHIS2 analytics performance is as linear as possible in terms of database
>> size. Postgresql has introduced a number of new indexing algorithms in
>> recent versions, and I'm not sure if DHIS2 is taking full advantage of them.
>>
>> Best regards
>> Calle
>>
>> On 24 October 2016 at 07:53, Brajesh Murari <brajesh.murari@xxxxxxxxx>
>> wrote:
>>
>>> Congratulation Neeraj and team ...it much appreciated
>>>
>>> On Mon, Oct 24, 2016 at 11:08 AM, Neeraj Gupta <neeraj.hisp@xxxxxxxxx>
>>> wrote:
>>>
>>>> Dear Team, Thanks for all your suggestions.
>>>>
>>>> Now the time of analytic is reduced to 10 hours 41 minutes.
>>>>
>>>> We tried to VACUUM as Sam suggested but it didn't help then we upgraded
>>>> postgres from 9.4 to 9.5.4 and as Calle and Bob suggested we made some
>>>> changes in configuration file of postgres and it reduced the time. But the
>>>> database size is still same.
>>>>
>>>> Thanks for all your help!
>>>>
>>>> Thanks,
>>>> Neeraj
>>>>
>>>> On Wed, Oct 19, 2016 at 6:46 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx
>>>> > wrote:
>>>>
>>>>> Neeraj,
>>>>>
>>>>> It's always an element of uncertainty linked to database sizes - ref
>>>>> Sam's post over. So indicating the number of records you have in the
>>>>> datavalue table & key meta-data tables would be useful + indicating whether
>>>>> you are running other instances on the same server. Some comments - I've
>>>>> been doing a lot of similar optimising work recently:
>>>>>
>>>>> 1. Upgrading to 9.5.4 is strongly recommended (and don't use 9.6
>>>>> before the worst bugs are fixed and it has stabilised).
>>>>>
>>>>> 2. Carefully check your postgres.conf against the recommended
>>>>> settings. The guide is a bit superficial in the sense that it has
>>>>> recommended "fixed" values only and no explanations around ranges below or
>>>>> above those, but you can experiment a bit yourself (e.g. the recommended
>>>>> "max_connections = 200" might not be sufficient for a really large system
>>>>> like what you have.
>>>>>
>>>>> 3. If your server is running that single instance only, then 48GB or
>>>>> RAM should be sufficient. Our servers are all having 128GB RAM so we
>>>>> experimented quite a bit earlier this year with giving a DHIS2 instance
>>>>> large amounts or RAM (up to 60-70gb), with negligible impact on
>>>>> performance. According to Lars, the DHIS2 cannot really utilize more than
>>>>> around 16gb RAM (at least that is how I understood his communication at the
>>>>> time). So 48GB should be sufficient for a single instance.
>>>>>
>>>>> 4. I've been doing performance optimizing recently on an instance with
>>>>> - 4-core server with 2x 512gb ssd, 12gb allocated to DHIS2
>>>>> - 31,000 Orgunits
>>>>> - 420 data elements
>>>>> - 250 indicators
>>>>> - around 100 mill datavalue records
>>>>> - total size around 140gb with analytics tables.
>>>>>
>>>>> So the size is only 25% of your 500GB, but RUNNING ANALYTICS ON THAT
>>>>> DATABASE INSTANCE IS TAKING JUST OVER 1 HOUR. Fundamentally, if the
>>>>> analytics engine is designed well, I would expect a nearly linear
>>>>> relationship between database size and the time analytics takes to run. So
>>>>> running analytics on your database on our server should in theory take 4-5
>>>>> hours.
>>>>>
>>>>> We are obviously comparing oranges and nectarines here, in the sense
>>>>> that there might be other aspects of our server and database that is
>>>>> different from yours (type of CPU, no of OUs, no of DEs/Indicators, whether
>>>>> your instance have lots of tracker data, etc etc). I have not seen any
>>>>> scientific/quantified comparative performance values related to specific
>>>>> parameters like number of CPUs and/or number of cores, but 12 cores SHOULD
>>>>> improve analytics performance quite a bit - assuming around 30% then it
>>>>> means running analytics on your database/server should take around 3
>>>>> hours......
>>>>>
>>>>> I tried getting comparative, quantitative data on various
>>>>> configurations of hardware and software (e.g. some users prefer CentOS,
>>>>> others Ubuntu) during the academy in August, but did not get much - it
>>>>> seems most users/providers have found a setup that works for them for now
>>>>> and nobody is doing any systematic performance testing (some of the
>>>>> international NGOs/companies using DHIS2 might have, but as with internally
>>>>> developed apps they are not that keen on sharing). So it would be highly
>>>>> appreciated if you would post the results on analytics time with every
>>>>> upgrade / tweak you do - starting with the upgrade to Pg 9.5.4
>>>>>
>>>>> Best regards
>>>>> Calle
>>>>>
>>>>> On 19 October 2016 at 13:28, Sam Johnson <samuel.johnson@xxxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>>> Hi Neeraj,
>>>>>>
>>>>>>
>>>>>>
>>>>>> *Using VACUUM and ANALYZE*
>>>>>>
>>>>>>
>>>>>>
>>>>>> Like Brajesh, my background is MySQL, and one database admin task
>>>>>> that is often overlooked in MySQL is OPTIMIZE TABLEs.  This reclaims unused
>>>>>> space (we’ve had 100Gb databases files drop to half their size) and
>>>>>> refreshes index statistics (if the shape of your data has changed over
>>>>>> time, this can make indices run faster).
>>>>>>
>>>>>>
>>>>>>
>>>>>> I’m new to PostgreSQL, but the core principles are the same, and a
>>>>>> quick bit of Googling shows that the equivalents in PostgreSQL are the
>>>>>> VACUUM and ANALYZE commands.  If your database isn’t set to automatically
>>>>>> do VACUUMs (the default DHIS2 postgres config doesn’t seem to be), you
>>>>>> might want to try VACUUM *FULL*, which will literally rewrite all of
>>>>>> your database tables and indices into smaller, more efficient files (note,
>>>>>> however, that on a 500Gb database this could take a *looong* time –
>>>>>> perhaps test on a backup first?).  The following forum post is a really
>>>>>> nice, plain-English explanation of what VACUUM does:
>>>>>>
>>>>>> http://dba.stackexchange.com/questions/126258/what-is-table-
>>>>>> bloating-in-databases
>>>>>>
>>>>>>
>>>>>>
>>>>>> As I mentioned, my background is MySQL rather than Postgres, so
>>>>>> someone with more specific Postgres experience might like to also chime in
>>>>>> here.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Cheers, Sam.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From: *Dhis2-users <dhis2-users-bounces+samuel.johnson=
>>>>>> qebo.co.uk@xxxxxxxxxxxxxxxxxxx> on behalf of Brajesh Murari <
>>>>>> brajesh.murari@xxxxxxxxx>
>>>>>> *Date: *Wednesday, 19 October 2016 at 08:28
>>>>>> *To: *Knut Staring <knutst@xxxxxxxxx>
>>>>>> *Cc: *DHIS 2 Users list <dhis2-users@xxxxxxxxxxxxxxxxxxx>, DHIS2
>>>>>> Developers <dhis2-devs@xxxxxxxxxxxxxxxxxxx>
>>>>>> *Subject: *Re: [Dhis2-users] [Dhis2-devs] 25 hours in completing
>>>>>> Analytic
>>>>>>
>>>>>>
>>>>>>
>>>>>> Dear Neeraj,
>>>>>>
>>>>>>
>>>>>>
>>>>>> The physical database size doesn't matter much, even the number of
>>>>>> records don't matter. In my experience the biggest problem that one can
>>>>>> going to run in to is not size, but the number of queries you can handle at
>>>>>> a time instance specially during analytic functionality execution. Most
>>>>>> probably you should going to have to move to a master/slave configuration
>>>>>> of your database, so that the read queries can run against the slaves and
>>>>>> the write queries run against the master. However, if you and your database
>>>>>> management team are not ready for this than, you can tweak your indexes for
>>>>>> the queries you are running to speed up the response times. Also there is a
>>>>>> lot of tweaking you can do to the network stack and kernel in Linux where
>>>>>> MySQL Server has been installed that will help.Perhaps, I would focus first
>>>>>> on your indexes, then have a server admin look at your OS, and if all that
>>>>>> doesn't help it might be time to implement a master/slave configuration.
>>>>>> The most important scalability factor is RAM. If the indexes of your tables
>>>>>> fit into memory and your queries are highly optimized in analytic
>>>>>> functionality, you can serve a reasonable amount of requests with a average
>>>>>> machine. The number of records do matter, depending of how your tables look
>>>>>> like. It's a difference to have a lot of varchar fields or only a couple of
>>>>>> ints or longs. The physical size of the database matters as well,  think of
>>>>>> backups, for instance. Depending on your engine, your physical db files on
>>>>>> grow, but don't shrink, for instance with innodb. So deleting a lot of
>>>>>> rows, doesn't help to shrink your physical files. Thus the database size
>>>>>> does matter. If you have more than one table with more than a million
>>>>>> records, then performance starts indeed to degrade. Indexig is one of the
>>>>>> important stand need to take care, If you hit one million records you will
>>>>>> get performance problems, if the indices are not set right (for example no
>>>>>> indices for fields in "WHERE statements" or "ON conditions" in joins). If
>>>>>> you hit 10 million records, you will start to get performance problems even
>>>>>> if you have all your indices right. Hardware upgrades - adding more memory
>>>>>> and more processor power, especially memory - often help to reduce the most
>>>>>> severe problems by increasing the performance again, at least to a certain
>>>>>> degree.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 19, 2016 at 12:35 PM, Knut Staring <knutst@xxxxxxxxx>
>>>>>> wrote:
>>>>>>
>>>>>> Just a heads-up that there seems to be a JDBC issue with Postgres
>>>>>> 9.6, so perhaps you should try upgrading to 9.5 first.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 19, 2016 at 8:58 AM, Lars Helge Øverland <lars@xxxxxxxxx>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Neeraj,
>>>>>>
>>>>>>
>>>>>>
>>>>>> what usually helps to improve runtime is to improve/increase:
>>>>>>
>>>>>>
>>>>>>
>>>>>> - ssd (read and write speed)
>>>>>>
>>>>>> - number of CPUs
>>>>>>
>>>>>> - using latest postgresql (9.6 claims to have even better indexing
>>>>>> performance
>>>>>> <https://www.postgresql.org/docs/9.6/static/release-9-6.html> than
>>>>>> 9.5)
>>>>>>
>>>>>> - tuning
>>>>>> <https://dhis2.github.io/dhis2-docs/master/en/implementer/html/install_server_setup.html#install_postgresql_performance_tuning>
>>>>>> of postgresql
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> regards,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Lars
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Lars Helge Øverland
>>>>>>
>>>>>> Lead developer, DHIS 2
>>>>>>
>>>>>> University of Oslo
>>>>>>
>>>>>> Skype: larshelgeoverland
>>>>>>
>>>>>> lars@xxxxxxxxx
>>>>>>
>>>>>> http://www.dhis2.org <https://www.dhis2.org/>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Knut Staring
>>>>>>
>>>>>> Dept. of Informatics, University of Oslo
>>>>>>
>>>>>> Norway: +4791880522
>>>>>>
>>>>>> Skype: knutstar
>>>>>>
>>>>>> http://dhis2.org
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Best Regards,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Brajesh Murari,
>>>>>>
>>>>>> Postgraduate, Department of Computer Science and Engineering,
>>>>>>
>>>>>> Chaudhary Devi Lal University, Sirsa,
>>>>>>
>>>>>> India.
>>>>>>
>>>>>>
>>>>>>
>>>>>> The three basic dimensions of human development: a long and healthy
>>>>>> life, access to knowledge, and a decent standard of living.
>>>>>>
>>>>>> _______________________________________________
>>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> *******************************************
>>>>>
>>>>> Calle Hedberg
>>>>>
>>>>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>>>>
>>>>> Tel/fax (home): +27-21-685-6472
>>>>>
>>>>> Cell: +27-82-853-5352
>>>>>
>>>>> Iridium SatPhone: +8816-315-19119
>>>>>
>>>>> Email: calle.hedberg@xxxxxxxxx
>>>>>
>>>>> Skype: calle_hedberg
>>>>>
>>>>> *******************************************
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Neeraj Gupta
>>>>
>>>> _______________________________________________
>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>> More help   : https://help.launchpad.net/ListHelp
>>>>
>>>>
>>>
>>>
>>> --
>>> Best Regards,
>>>
>>> Brajesh Murari,
>>> Postgraduate, Dept of CSE,
>>> Chaudhary Devi Lal University, Sirsa,
>>> India.
>>>
>>> The three basic dimensions of human development: a long and healthy
>>> life, access to knowledge, and a decent standard of living.
>>>
>>
>>
>>
>> --
>>
>> *******************************************
>>
>> Calle Hedberg
>>
>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>
>> Tel/fax (home): +27-21-685-6472
>>
>> Cell: +27-82-853-5352
>>
>> Iridium SatPhone: +8816-315-19119
>>
>> Email: calle.hedberg@xxxxxxxxx
>>
>> Skype: calle_hedberg
>>
>> *******************************************
>>
>>
>
>
> --
> Thanks,
> Neeraj Gupta
>



-- 

*******************************************

Calle Hedberg

46D Alma Road, 7700 Rosebank, SOUTH AFRICA

Tel/fax (home): +27-21-685-6472

Cell: +27-82-853-5352

Iridium SatPhone: +8816-315-19119

Email: calle.hedberg@xxxxxxxxx

Skype: calle_hedberg

*******************************************

Follow ups

References