dhis2-devs team mailing list archive
-
dhis2-devs team
-
Mailing list archive
-
Message #47379
Re: [Dhis2-users] 25 hours in completing Analytic
Hi Calle,
Thanks for all this.
Yes, we will have a test server available with us by first week of
November, I can start test scenarios suggested by you after that.
Thanks,
Neeraj
On Mon, Oct 24, 2016 at 3:50 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
wrote:
> Neeraj,
>
> Thanks for that, it is useful info.
>
> The instance has a very skewed balance between data elements/catcombos and
> indicators, but what else is new - outside of South Africa, most countries
> have the same: they collect a huge number or data elements but turn very
> few of them into indicators (SA collect very very few data elements that
> are NOT part of any indicator). But that is a separate discussion.
>
> I would NOT expect the number of data elements & catcombos + the number of
> orgunits to have a significant impact on analytics performance, but that
> needs to be verified.
>
> I WOULD expect the number of indicators to have a significant impact, so
> if you doubled or quadrupled the number of indicators I would expect a
> significant impact on analytics time
>
> Using your and my numbers as a baseline, I would envision the following
> tests to be revealing:
>
> Baseline system 1: SSD, 4-core CPU: 100 mill values, 400 DEs, 250
> indicators, 30,000 OUs, few catcombos -> 1 hour for analytics processing
>
> Baseline system 2: SSD, 12-core CPU: 500 mill values, 3,500 DEs, 15,000
> OUs, 25 catcombos -> ~10 hours for analytics processing
>
> Test 1: Using system 1, reduce OUs to 15,000 - run analytics (hypothesis:
> no significant difference)
> Test 2: Using system 1, reduce indicators to 125 - run analytics, then to
> 0 - run analytics (hypothesis: analytics reduced to 45 min)
> Test 3: Using system 1, increase datavalues from 100mill to 500 mill by
> introducing additional attributecombos (easy) - run analytics (hypothesis:
> ~15 hours for analytics processing, with 4-cores instead of 12-cores)
> Test 4: Using system 1, increase datavalues to 1 billion - run analytics
> (hypothesis: 35 hours for analytics processing).
>
> Neeraj, I don't know if you have a similar sandbox server available for
> testing, but if you do:
>
> Test 5: Using system 2, increase number of indicators to 600 (just export
> the 200 you have, modify the names and uids a bit, and re-import - for the
> purpose of this test, it does not matter that many of the indicators have
> identical formulas). Hypothesis: Analytics time up at least 20%
> Test 6: using system 2 (with 200 indicators), shift half of the 500 mill
> data values forward or backward in time resulting in twice the number of
> analytics tables at half the size. (hypothesis: slight reduction in
> processing time)
> Test 7: Using system 2 (with 200 indicators), increase datavalues to 1
> billion using a new attributecombo - run analytics (hypothesis: analytics
> time up to 25 hours again).
>
> As far as I've seen, the bulk of analytics processing time is taken up by
> indexing - which I would expect to follow a moderately quadratic curve in
> terms of analytics table sizes.
>
> Best regards
> Calle
>
> On 24 October 2016 at 10:39, Neeraj Gupta <neeraj.hisp@xxxxxxxxx> wrote:
>
>> Calle,
>>
>> We have around 500 million record in database with 3666 data elements
>> having 26 category combinations and 201 indicators and there are 14398
>> organisation units.
>>
>> Thanks,
>> Neeraj
>>
>> On Mon, Oct 24, 2016 at 1:50 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
>> wrote:
>>
>>> Neeraj,
>>>
>>> You never stated the number of records you have in the datavalue table -
>>> what is it?
>>>
>>> In the same context: anybody have a rough idea of how many datavalue
>>> records there are in the global DATIM database - which I think currently
>>> might be the largest DHIS2 instance around?
>>>
>>> Given our own recent work on performance + what Neeraj has reported,
>>> I've been thinking of creating one test instance with let us say 500 mill
>>> datavalue records and another with let us say 1 billion, then use them to
>>> identify key bottlenecks in various processes AND use them to ensure that
>>> DHIS2 analytics performance is as linear as possible in terms of database
>>> size. Postgresql has introduced a number of new indexing algorithms in
>>> recent versions, and I'm not sure if DHIS2 is taking full advantage of them.
>>>
>>> Best regards
>>> Calle
>>>
>>> On 24 October 2016 at 07:53, Brajesh Murari <brajesh.murari@xxxxxxxxx>
>>> wrote:
>>>
>>>> Congratulation Neeraj and team ...it much appreciated
>>>>
>>>> On Mon, Oct 24, 2016 at 11:08 AM, Neeraj Gupta <neeraj.hisp@xxxxxxxxx>
>>>> wrote:
>>>>
>>>>> Dear Team, Thanks for all your suggestions.
>>>>>
>>>>> Now the time of analytic is reduced to 10 hours 41 minutes.
>>>>>
>>>>> We tried to VACUUM as Sam suggested but it didn't help then we
>>>>> upgraded postgres from 9.4 to 9.5.4 and as Calle and Bob suggested we made
>>>>> some changes in configuration file of postgres and it reduced the time. But
>>>>> the database size is still same.
>>>>>
>>>>> Thanks for all your help!
>>>>>
>>>>> Thanks,
>>>>> Neeraj
>>>>>
>>>>> On Wed, Oct 19, 2016 at 6:46 PM, Calle Hedberg <
>>>>> calle.hedberg@xxxxxxxxx> wrote:
>>>>>
>>>>>> Neeraj,
>>>>>>
>>>>>> It's always an element of uncertainty linked to database sizes - ref
>>>>>> Sam's post over. So indicating the number of records you have in the
>>>>>> datavalue table & key meta-data tables would be useful + indicating whether
>>>>>> you are running other instances on the same server. Some comments - I've
>>>>>> been doing a lot of similar optimising work recently:
>>>>>>
>>>>>> 1. Upgrading to 9.5.4 is strongly recommended (and don't use 9.6
>>>>>> before the worst bugs are fixed and it has stabilised).
>>>>>>
>>>>>> 2. Carefully check your postgres.conf against the recommended
>>>>>> settings. The guide is a bit superficial in the sense that it has
>>>>>> recommended "fixed" values only and no explanations around ranges below or
>>>>>> above those, but you can experiment a bit yourself (e.g. the recommended
>>>>>> "max_connections = 200" might not be sufficient for a really large system
>>>>>> like what you have.
>>>>>>
>>>>>> 3. If your server is running that single instance only, then 48GB or
>>>>>> RAM should be sufficient. Our servers are all having 128GB RAM so we
>>>>>> experimented quite a bit earlier this year with giving a DHIS2 instance
>>>>>> large amounts or RAM (up to 60-70gb), with negligible impact on
>>>>>> performance. According to Lars, the DHIS2 cannot really utilize more than
>>>>>> around 16gb RAM (at least that is how I understood his communication at the
>>>>>> time). So 48GB should be sufficient for a single instance.
>>>>>>
>>>>>> 4. I've been doing performance optimizing recently on an instance
>>>>>> with
>>>>>> - 4-core server with 2x 512gb ssd, 12gb allocated to DHIS2
>>>>>> - 31,000 Orgunits
>>>>>> - 420 data elements
>>>>>> - 250 indicators
>>>>>> - around 100 mill datavalue records
>>>>>> - total size around 140gb with analytics tables.
>>>>>>
>>>>>> So the size is only 25% of your 500GB, but RUNNING ANALYTICS ON THAT
>>>>>> DATABASE INSTANCE IS TAKING JUST OVER 1 HOUR. Fundamentally, if the
>>>>>> analytics engine is designed well, I would expect a nearly linear
>>>>>> relationship between database size and the time analytics takes to run. So
>>>>>> running analytics on your database on our server should in theory take 4-5
>>>>>> hours.
>>>>>>
>>>>>> We are obviously comparing oranges and nectarines here, in the sense
>>>>>> that there might be other aspects of our server and database that is
>>>>>> different from yours (type of CPU, no of OUs, no of DEs/Indicators, whether
>>>>>> your instance have lots of tracker data, etc etc). I have not seen any
>>>>>> scientific/quantified comparative performance values related to specific
>>>>>> parameters like number of CPUs and/or number of cores, but 12 cores SHOULD
>>>>>> improve analytics performance quite a bit - assuming around 30% then it
>>>>>> means running analytics on your database/server should take around 3
>>>>>> hours......
>>>>>>
>>>>>> I tried getting comparative, quantitative data on various
>>>>>> configurations of hardware and software (e.g. some users prefer CentOS,
>>>>>> others Ubuntu) during the academy in August, but did not get much - it
>>>>>> seems most users/providers have found a setup that works for them for now
>>>>>> and nobody is doing any systematic performance testing (some of the
>>>>>> international NGOs/companies using DHIS2 might have, but as with internally
>>>>>> developed apps they are not that keen on sharing). So it would be highly
>>>>>> appreciated if you would post the results on analytics time with every
>>>>>> upgrade / tweak you do - starting with the upgrade to Pg 9.5.4
>>>>>>
>>>>>> Best regards
>>>>>> Calle
>>>>>>
>>>>>> On 19 October 2016 at 13:28, Sam Johnson <samuel.johnson@xxxxxxxxxx>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Neeraj,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *Using VACUUM and ANALYZE*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Like Brajesh, my background is MySQL, and one database admin task
>>>>>>> that is often overlooked in MySQL is OPTIMIZE TABLEs. This reclaims unused
>>>>>>> space (we’ve had 100Gb databases files drop to half their size) and
>>>>>>> refreshes index statistics (if the shape of your data has changed over
>>>>>>> time, this can make indices run faster).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I’m new to PostgreSQL, but the core principles are the same, and a
>>>>>>> quick bit of Googling shows that the equivalents in PostgreSQL are the
>>>>>>> VACUUM and ANALYZE commands. If your database isn’t set to automatically
>>>>>>> do VACUUMs (the default DHIS2 postgres config doesn’t seem to be), you
>>>>>>> might want to try VACUUM *FULL*, which will literally rewrite all
>>>>>>> of your database tables and indices into smaller, more efficient files
>>>>>>> (note, however, that on a 500Gb database this could take a *looong*
>>>>>>> time – perhaps test on a backup first?). The following forum post is a
>>>>>>> really nice, plain-English explanation of what VACUUM does:
>>>>>>>
>>>>>>> http://dba.stackexchange.com/questions/126258/what-is-table-
>>>>>>> bloating-in-databases
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> As I mentioned, my background is MySQL rather than Postgres, so
>>>>>>> someone with more specific Postgres experience might like to also chime in
>>>>>>> here.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cheers, Sam.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From: *Dhis2-users <dhis2-users-bounces+samuel.johnson=
>>>>>>> qebo.co.uk@xxxxxxxxxxxxxxxxxxx> on behalf of Brajesh Murari <
>>>>>>> brajesh.murari@xxxxxxxxx>
>>>>>>> *Date: *Wednesday, 19 October 2016 at 08:28
>>>>>>> *To: *Knut Staring <knutst@xxxxxxxxx>
>>>>>>> *Cc: *DHIS 2 Users list <dhis2-users@xxxxxxxxxxxxxxxxxxx>, DHIS2
>>>>>>> Developers <dhis2-devs@xxxxxxxxxxxxxxxxxxx>
>>>>>>> *Subject: *Re: [Dhis2-users] [Dhis2-devs] 25 hours in completing
>>>>>>> Analytic
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Dear Neeraj,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The physical database size doesn't matter much, even the number of
>>>>>>> records don't matter. In my experience the biggest problem that one can
>>>>>>> going to run in to is not size, but the number of queries you can handle at
>>>>>>> a time instance specially during analytic functionality execution. Most
>>>>>>> probably you should going to have to move to a master/slave configuration
>>>>>>> of your database, so that the read queries can run against the slaves and
>>>>>>> the write queries run against the master. However, if you and your database
>>>>>>> management team are not ready for this than, you can tweak your indexes for
>>>>>>> the queries you are running to speed up the response times. Also there is a
>>>>>>> lot of tweaking you can do to the network stack and kernel in Linux where
>>>>>>> MySQL Server has been installed that will help.Perhaps, I would focus first
>>>>>>> on your indexes, then have a server admin look at your OS, and if all that
>>>>>>> doesn't help it might be time to implement a master/slave configuration.
>>>>>>> The most important scalability factor is RAM. If the indexes of your tables
>>>>>>> fit into memory and your queries are highly optimized in analytic
>>>>>>> functionality, you can serve a reasonable amount of requests with a average
>>>>>>> machine. The number of records do matter, depending of how your tables look
>>>>>>> like. It's a difference to have a lot of varchar fields or only a couple of
>>>>>>> ints or longs. The physical size of the database matters as well, think of
>>>>>>> backups, for instance. Depending on your engine, your physical db files on
>>>>>>> grow, but don't shrink, for instance with innodb. So deleting a lot of
>>>>>>> rows, doesn't help to shrink your physical files. Thus the database size
>>>>>>> does matter. If you have more than one table with more than a million
>>>>>>> records, then performance starts indeed to degrade. Indexig is one of the
>>>>>>> important stand need to take care, If you hit one million records you will
>>>>>>> get performance problems, if the indices are not set right (for example no
>>>>>>> indices for fields in "WHERE statements" or "ON conditions" in joins). If
>>>>>>> you hit 10 million records, you will start to get performance problems even
>>>>>>> if you have all your indices right. Hardware upgrades - adding more memory
>>>>>>> and more processor power, especially memory - often help to reduce the most
>>>>>>> severe problems by increasing the performance again, at least to a certain
>>>>>>> degree.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 19, 2016 at 12:35 PM, Knut Staring <knutst@xxxxxxxxx>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Just a heads-up that there seems to be a JDBC issue with Postgres
>>>>>>> 9.6, so perhaps you should try upgrading to 9.5 first.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 19, 2016 at 8:58 AM, Lars Helge Øverland <lars@xxxxxxxxx>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi Neeraj,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> what usually helps to improve runtime is to improve/increase:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> - ssd (read and write speed)
>>>>>>>
>>>>>>> - number of CPUs
>>>>>>>
>>>>>>> - using latest postgresql (9.6 claims to have even better indexing
>>>>>>> performance
>>>>>>> <https://www.postgresql.org/docs/9.6/static/release-9-6.html> than
>>>>>>> 9.5)
>>>>>>>
>>>>>>> - tuning
>>>>>>> <https://dhis2.github.io/dhis2-docs/master/en/implementer/html/install_server_setup.html#install_postgresql_performance_tuning>
>>>>>>> of postgresql
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> regards,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Lars
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Lars Helge Øverland
>>>>>>>
>>>>>>> Lead developer, DHIS 2
>>>>>>>
>>>>>>> University of Oslo
>>>>>>>
>>>>>>> Skype: larshelgeoverland
>>>>>>>
>>>>>>> lars@xxxxxxxxx
>>>>>>>
>>>>>>> http://www.dhis2.org <https://www.dhis2.org/>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>>> Post to : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>>> More help : https://help.launchpad.net/ListHelp
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Knut Staring
>>>>>>>
>>>>>>> Dept. of Informatics, University of Oslo
>>>>>>>
>>>>>>> Norway: +4791880522
>>>>>>>
>>>>>>> Skype: knutstar
>>>>>>>
>>>>>>> http://dhis2.org
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>>>> Post to : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>>>> More help : https://help.launchpad.net/ListHelp
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Best Regards,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Brajesh Murari,
>>>>>>>
>>>>>>> Postgraduate, Department of Computer Science and Engineering,
>>>>>>>
>>>>>>> Chaudhary Devi Lal University, Sirsa,
>>>>>>>
>>>>>>> India.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The three basic dimensions of human development: a long and healthy
>>>>>>> life, access to knowledge, and a decent standard of living.
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>>>> Post to : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>>>> More help : https://help.launchpad.net/ListHelp
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> *******************************************
>>>>>>
>>>>>> Calle Hedberg
>>>>>>
>>>>>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>>>>>
>>>>>> Tel/fax (home): +27-21-685-6472
>>>>>>
>>>>>> Cell: +27-82-853-5352
>>>>>>
>>>>>> Iridium SatPhone: +8816-315-19119
>>>>>>
>>>>>> Email: calle.hedberg@xxxxxxxxx
>>>>>>
>>>>>> Skype: calle_hedberg
>>>>>>
>>>>>> *******************************************
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>> Post to : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>> More help : https://help.launchpad.net/ListHelp
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Neeraj Gupta
>>>>>
>>>>> _______________________________________________
>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>> Post to : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>> More help : https://help.launchpad.net/ListHelp
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>>
>>>> Brajesh Murari,
>>>> Postgraduate, Dept of CSE,
>>>> Chaudhary Devi Lal University, Sirsa,
>>>> India.
>>>>
>>>> The three basic dimensions of human development: a long and healthy
>>>> life, access to knowledge, and a decent standard of living.
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> *******************************************
>>>
>>> Calle Hedberg
>>>
>>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>>
>>> Tel/fax (home): +27-21-685-6472
>>>
>>> Cell: +27-82-853-5352
>>>
>>> Iridium SatPhone: +8816-315-19119
>>>
>>> Email: calle.hedberg@xxxxxxxxx
>>>
>>> Skype: calle_hedberg
>>>
>>> *******************************************
>>>
>>>
>>
>>
>> --
>> Thanks,
>> Neeraj Gupta
>>
>
>
>
> --
>
> *******************************************
>
> Calle Hedberg
>
> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>
> Tel/fax (home): +27-21-685-6472
>
> Cell: +27-82-853-5352
>
> Iridium SatPhone: +8816-315-19119
>
> Email: calle.hedberg@xxxxxxxxx
>
> Skype: calle_hedberg
>
> *******************************************
>
>
--
Thanks,
Neeraj Gupta
Follow ups
References
-
25 hours in completing Analytic
From: Neeraj Gupta, 2016-10-19
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Lars Helge Øverland, 2016-10-19
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Lars Helge Øverland, 2016-10-19
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Knut Staring, 2016-10-19
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Brajesh Murari, 2016-10-19
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Sam Johnson, 2016-10-19
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Calle Hedberg, 2016-10-19
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Neeraj Gupta, 2016-10-24
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Brajesh Murari, 2016-10-24
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Calle Hedberg, 2016-10-24
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Neeraj Gupta, 2016-10-24
-
Re: [Dhis2-users] 25 hours in completing Analytic
From: Calle Hedberg, 2016-10-24