← Back to team overview

dhis2-users team mailing list archive

Re: [Dhis2-devs] 25 hours in completing Analytic

 

Hi there,

the analytics table generation is bound by db indexing speed / disk write
speed. It scales almost linearly with the number of CPU cores available and
disk write speed.

regards,

Lars



On Mon, Oct 24, 2016 at 7:33 AM, Neeraj Gupta <neeraj.hisp@xxxxxxxxx> wrote:

> Hi Calle,
>
> Thanks for all this.
>
> Yes, we will have a test server available with us by first week of
> November, I can start test scenarios suggested by you after that.
>
> Thanks,
> Neeraj
>
> On Mon, Oct 24, 2016 at 3:50 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
> wrote:
>
>> Neeraj,
>>
>> Thanks for that, it is useful info.
>>
>> The instance has a very skewed balance between data elements/catcombos
>> and indicators, but what else is new - outside of South Africa, most
>> countries have the same: they collect a huge number or data elements but
>> turn very few of them into indicators (SA collect very very few data
>> elements that are NOT part of any indicator). But that is a separate
>> discussion.
>>
>> I would NOT expect the number of data elements & catcombos + the number
>> of orgunits to have a significant impact on analytics performance, but that
>> needs to be verified.
>>
>> I WOULD expect the number of indicators to have a significant impact, so
>> if you doubled or quadrupled the number of indicators I would expect a
>> significant impact on analytics time
>>
>> Using your and my numbers as a baseline, I would envision the following
>> tests to be revealing:
>>
>> Baseline system 1: SSD, 4-core CPU: 100 mill values, 400 DEs, 250
>> indicators, 30,000 OUs, few catcombos -> 1 hour for analytics processing
>>
>> Baseline system 2: SSD, 12-core CPU: 500 mill values, 3,500 DEs, 15,000
>> OUs, 25 catcombos -> ~10 hours for analytics processing
>>
>> Test 1: Using system 1, reduce OUs to 15,000 - run analytics (hypothesis:
>> no significant difference)
>> Test 2: Using system 1, reduce indicators to 125 - run analytics, then to
>> 0 - run analytics (hypothesis: analytics reduced to 45 min)
>> Test 3: Using system 1, increase datavalues from 100mill to 500 mill by
>> introducing additional attributecombos (easy) - run analytics (hypothesis:
>> ~15 hours for analytics processing, with 4-cores instead of 12-cores)
>> Test 4: Using system 1, increase datavalues to 1 billion - run analytics
>> (hypothesis: 35 hours for analytics processing).
>>
>> Neeraj, I don't know if you have a similar sandbox server available for
>> testing, but if you do:
>>
>> Test 5: Using system 2, increase number of indicators to 600 (just export
>> the 200 you have, modify the names and uids a bit, and re-import - for the
>> purpose of this test, it does not matter that many of the indicators have
>> identical formulas). Hypothesis: Analytics time up at least 20%
>> Test 6: using system 2 (with 200 indicators), shift half of the 500 mill
>> data values forward or backward in time resulting in twice the number of
>> analytics tables at half the size. (hypothesis: slight reduction in
>> processing time)
>> Test 7: Using system 2 (with 200 indicators), increase datavalues to 1
>> billion using a new attributecombo - run analytics (hypothesis: analytics
>> time up to 25 hours again).
>>
>> As far as I've seen, the bulk of analytics processing time is taken up by
>> indexing - which I would expect to follow a moderately quadratic curve in
>> terms of analytics table sizes.
>>
>> Best regards
>> Calle
>>
>> On 24 October 2016 at 10:39, Neeraj Gupta <neeraj.hisp@xxxxxxxxx> wrote:
>>
>>> Calle,
>>>
>>> We have around 500 million record in database with 3666 data elements
>>> having 26 category combinations and 201 indicators and there are 14398
>>> organisation units.
>>>
>>> Thanks,
>>> Neeraj
>>>
>>> On Mon, Oct 24, 2016 at 1:50 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
>>> wrote:
>>>
>>>> Neeraj,
>>>>
>>>> You never stated the number of records you have in the datavalue table
>>>> - what is it?
>>>>
>>>> In the same context: anybody have a rough idea of how many datavalue
>>>> records there are in the global DATIM database - which I think currently
>>>> might be the largest DHIS2 instance around?
>>>>
>>>> Given our own recent work on performance + what Neeraj has reported,
>>>> I've been thinking of creating one test instance with let us say 500 mill
>>>> datavalue records and another with let us say 1 billion, then use them to
>>>> identify key bottlenecks in various processes AND use them to ensure that
>>>> DHIS2 analytics performance is as linear as possible in terms of database
>>>> size. Postgresql has introduced a number of new indexing algorithms in
>>>> recent versions, and I'm not sure if DHIS2 is taking full advantage of them.
>>>>
>>>> Best regards
>>>> Calle
>>>>
>>>> On 24 October 2016 at 07:53, Brajesh Murari <brajesh.murari@xxxxxxxxx>
>>>> wrote:
>>>>
>>>>> Congratulation Neeraj and team ...it much appreciated
>>>>>
>>>>> On Mon, Oct 24, 2016 at 11:08 AM, Neeraj Gupta <neeraj.hisp@xxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>>> Dear Team, Thanks for all your suggestions.
>>>>>>
>>>>>> Now the time of analytic is reduced to 10 hours 41 minutes.
>>>>>>
>>>>>> We tried to VACUUM as Sam suggested but it didn't help then we
>>>>>> upgraded postgres from 9.4 to 9.5.4 and as Calle and Bob suggested we made
>>>>>> some changes in configuration file of postgres and it reduced the time. But
>>>>>> the database size is still same.
>>>>>>
>>>>>> Thanks for all your help!
>>>>>>
>>>>>> Thanks,
>>>>>> Neeraj
>>>>>>
>>>>>> On Wed, Oct 19, 2016 at 6:46 PM, Calle Hedberg <
>>>>>> calle.hedberg@xxxxxxxxx> wrote:
>>>>>>
>>>>>>> Neeraj,
>>>>>>>
>>>>>>> It's always an element of uncertainty linked to database sizes - ref
>>>>>>> Sam's post over. So indicating the number of records you have in the
>>>>>>> datavalue table & key meta-data tables would be useful + indicating whether
>>>>>>> you are running other instances on the same server. Some comments - I've
>>>>>>> been doing a lot of similar optimising work recently:
>>>>>>>
>>>>>>> 1. Upgrading to 9.5.4 is strongly recommended (and don't use 9.6
>>>>>>> before the worst bugs are fixed and it has stabilised).
>>>>>>>
>>>>>>> 2. Carefully check your postgres.conf against the recommended
>>>>>>> settings. The guide is a bit superficial in the sense that it has
>>>>>>> recommended "fixed" values only and no explanations around ranges below or
>>>>>>> above those, but you can experiment a bit yourself (e.g. the recommended
>>>>>>> "max_connections = 200" might not be sufficient for a really large system
>>>>>>> like what you have.
>>>>>>>
>>>>>>> 3. If your server is running that single instance only, then 48GB or
>>>>>>> RAM should be sufficient. Our servers are all having 128GB RAM so we
>>>>>>> experimented quite a bit earlier this year with giving a DHIS2 instance
>>>>>>> large amounts or RAM (up to 60-70gb), with negligible impact on
>>>>>>> performance. According to Lars, the DHIS2 cannot really utilize more than
>>>>>>> around 16gb RAM (at least that is how I understood his communication at the
>>>>>>> time). So 48GB should be sufficient for a single instance.
>>>>>>>
>>>>>>> 4. I've been doing performance optimizing recently on an instance
>>>>>>> with
>>>>>>> - 4-core server with 2x 512gb ssd, 12gb allocated to DHIS2
>>>>>>> - 31,000 Orgunits
>>>>>>> - 420 data elements
>>>>>>> - 250 indicators
>>>>>>> - around 100 mill datavalue records
>>>>>>> - total size around 140gb with analytics tables.
>>>>>>>
>>>>>>> So the size is only 25% of your 500GB, but RUNNING ANALYTICS ON THAT
>>>>>>> DATABASE INSTANCE IS TAKING JUST OVER 1 HOUR. Fundamentally, if the
>>>>>>> analytics engine is designed well, I would expect a nearly linear
>>>>>>> relationship between database size and the time analytics takes to run. So
>>>>>>> running analytics on your database on our server should in theory take 4-5
>>>>>>> hours.
>>>>>>>
>>>>>>> We are obviously comparing oranges and nectarines here, in the sense
>>>>>>> that there might be other aspects of our server and database that is
>>>>>>> different from yours (type of CPU, no of OUs, no of DEs/Indicators, whether
>>>>>>> your instance have lots of tracker data, etc etc). I have not seen any
>>>>>>> scientific/quantified comparative performance values related to specific
>>>>>>> parameters like number of CPUs and/or number of cores, but 12 cores SHOULD
>>>>>>> improve analytics performance quite a bit - assuming around 30% then it
>>>>>>> means running analytics on your database/server should take around 3
>>>>>>> hours......
>>>>>>>
>>>>>>> I tried getting comparative, quantitative data on various
>>>>>>> configurations of hardware and software (e.g. some users prefer CentOS,
>>>>>>> others Ubuntu) during the academy in August, but did not get much - it
>>>>>>> seems most users/providers have found a setup that works for them for now
>>>>>>> and nobody is doing any systematic performance testing (some of the
>>>>>>> international NGOs/companies using DHIS2 might have, but as with internally
>>>>>>> developed apps they are not that keen on sharing). So it would be highly
>>>>>>> appreciated if you would post the results on analytics time with every
>>>>>>> upgrade / tweak you do - starting with the upgrade to Pg 9.5.4
>>>>>>>
>>>>>>> Best regards
>>>>>>> Calle
>>>>>>>
>>>>>>> On 19 October 2016 at 13:28, Sam Johnson <samuel.johnson@xxxxxxxxxx>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Neeraj,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *Using VACUUM and ANALYZE*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Like Brajesh, my background is MySQL, and one database admin task
>>>>>>>> that is often overlooked in MySQL is OPTIMIZE TABLEs.  This reclaims unused
>>>>>>>> space (we’ve had 100Gb databases files drop to half their size) and
>>>>>>>> refreshes index statistics (if the shape of your data has changed over
>>>>>>>> time, this can make indices run faster).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I’m new to PostgreSQL, but the core principles are the same, and a
>>>>>>>> quick bit of Googling shows that the equivalents in PostgreSQL are the
>>>>>>>> VACUUM and ANALYZE commands.  If your database isn’t set to automatically
>>>>>>>> do VACUUMs (the default DHIS2 postgres config doesn’t seem to be), you
>>>>>>>> might want to try VACUUM *FULL*, which will literally rewrite all
>>>>>>>> of your database tables and indices into smaller, more efficient files
>>>>>>>> (note, however, that on a 500Gb database this could take a *looong*
>>>>>>>> time – perhaps test on a backup first?).  The following forum post is a
>>>>>>>> really nice, plain-English explanation of what VACUUM does:
>>>>>>>>
>>>>>>>> http://dba.stackexchange.com/questions/126258/what-is-table-
>>>>>>>> bloating-in-databases
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> As I mentioned, my background is MySQL rather than Postgres, so
>>>>>>>> someone with more specific Postgres experience might like to also chime in
>>>>>>>> here.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Cheers, Sam.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *From: *Dhis2-users <dhis2-users-bounces+samuel.johnson=
>>>>>>>> qebo.co.uk@xxxxxxxxxxxxxxxxxxx> on behalf of Brajesh Murari <
>>>>>>>> brajesh.murari@xxxxxxxxx>
>>>>>>>> *Date: *Wednesday, 19 October 2016 at 08:28
>>>>>>>> *To: *Knut Staring <knutst@xxxxxxxxx>
>>>>>>>> *Cc: *DHIS 2 Users list <dhis2-users@xxxxxxxxxxxxxxxxxxx>, DHIS2
>>>>>>>> Developers <dhis2-devs@xxxxxxxxxxxxxxxxxxx>
>>>>>>>> *Subject: *Re: [Dhis2-users] [Dhis2-devs] 25 hours in completing
>>>>>>>> Analytic
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Dear Neeraj,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The physical database size doesn't matter much, even the number of
>>>>>>>> records don't matter. In my experience the biggest problem that one can
>>>>>>>> going to run in to is not size, but the number of queries you can handle at
>>>>>>>> a time instance specially during analytic functionality execution. Most
>>>>>>>> probably you should going to have to move to a master/slave configuration
>>>>>>>> of your database, so that the read queries can run against the slaves and
>>>>>>>> the write queries run against the master. However, if you and your database
>>>>>>>> management team are not ready for this than, you can tweak your indexes for
>>>>>>>> the queries you are running to speed up the response times. Also there is a
>>>>>>>> lot of tweaking you can do to the network stack and kernel in Linux where
>>>>>>>> MySQL Server has been installed that will help.Perhaps, I would focus first
>>>>>>>> on your indexes, then have a server admin look at your OS, and if all that
>>>>>>>> doesn't help it might be time to implement a master/slave configuration.
>>>>>>>> The most important scalability factor is RAM. If the indexes of your tables
>>>>>>>> fit into memory and your queries are highly optimized in analytic
>>>>>>>> functionality, you can serve a reasonable amount of requests with a average
>>>>>>>> machine. The number of records do matter, depending of how your tables look
>>>>>>>> like. It's a difference to have a lot of varchar fields or only a couple of
>>>>>>>> ints or longs. The physical size of the database matters as well,  think of
>>>>>>>> backups, for instance. Depending on your engine, your physical db files on
>>>>>>>> grow, but don't shrink, for instance with innodb. So deleting a lot of
>>>>>>>> rows, doesn't help to shrink your physical files. Thus the database size
>>>>>>>> does matter. If you have more than one table with more than a million
>>>>>>>> records, then performance starts indeed to degrade. Indexig is one of the
>>>>>>>> important stand need to take care, If you hit one million records you will
>>>>>>>> get performance problems, if the indices are not set right (for example no
>>>>>>>> indices for fields in "WHERE statements" or "ON conditions" in joins). If
>>>>>>>> you hit 10 million records, you will start to get performance problems even
>>>>>>>> if you have all your indices right. Hardware upgrades - adding more memory
>>>>>>>> and more processor power, especially memory - often help to reduce the most
>>>>>>>> severe problems by increasing the performance again, at least to a certain
>>>>>>>> degree.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 19, 2016 at 12:35 PM, Knut Staring <knutst@xxxxxxxxx>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Just a heads-up that there seems to be a JDBC issue with Postgres
>>>>>>>> 9.6, so perhaps you should try upgrading to 9.5 first.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 19, 2016 at 8:58 AM, Lars Helge Øverland <
>>>>>>>> lars@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Neeraj,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> what usually helps to improve runtime is to improve/increase:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> - ssd (read and write speed)
>>>>>>>>
>>>>>>>> - number of CPUs
>>>>>>>>
>>>>>>>> - using latest postgresql (9.6 claims to have even better indexing
>>>>>>>> performance
>>>>>>>> <https://www.postgresql.org/docs/9.6/static/release-9-6.html> than
>>>>>>>> 9.5)
>>>>>>>>
>>>>>>>> - tuning
>>>>>>>> <https://dhis2.github.io/dhis2-docs/master/en/implementer/html/install_server_setup.html#install_postgresql_performance_tuning>
>>>>>>>> of postgresql
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> regards,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Lars
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Lars Helge Øverland
>>>>>>>>
>>>>>>>> Lead developer, DHIS 2
>>>>>>>>
>>>>>>>> University of Oslo
>>>>>>>>
>>>>>>>> Skype: larshelgeoverland
>>>>>>>>
>>>>>>>> lars@xxxxxxxxx
>>>>>>>>
>>>>>>>> http://www.dhis2.org <https://www.dhis2.org/>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Knut Staring
>>>>>>>>
>>>>>>>> Dept. of Informatics, University of Oslo
>>>>>>>>
>>>>>>>> Norway: +4791880522
>>>>>>>>
>>>>>>>> Skype: knutstar
>>>>>>>>
>>>>>>>> http://dhis2.org
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>>>>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Brajesh Murari,
>>>>>>>>
>>>>>>>> Postgraduate, Department of Computer Science and Engineering,
>>>>>>>>
>>>>>>>> Chaudhary Devi Lal University, Sirsa,
>>>>>>>>
>>>>>>>> India.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The three basic dimensions of human development: a long and healthy
>>>>>>>> life, access to knowledge, and a decent standard of living.
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>>>>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> *******************************************
>>>>>>>
>>>>>>> Calle Hedberg
>>>>>>>
>>>>>>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>>>>>>
>>>>>>> Tel/fax (home): +27-21-685-6472
>>>>>>>
>>>>>>> Cell: +27-82-853-5352
>>>>>>>
>>>>>>> Iridium SatPhone: +8816-315-19119
>>>>>>>
>>>>>>> Email: calle.hedberg@xxxxxxxxx
>>>>>>>
>>>>>>> Skype: calle_hedberg
>>>>>>>
>>>>>>> *******************************************
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>> Neeraj Gupta
>>>>>>
>>>>>> _______________________________________________
>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>>
>>>>> Brajesh Murari,
>>>>> Postgraduate, Dept of CSE,
>>>>> Chaudhary Devi Lal University, Sirsa,
>>>>> India.
>>>>>
>>>>> The three basic dimensions of human development: a long and healthy
>>>>> life, access to knowledge, and a decent standard of living.
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> *******************************************
>>>>
>>>> Calle Hedberg
>>>>
>>>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>>>
>>>> Tel/fax (home): +27-21-685-6472
>>>>
>>>> Cell: +27-82-853-5352
>>>>
>>>> Iridium SatPhone: +8816-315-19119
>>>>
>>>> Email: calle.hedberg@xxxxxxxxx
>>>>
>>>> Skype: calle_hedberg
>>>>
>>>> *******************************************
>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Neeraj Gupta
>>>
>>
>>
>>
>> --
>>
>> *******************************************
>>
>> Calle Hedberg
>>
>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>
>> Tel/fax (home): +27-21-685-6472
>>
>> Cell: +27-82-853-5352
>>
>> Iridium SatPhone: +8816-315-19119
>>
>> Email: calle.hedberg@xxxxxxxxx
>>
>> Skype: calle_hedberg
>>
>> *******************************************
>>
>>
>
>
> --
> Thanks,
> Neeraj Gupta
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-users
> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-users
> More help   : https://help.launchpad.net/ListHelp
>
>


-- 
Lars Helge Øverland
Lead developer, DHIS 2
University of Oslo
Skype: larshelgeoverland
lars@xxxxxxxxx
http://www.dhis2.org <https://www.dhis2.org/>

References