
dhis2-devs team mailing list archive

Re: DHIS2 - Indicator calculation over dimensions

 

Thanks Jason. I was shying away from a report solution in favour of a more permanent one.

Extra operators or tools to extend the functionality of indicator calculations would improve what we currently have. A blueprint is probably in order.

 

………………………………………

Regards,

Dapo Adejumo

+2348033683677

Skype : dapojorge

 

From: Jason Pickering [mailto:jason.p.pickering@xxxxxxxxx] 
Sent: 5 October, 2014 2:27 PM
To: Dapo Adejumo
Cc: Robin Martens; dhis2-users@xxxxxxxxxxxxxxxxxxx; dhis2-devs
Subject: Re: [Dhis2-devs] DHIS2 - Indicator calculation over dimensions

 

Hi Dapo,

 

I think it is related in that it shows how we cannot calculate more "complicated" indicators. In this case, it really comes down to having an operator like "IF" in spreadsheet programs, e.g.:

 

IF(logical_test, true_value, false_value)

 

In your case, it would be something like IF(BCG coverage < 50%, 1, 0)

 

The need for this operator has come up a few times, and it would be good to see how this could be improved. Even better would be the ability to support a standard scripting language/syntax which could be used to support different types of indicators which we may not have thought of yet.

 

I think the only solution would be some type of report which would pull all of the BCG values at the desired OU level, apply the formula above, and return the result (or of course, some sort of database script which could automatically calculate the value ). 
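As a rough illustration of the report/script approach described above, here is a minimal Python sketch. It assumes the per-facility BCG coverage values have already been pulled out of DHIS2 by some means; the facility names, the values and the function name are hypothetical and only serve to show the IF logic.

```python
def pct_facilities_below(coverage_by_facility, threshold=50.0):
    """Percentage of facilities whose coverage is below `threshold`."""
    if not coverage_by_facility:
        return None  # no facilities, so no meaningful percentage
    # Equivalent of IF(coverage < threshold, 1, 0) applied per facility.
    flags = [1 if cov < threshold else 0 for cov in coverage_by_facility.values()]
    return 100.0 * sum(flags) / len(flags)


# Hypothetical BCG coverage (in %) per health facility.
coverage = {
    "Facility A": 42.0,
    "Facility B": 75.5,
    "Facility C": 49.9,
    "Facility D": 50.0,
}

print(pct_facilities_below(coverage))  # 50.0
```

The sum of the flags plays the role of the numerator and the facility count the role of the denominator.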

 

Regards,

Jason

 

 

 

 

On Sun, Oct 5, 2014 at 11:55 AM, Dapo Adejumo <dapo_adejumo@xxxxxxxxx> wrote:

Hi Jason and Robin ( and Devs) ,

I decided to raise this question here since it is remotely related to the discussions below.

I want an indicator that has inputs beyond what is currently available for indicator definitions – for example

 

-        Percentage of Health facilities with BCG coverage below 50%

The number of health facilities can be pulled in using the orgunit count, but the challenge is the numerator (the number of health facilities with coverage below 50%). The BCG coverage is calculated as a separate indicator but can technically be recalculated in the numerator definition. How can the 50% logic be introduced in the numerator formula? The only workaround I have thought of is the creation of a data element like "BCG Coverage less than 50%" that is populated by a script with a value of 1 when coverage is less than 50% for the facility, and then used as the numerator in the indicator calculation.

Jason and Robin have discussed below the possibility of extending the current configuration options for indicators, probably including some logic functions similar to those in the validation rules.

 

Has anybody dealt with scenarios similar to the example above, or does anybody have ideas on possible solutions?

Thanks!

 

 

………………………………………

Regards,

Dapo Adejumo

+2348033683677

Skype : dapojorge

 

From: Dhis2-devs [mailto:dhis2-devs-bounces+dapo_adejumo=yahoo.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Jason Pickering
Sent: 16 September, 2014 9:06 AM
To: Robin Martens
Cc: dhis2-users@xxxxxxxxxxxxxxxxxxx; dhis2-devs
Subject: Re: [Dhis2-devs] DHIS2 - Indicator calculation over dimensions

 

Hi Robin,

 

I think that is the real issue, namely that you are applying DHIS2 in a domain which is slightly different from its typical domain, namely health. I have been involved in some other projects with DHIS2 on the fringes of what it can do out of the box: in food security, water and sanitation, even using it for recording golf handicap scores. What I have seen in each of these domains is that there are some challenges with the way the data is aggregated. Lots of things work out of the box, like data collection, user management and security, etc. But sometimes the analysis needs to be done externally through other means. Of course, it would be great if DHIS2 could do all of this for all domains, but since its primary focus is on collection and management of health data, that is where things work most often (although there are some challenges there as well, particularly with data which needs to be averaged or handled differently in time or across orgunits, such as ART current count). Contributions from the community are of course welcome! :)

 

Regards,

Jason

 

 

On Fri, Sep 12, 2014 at 8:55 AM, Robin Martens <martens@xxxxxxx> wrote:

Hi Jason,

 

Thanks for taking the time to read through my email.

 

I'll have a look at the different possibilities you proposed, and we'll be looking forward to any future upgrade of the calculation method (for now or later). I guess it's just that some sectors need more complex indicators than others (our project is in forest management).

 

Have a nice day,

 

Robin

 

From: Jason Pickering [mailto:jason.p.pickering@xxxxxxxxx]
Sent: 11 September 2014 19:00
To: Robin Martens
Cc: Lars Helge Øverland; dhis2-users@xxxxxxxxxxxxxxxxxxx; dhis2-devs
Subject: Re: [Dhis2-devs] DHIS2 - Indicator calculation over dimensions

 

Hi Robin,

 

Your mail is dense and will need some digestion. :)

 

However, you give a very good level of detail about your problem in this mail, and it will be very useful as we attempt to implement this type of functionality.

 

To respond immediately to how you might be able to solve the issue, you should possibly consider using the WebAPI to extract your data, process it as you need, and then inject it back into DHIS2. The WebAPI is described in detail here <https://www.dhis2.org/doc/snapshot/en/user/html/ch32.html>. I have also written a chapter on the use of the R programming language with DHIS2, which is particularly well suited to the type of custom calculations you are describing here. It is available here <https://www.dhis2.org/doc/snapshot/en/user/html/apc.html>. Of course, other languages/methods, such as Python, may be better suited to your situation. Lastly, you can have a look at the DHIS2 Ad-hoc tool <http://bazaar.launchpad.net/~dhis2-devs-core/dhis2/trunk/files/head:/tools/dhis-adhoc/>, which would allow interaction with the service layer of DHIS2. Another approach could be SQL, which interacts directly with the database. I am sure there are many other means as well. So the short answer is that right now there is, I think, no built-in way to achieve what you need, and it will take some coding on your side.

 

We have run into similar issues in the water and sanitation sector, where we need to work with the "latest reported data", which DHIS2 does not handle really. We pull out the data via the WebAPI, do the aggregation externally, and then inject everything back into the system to get the figures we need. It would be nice if the system did it automatically, but given the nature of the project, there are many feature requests and limited resources. Contributions of course are welcome. 
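To give an idea of what the external aggregation step could look like, here is a minimal Python sketch of a "latest reported data" calculation. It assumes the data values have already been extracted (for example via the WebAPI) into simple tuples; the record layout, the org unit names and the function name are assumptions for illustration and do not reflect the format DHIS2 actually returns.

```python
def latest_reported_total(records):
    """Sum, across org units, the value from each unit's most recent period.

    `records` is an iterable of (org_unit, period, value) tuples, where
    `period` is a string such as "201409" that sorts chronologically.
    """
    latest = {}
    for org_unit, period, value in records:
        # Keep only the most recent (period, value) pair per org unit.
        if org_unit not in latest or period > latest[org_unit][0]:
            latest[org_unit] = (period, value)
    return sum(value for _, value in latest.values())


# Hypothetical extracted data values.
records = [
    ("Village 1", "201406", 12),
    ("Village 1", "201408", 15),
    ("Village 2", "201407", 7),
]

print(latest_reported_total(records))  # 22
```

The resulting figure would then be injected back into the system as an ordinary data value.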

 

The current aggregation engine handles the "easy" cases of sums and averages pretty well, but for more complex stuff, external routes may be the only solution for now. 

 

We should certainly try and distill some of your ideas into a concrete blueprint. 

 

Best regards,

Jason

 

 

On Thu, Sep 11, 2014 at 6:15 PM, Robin Martens <martens@xxxxxxx> wrote:

Hi Jason,

 

I appreciate your help as this is very important for our project, thanks.

 

Some of our indicators are indeed quite complex and might need some custom coding if not too complicated. However, can you give some basic steps on how to achieve this (and on how hard this is in terms of programming as we're not experts here)?

 

---

 

The rest of this mail is about the specific issue I'm having here, it's basically related to three things:

 

1.       The absence of "cross-product" calculations in DHIS2 (I think it's what you call compulsory pairs of data).

2.       The fact that when no data exists on a disaggregated level, the value is taken to be zero instead of the aggregated (for custom dimensions only I think).

3.       The average function only exists over the time dimension (as discussed by Lars previously this week).

 

A simple example:

 


 

              Population   Conso pp   Total
District 1    10           2          20
District 2    5            3          15
Total         15           5          35

 

When calculating the total national consumption, DHIS2 multiplies the aggregated population (=15) by the aggregated consumption per person (=5), which gives 75 and is wrong. In reality, there are two mistakes:

 

1.       The calculation should happen on district level before aggregating to the national value (20 for district1 plus 15 for district2 makes 35, which is the correct answer). -> Cross product

2.       DHIS2 always sums over orgunits (to be corrected soon according to Lars so I won't go further in detail here)

 

The cross-product issue can actually be "solved" by a workaround: obliging the user to explicitly show the disaggregation level (i.e. the level at which the cross product happens) in the report tables. Interestingly enough, when calculating the total in a report without showing districts, DHIS2 returns 75, whereas it returns 35 when the districts are shown.
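The difference between the two orders of calculation in the simple example can be sketched in a few lines of Python. The figures are the ones from the table; the variable names are only illustrative.

```python
# Figures from the simple two-district example.
districts = {
    "District 1": {"population": 10, "conso_pp": 2},
    "District 2": {"population": 5, "conso_pp": 3},
}

# Multiply at district level first, then aggregate (the "cross product").
per_district_then_sum = sum(
    d["population"] * d["conso_pp"] for d in districts.values()
)

# Aggregate each input first, then multiply (the behaviour described above).
sum_then_multiply = (
    sum(d["population"] for d in districts.values())
    * sum(d["conso_pp"] for d in districts.values())
)

print(per_district_then_sum, sum_then_multiply)  # 35 75
```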

 

Imagine now that the consumption has three products (a custom category), ABC. The table would look like this:

 


 

              Population   Conso pp A   Conso pp B   Conso pp C   Total A   Total B   Total C   Total
District 1    10           2            1            1            20        10        10        40
District 2    5            3            1            0            15        5         0         20
Total         15           5            2            1            35        15        10        60

 

The same principle, but aggregated over the Product category and orgunit dimension gives the correct result of 60. This is how DHIS2 would calculate:

 

1.       When not showing the Product category in the table: total population (15) x total aggregated consumption (=5+2+1=8) is 120.

2.       When showing the Product category in the table: total population (0, it will not find a value and return zero) x consumption is 0 !!!

 

Indeed, the workaround does work for orgunits but not for custom dimensions when not all data (in this case the population) has the same custom dimensions. 
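The three-product example can be checked the same way. This is a minimal sketch using the figures from the table; the data layout and the function name are assumptions made for illustration.

```python
# Figures from the three-product example.
population = {"District 1": 10, "District 2": 5}
conso_pp = {
    "District 1": {"A": 2, "B": 1, "C": 1},
    "District 2": {"A": 3, "B": 1, "C": 0},
}


def total_consumption(population, conso_pp):
    """Multiply per district and per product, then aggregate."""
    total = 0
    for district, by_product in conso_pp.items():
        for per_person in by_product.values():
            # Population has no Product category, so the district-level
            # population is reused for every product.
            total += population[district] * per_person
    return total


# Aggregating both inputs before multiplying gives the wrong figure.
naive = sum(population.values()) * sum(
    value for by_product in conso_pp.values() for value in by_product.values()
)

print(total_consumption(population, conso_pp), naive)  # 60 120
```

Note that the correct calculation reuses the district population for each product, which is exactly the value the disaggregated report fails to find.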

 

I guess these are things that won't be solved quickly so I might need to do some coding myself. As a conclusion, to increase calculation power in DHIS2 I'd say:

 

1.       Use aggregated value when no disaggregated value exists (such as for population in the previous example).

2.       Aggregation operators (sum, average,...) should be defined per custom category and per data element. In other words, when creating a data element and adding categories, you have to add the operator for each category.

3.       Indicators should be available for re-use in other indicators. This enables you to build complex indicators piece by piece and gives more flexibility for intermediate calculations (on the disaggregated level).

 

I hope this is somewhat more clear.

 

Kind regards,

 

Robin

 

From: Jason Pickering [mailto:jason.p.pickering@xxxxxxxxx]
Sent: 11 September 2014 16:30
To: Robin Martens
Cc: Lars Helge Øverland; dhis2-users@xxxxxxxxxxxxxxxxxxx; dhis2-devs
Subject: Re: [Dhis2-devs] DHIS2 - Indicator calculation over dimensions

 

Hi Robin,

You lost me. Could you maybe give a somewhat simpler example of what you mean by an "intermediary calculation"?

 

I am not sure exactly what you are trying to achieve, but what I can say is that in certain cases I have had to write my own calculation methods for indicators which are basically impossible to calculate with the current implementation in DHIS2. It works fine for simple sums, averages, and other types of statistical things (standard deviation, etc.), but if, for instance, you want to calculate other statistical properties (skewness, kurtosis) of a given set of values, there is no way to do it directly with DHIS2. Also, certain indicators depend on component parts and cannot be calculated the way DHIS2 does it, by first summing up the numerator and denominator and then dividing, as opposed to calculating a non-weighted average of compulsory pairs of data. What I am getting at is that you may have to write your own calculation methods, depending on how complex they are.
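The distinction between the two calculation methods can be illustrated with a small Python sketch. The (numerator, denominator) pairs are made-up values and the function names are hypothetical.

```python
def ratio_of_sums(pairs):
    """Sum numerators and denominators first, then divide."""
    return sum(n for n, _ in pairs) / sum(d for _, d in pairs)


def mean_of_ratios(pairs):
    """Non-weighted average of the ratio computed for each pair."""
    return sum(n / d for n, d in pairs) / len(pairs)


# Hypothetical (numerator, denominator) pairs, e.g. one per org unit.
pairs = [(20, 100), (9, 10)]

print(round(ratio_of_sums(pairs), 3), round(mean_of_ratios(pairs), 3))  # 0.264 0.55
```

The two results differ whenever the denominators are unequal, which is why an indicator that needs the second form cannot be obtained from the first.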

 

Regards,

Jason

 

 

On Thu, Sep 11, 2014 at 4:20 PM, Robin Martens <martens@xxxxxxx> wrote:

Hi Jason, 

 

To pick up the point again, there's an additional question I've been looking at. 

 

Even if disaggregated indicator reporting is burdensome (as you explain below), it is sometimes necessary for correct aggregated indicator calculations (the most obvious one the use of weighted averages) to have "intermediary calculations" according to dimensions in the indicator calculation, which can then be aggregated over the whole table to obtain the total aggregated indicator value. Even in these intermediary calculations, however, the data is not available for calculation, returning zero as a result.

 

The conclusion is that the current way of calculating indicators not only complicates the calculation of indicators per custom dimension (if it does not make it impossible in many cases), but also makes it impossible to correctly calculate indicators over the period and orgunit dimensions whenever an intermediary calculation over custom dimensions is necessary.

 

Can you confirm this?

 

If true, is it hard to modify the calculation method to simply pick the one-level-higher value of a data element whenever no disaggregated value exists? By "no value exists" I don't mean NULL or zero, but rather not defined (the dimension does not exist).

 

Robin 

 

From: Jason Pickering [mailto:jason.p.pickering@xxxxxxxxx] 
Sent: 10 September 2014 17:55
To: Robin Martens
Cc: Lars Helge Øverland; dhis2-users@xxxxxxxxxxxxxxxxxxx; dhis2-devs
Subject: Re: [Dhis2-devs] DHIS2 - Indicator calculation over dimensions

 

Hi Robin,

It has been discussed, and it is certainly not a bug. See a related thread here (https://lists.launchpad.net/dhis2-devs/msg27571.html) for a similar discussion on validation rules. The situation is essentially the same as with indicators. What you will have to do is create a separate indicator for each and every combination you need. It can be painful, but it is really the only way I know of at the moment.

 

Feel free to file a blueprint here: https://blueprints.launchpad.net/dhis2

 

Regards,

Jason

 

 

On Wed, Sep 10, 2014 at 5:37 PM, Robin Martens <martens@xxxxxxx> wrote:

Dear all,

 

I've been testing the indicator calculation algorithm and noticed something particular of which I'm not sure if it's a bug or a deliberate development choice.

 

Indicators are not explicitly defined per category the way data elements are, but the reporting tools allow a disaggregated indicator calculation, which is definitely very useful. In a specific example, I want to know how many people were vaccinated this year and I have 3 kinds of vaccinations: A, B, and C. I have two data elements: the total population and the national vaccination levels (in %), with a custom category "vaccination type" which can be A, B, or C.

 

My indicator would be "total population" x "national vaccination level (total)". That works fine when put in a pivot table.

 

However, when trying to disaggregate the indicator calculation by adding my custom category to the pivot table, I don't have any values anymore. It seems the reason is that the "total population" data element does not have the "vaccination type" category (which seems logical) and therefore isn't found by the calculation algorithm. As a result, my table is empty. It seems useful that the algorithm would take the aggregated value (for population) available in such cases.

 

Another example is over the period dimension: my population is a yearly value, so when calculating an indicator on a monthly basis, instead of taking the available yearly value, it takes zero.
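The fallback behaviour suggested here (use the aggregated value when the data element does not have the requested dimension) could look roughly like the Python sketch below. This is not how DHIS2 works internally; the dictionary layout, the data element names and the figures are all assumptions for illustration.

```python
def lookup(values, data_element, category_option=None):
    """Return the disaggregated value if defined, else the aggregated one."""
    key = (data_element, category_option)
    if key in values:
        return values[key]
    # The dimension is not defined for this data element: use the total.
    return values.get((data_element, None))


# Hypothetical stored values; population has no "vaccination type" category.
values = {
    ("population", None): 1000,
    ("vaccination_level", "A"): 0.5,
    ("vaccination_level", "B"): 0.25,
}

vaccinated = {
    vtype: lookup(values, "population", vtype)
    * lookup(values, "vaccination_level", vtype)
    for vtype in ("A", "B")
}

print(vaccinated)  # {'A': 500.0, 'B': 250.0}
```

The same idea would apply to the period dimension, with the yearly population used when no monthly value is defined.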

 

So my question: is this a deliberate choice in the development, a bug, or an idea for a future system improvement?

 

Kind regards,

 

Robin

 

 


_______________________________________________
Mailing list: https://launchpad.net/~dhis2-devs
Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~dhis2-devs
More help   : https://help.launchpad.net/ListHelp





 

-- 

Jason P. Pickering
email: jason.p.pickering@xxxxxxxxx
tel:+46764147049





 

