dhis2-users team mailing list archive

Re: [Dhis2-devs] Generating Min/ Max

 

Lars,

By the way - last week I saw that the bug related to OrgUnit counts in
indicators is still there. I checked the Sierra Leone demo too, and it's
the same - integrity checks are still showing "org-unit-do not exist" etc.

Is this an indicator bug or an integrity check bug?

Regards
Calle

On 3 May 2015 at 21:29, Calle Hedberg <calle.hedberg@xxxxxxxxx> wrote:

> Lars,
>
> Excellent - thanks for that. Two years is a reasonable default value -
> we've always used 18 months as the default in 1.4, so almost the same.
>
> I would nevertheless argue that
>
> (a) user-defined period, stdev value, and possibly average/median
> parameters should ideally be specified on a per data element basis;
>
> (b) adding the attribute option combo to the mix is probably required to
> cater for instances where data is captured for e.g. multiple collaborating
> NGOs;
>
> (c) tools enabling the specification of said parameters for larger groups
> of data elements will make it easier to manage.
>
> (d) a cherry on top would be the ability to adjust for typical seasonal
> fluctuations.
>
> I will try to write a blueprint for something like the above. It is not a
> critical need, but it would be a positive step.
>
> Regards
> Calle
>
> On 3 May 2015 at 14:12, Lars Helge Øverland <larshelge@xxxxxxxxx> wrote:
>
>> Hi Calle,
>>
>> I agree it makes sense to have a "from date" for the data values to
>> include in the std dev and average calculation. I have changed it so it now
>> includes data 2 years before the start date of the validation analysis
>> period. It also helps the performance of the validation process.
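
The two-year "from date" described above can be sketched as a simple date filter. This is only an illustration under stated assumptions, not the DHIS2 code: the function name `values_in_window`, the `(date, value)` pair layout, and the choice to include the window start but exclude the analysis start date are all hypothetical.

```python
from datetime import date


def values_in_window(dated_values, analysis_start, years=2):
    """Keep values dated within `years` before `analysis_start`.

    `dated_values` is a list of (date, value) pairs. The window includes
    its start date and excludes `analysis_start` itself (an assumption).
    """
    try:
        window_start = analysis_start.replace(year=analysis_start.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        window_start = analysis_start.replace(
            year=analysis_start.year - years, day=28
        )
    return [v for d, v in dated_values if window_start <= d < analysis_start]


dated = [
    (date(2012, 6, 1), 5),   # older than two years: dropped
    (date(2013, 5, 1), 7),   # exactly at the window start: kept
    (date(2014, 12, 1), 9),  # inside the window: kept
    (date(2015, 5, 1), 11),  # the analysis start itself: dropped
]
recent = values_in_window(dated, date(2015, 5, 1))
```

Only the values left in `recent` would then feed the average and std dev calculation, which is also why a shorter window means less data to process.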
>>
>> regards,
>>
>> Lars
>>
>> On Mon, Apr 20, 2015 at 5:02 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
>> wrote:
>>
>>> Hi
>>>
>>> "Calle is right here - we do average, then calculate std dev and set
>>> the upper and lower bounds for each value. We use data from ALL
>>> available time periods to calculate this (period org unit, data element,
>>> option combo)."
>>>
>>> Here and there and back again :-)
>>>
>>> So I wasn't off the reservation, then. We have used the normal
>>> distribution like this in DHIS 1.x for around 17 years, and it fits the
>>> majority of data elements. In general, this distribution model handles
>>> random outbreaks and disruptions reasonably well, since the impact of such
>>> outliers is dampened. Data elements representing conditions or services
>>> with strong seasonal variation do not fit so well, and some very particular
>>> issues like "Male condoms distributed" tend to vary so much that the
>>> min/max is generally disregarded (outliers here also matter a lot less -
>>> when you distribute 1-2 billion condoms annually, an error of a few
>>> thousand does not matter). In DHIS 1.4 there is also a function for setting
>>> absolute min-max values - most typically used for data elements where e.g.
>>> only 0 and 1 are valid values. For such cases, statistically calculating
>>> min-max is obviously irrelevant.
>>>
>>> I don't like the use of ALL available time periods, though, since a
>>> large number of health facilities will see significant changes in their
>>> patient mix and patient numbers over let us say a 10 year period. We have
>>> found that 12-18 months provide a good compromise.
>>>
>>> So there is still some room for improvement.
>>>
>>> Regards
>>> Calle
>>>
>>> On 20 April 2015 at 16:15, Jason Pickering <jason.p.pickering@xxxxxxxxx>
>>> wrote:
>>>
>>>> Good. I probably should have known that already, which is why I had to do
>>>> some statistical analysis outside of DHIS2 to actually calculate reasonable
>>>> min-max values. A quick check of whether a normal distribution is valid can
>>>> be done with the skewness and kurtosis, which give an idea of how "tilted" a
>>>> given distribution is.
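
A skewness and kurtosis check of this kind can be sketched with the standard library alone. This is a minimal illustration, not the analysis Jason actually ran: the function name `skewness_and_excess_kurtosis` and the use of population moments (rather than bias-corrected sample estimators) are assumptions.

```python
from statistics import mean, pstdev


def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population moments.

    For normally distributed data both tend towards 0 as the sample grows.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    avg = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("all values are identical")
    # Third and fourth central moments.
    m3 = sum((v - avg) ** 3 for v in values) / n
    m4 = sum((v - avg) ** 4 for v in values) / n
    return m3 / sd ** 3, m4 / sd ** 4 - 3.0


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]
skew_sym, kurt_sym = skewness_and_excess_kurtosis(symmetric)
skew_right, kurt_right = skewness_and_excess_kurtosis(right_skewed)
```

Here `skew_sym` comes out at zero because the values are symmetric around their mean, while `skew_right` is clearly positive because of the single large value. A strongly skewed data element is a hint that bounds based on the mean and std dev may fit poorly.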
>>>>
>>>> https://www.dhis2.org/doc/snapshot/en/developer/html/apas06.html
>>>>
>>>> Anyway, support for import via the API would be good.
>>>>
>>>> Regards,
>>>> Jason
>>>>
>>>> On Mon, Apr 20, 2015, 16:06 Lars Helge Øverland <larshelge@xxxxxxxxx>
>>>> wrote:
>>>>
>>>>> Hi there,
>>>>>
>>>>> Calle is right here - we do average, then calculate std dev and set
>>>>> the upper and lower bounds for each value.
>>>>>
>>>>> We use data from ALL available time periods to calculate this (period
>>>>> org unit, data element, option combo)
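
The calculation described above (average, then std dev, then upper and lower bounds) can be sketched as follows. This is a minimal illustration and not the DHIS2 implementation: the function name `min_max_bounds`, the `factor` parameter (how many standard deviations to allow), the use of the population standard deviation, and grouping the history by one org unit, data element and option combo are all assumptions.

```python
from statistics import mean, pstdev


def min_max_bounds(values, factor=2.0):
    """Return (lower, upper) as mean -/+ factor * standard deviation.

    `values` is the history for one org unit, data element and option
    combo. `factor` is hypothetical; the thread does not say which value
    DHIS2 uses.
    """
    if not values:
        raise ValueError("need at least one data value")
    avg = mean(values)
    sd = pstdev(values)  # population std dev (an assumption)
    return avg - factor * sd, avg + factor * sd


history = [10, 12, 14, 16, 18]
lower, upper = min_max_bounds(history, factor=2.0)
```

With this history the mean is 14 and the standard deviation is about 2.83, so the bounds land at roughly 8.34 and 19.66. A new value outside that range would be flagged for review.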
>>>>>
>>>>> Mind you we should not really debate whether to use standard
>>>>> deviations or not, rather if we should support additional _distributions_
>>>>> to better handle different kinds of data. We currently use the normal
>>>>> distribution <http://en.wikipedia.org/wiki/Normal_distribution>.
>>>>>
>>>>> Rodolfo - supporting min-max in the Web API is a good idea to allow
>>>>> for third-party tools - feel free to write a blueprint.
>>>>>
>>>>> regards,
>>>>>
>>>>> Lars
>>>>>
>>>
>>>
>>> --
>>>
>>> *******************************************
>>>
>>> Calle Hedberg
>>>
>>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>>
>>> Tel/fax (home): +27-21-685-6472
>>>
>>> Cell: +27-82-853-5352
>>>
>>> Iridium SatPhone: +8816-315-19274
>>>
>>> Email: calle.hedberg@xxxxxxxxx
>>>
>>> Skype: calle_hedberg
>>>
>>> *******************************************
>>>
>>>
>>
>
>
>


