dhis2-devs team mailing list archive

Re: [Dhis2-users] Generating Min/ Max

 

Lars,

Excellent - thanks for that. Two years is a reasonable default value -
we've always used 18 months as the default in 1.4, so almost the same.

I would nevertheless argue that

(a) user-defined period, stdev value, and possibly average/median
parameters should ideally be specified on a per data element basis;

(b) adding the attribute option combo to the mix is probably required to
cater for instances where data is captured for e.g. multiple collaborating
NGOs;

(c) tools enabling the specification of said parameters for larger groups
of data elements would make these settings easier to manage; and

(d) a cherry on top would be the ability to adjust for typical seasonal
fluctuations.
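
As a rough illustration of point (a), the sketch below shows how
per-data-element parameters might drive the calculation. It is not DHIS2
code: MinMaxParams, min_max, the default factor of 2 standard deviations
and the use of the population standard deviation are all assumptions made
for this example. Only the two-year look-back default comes from this
thread.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean, median, pstdev


@dataclass
class MinMaxParams:
    """Hypothetical per-data-element settings (not a DHIS2 class)."""
    lookback_months: int = 24   # two-year default discussed in this thread
    stdev_factor: float = 2.0   # assumed number of standard deviations
    use_median: bool = False    # centre on the median instead of the average


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def min_max(values, analysis_start: date, params: MinMaxParams):
    """Return (min, max) bounds, or None if there is too little data.

    `values` is a list of (period_start_date, value) pairs for one
    org unit / data element / option combo.
    """
    # Keep only periods inside the look-back window before the analysis start.
    window = [
        v for d, v in values
        if 0 < months_between(d, analysis_start) <= params.lookback_months
    ]
    if len(window) < 2:
        return None
    centre = median(window) if params.use_median else mean(window)
    spread = params.stdev_factor * pstdev(window)
    return (centre - spread, centre + spread)
```

With these assumptions, min_max(values, date(2015, 4, 1),
MinMaxParams(lookback_months=18)) would use an 18-month window like the
1.4 default, and a different MinMaxParams instance could be kept for each
data element.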

I will try to write a blueprint for something like the above. It is not a
critical need, but it would be a positive step.

Regards
Calle

On 3 May 2015 at 14:12, Lars Helge Øverland <larshelge@xxxxxxxxx> wrote:

> Hi Calle,
>
> I agree it makes sense to have a "from date" for the data values to
> include in the std dev and average calculation. I have changed it so it now
> includes data 2 years before the start date of the validation analysis
> period. It also helps the performance of the validation process.
>
> regards,
>
> Lars
>
> On Mon, Apr 20, 2015 at 5:02 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
> wrote:
>
>> Hi
>>
>> "Calle is right here - we do average, then calculate std dev and set the
>> upper and lower bounds for each value. We use data from ALL available
>> time periods to calculate this (period org unit, data element, option
>> combo)."
>>
>> Here and there and back again :-)
>>
>> So I wasn't off the reservation, then. We have used the normal
>> distribution like this in DHIS 1.x for around 17 years, and it fits the
>> majority of data elements. In general, this distribution model handles
>> random outbreaks and disruptions reasonably well, since the impact of such
>> outliers is dampened. Data elements representing conditions or services
>> with strong seasonal variation do not fit so well, and some very particular
>> issues like "Male condoms distributed" tend to vary so much that the
>> min/max is generally disregarded (outliers here also matter a lot less -
>> when you distribute 1-2 billion condoms annually, an error of a few
>> thousand does not matter). In DHIS 1.4 there is also a function for setting
>> absolute min-max values - most typically used for data elements where e.g.
>> only 0 and 1 are valid values. For such cases, statistically calculating
>> min-max is obviously irrelevant.
>>
>> I don't like the use of ALL available time periods, though, since a large
>> number of health facilities will see significant changes in their patient
>> mix and patient numbers over let us say a 10 year period. We have found
>> that 12-18 months provide a good compromise.
>>
>> So there is still some room for improvement.
>>
>> Regards
>> Calle
>>
>> On 20 April 2015 at 16:15, Jason Pickering <jason.p.pickering@xxxxxxxxx>
>> wrote:
>>
>>> Good. I probably should have known that already, which is why I had to do
>>> some statistical analysis outside of DHIS2 to calculate reasonable
>>> min/max values. A quick check of whether a normal distribution is valid
>>> can be done with the skewness and kurtosis, which give an idea of how
>>> "tilted" a given distribution is.
>>>
>>> https://www.dhis2.org/doc/snapshot/en/developer/html/apas06.html
>>>
>>> Anyway, support for import via the API would be good.
>>>
>>> Regards,
>>> Jason
>>>
>>> On Mon, Apr 20, 2015, 16:06 Lars Helge Øverland <larshelge@xxxxxxxxx>
>>> wrote:
>>>
>>>> Hi there,
>>>>
>>>> Calle is right here - we do average, then calculate std dev and set the
>>>> upper and lower bounds for each value.
>>>>
>>>> We use data from ALL available time periods to calculate this (period
>>>> org unit, data element, option combo)
>>>>
>>>> Mind you, we should not really debate whether to use standard deviations
>>>> or not, but rather whether we should support additional _distributions_ to
>>>> better handle different kinds of data. We currently use the normal
>>>> distribution <http://en.wikipedia.org/wiki/Normal_distribution>.
>>>>
>>>> Rodolfo - supporting min-max in the Web API is a good idea to allow for
>>>> third-party tools - feel free to write a blueprint.
>>>>
>>>> regards,
>>>>
>>>> Lars
>>>>
>>
>>
>> --
>>
>> *******************************************
>>
>> Calle Hedberg
>>
>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>
>> Tel/fax (home): +27-21-685-6472
>>
>> Cell: +27-82-853-5352
>>
>> Iridium SatPhone: +8816-315-19274
>>
>> Email: calle.hedberg@xxxxxxxxx
>>
>> Skype: calle_hedberg
>>
>> *******************************************
>>
>>
>
