dhis2-devs team mailing list archive
Message #39984
Re: [Bug 1065014] Re: Min/Max generation goes into negative
Hi Calle,
I would say the problem is that the premise on which this algorithm is
based is flawed. There is really no reason to believe that the data is
normally distributed, or should be, unless the normal distribution has been
shown to be a reliable and appropriate model. What we are seeking to do is
to eliminate outliers based on a certain statistical model (i.e., the normal
distribution). The problem is that the data is often not normally
distributed. As a quick example, I prepared a density plot (attached) of the
skewness of all OU/DE/COC combinations for a real database with a
significant amount of data over time, which should be fairly representative
of a "real" DHIS database. As a very trivial test of normality, we can
examine the skewness and see that the tendency across the database is in
fact towards positive skew. This is somewhat expected, as there are probably
going to be fewer "high" values than "low" values for many data elements. A
normal distribution has zero skewness, so a clearly non-zero skewness
indicates a departure from normality.
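
To make the skewness check concrete, here is a minimal sketch of how the
skewness of one OU/DE/COC series could be computed. It is not the code used
for the attached plot; the function name and the sample series are
hypothetical, and it uses the plain moment-based coefficient (third central
moment divided by the 1.5 power of the second).

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 for one series of values."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values to estimate skewness")
    mean = sum(values) / n
    # Second and third central moments (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant series: no spread, treat as unskewed
    return m3 / m2 ** 1.5


# Hypothetical monthly counts with one campaign month (illustrative only).
monthly_counts = [2, 3, 1, 4, 2, 3, 2, 40, 3, 2, 1, 3]
symmetric_series = [1, 2, 3, 4, 5]

print(sample_skewness(monthly_counts))    # positive: long right tail
print(sample_skewness(symmetric_series))  # zero: symmetric around the mean
```

A value near zero only shows that the series is roughly symmetric, so this
is a necessary rather than a sufficient check for normality.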
I still think we need to carefully document what the min-max generation
function is actually doing. If it works for people, great, but with all of
the truncation of data going on, it may not really be clear to people how
these values are actually generated, nor what their limitations may be. I
also think we should introduce an API endpoint for the min-max values, to
allow people to generate these outside of the system, based on perhaps more
appropriate models than the normal distribution.
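
As an illustration of the difference, the sketch below compares the
mean +/- n standard deviations approach discussed in this thread with a
simple percentile-based alternative. It is not the DHIS 2 implementation:
the function names, the default of 2 standard deviations, the use of the
sample standard deviation, and the 5th/95th percentile cut-offs are all
assumptions made for the example.

```python
import statistics


def min_max_normal(values, n_sd=2.0, floor_at_zero=False):
    """Min/max as mean -/+ n_sd sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    low, high = mean - n_sd * sd, mean + n_sd * sd
    if floor_at_zero:
        # Truncation for value types that cannot be negative.
        low = max(low, 0.0)
    return low, high


def min_max_quantile(values, lower_pct=5, upper_pct=95):
    """Min/max taken from empirical percentiles; no distribution assumed."""
    # n=100 yields 99 cut points; cuts[i - 1] is the i-th percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


# Hypothetical monthly counts with one campaign month (illustrative only).
monthly_counts = [2, 3, 1, 4, 2, 3, 2, 40, 3, 2, 1, 3]

print(min_max_normal(monthly_counts))                      # minimum is negative
print(min_max_normal(monthly_counts, floor_at_zero=True))  # minimum floored at 0
print(min_max_quantile(monthly_counts))                    # stays within the data
```

With the "inclusive" method the percentile bounds never fall outside the
observed range, so the minimum cannot go negative for count data.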
Regards,
Jason
On Fri, Sep 18, 2015 at 9:02 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
wrote:
> Hi,
>
> Ah - bugger, I completely forgot about the zero or positive type, which
> provides the same effect (if set). My bad.
>
> Jason's point is correct, but in my opinion less important for most types
> of routine data where the primary function of the min-max values is to
> highlight likely data capturing mistakes.
>
> Regards
> Calle
>
> On 18 September 2015 at 13:10, jason.p.pickering <
> 1065014@xxxxxxxxxxxxxxxxxx> wrote:
>
>> Hi there. The current design is to take the mean and calculate n standard
>> deviations away from the mean for a given data element/orgunit/catcombo
>> set of data values. If the data value is set to be a zero or positive
>> integer, can never have a negative value, and does not follow a normal
>> distribution, then flooring the projected min/max at zero makes little
>> sense. Another distribution (logistic, zero-inflated model, etc.) would
>> be required to determine what the accepted min/max actually are.
>>
>> But per the bug report, the application does what it is supposed to do,
>> namely calculate the theoretical min/max based on a statistical routine,
>> which itself may not be valid unless it has been confirmed that the
>> distribution in question actually is normal.
>>
>> Regards,
>> Jason
>>
>>
>> On Fri, Sep 18, 2015 at 11:57 AM, Lars Helge Øverland <
>> larshelge@xxxxxxxxx>
>> wrote:
>>
>> > This is not a design flaw. It depends on the data element value type
>> > property. The default value type is "number", for which negative values
>> > are perfectly valid. One can set the value type to "Positive number", in
>> > which case the min-max values will never be less than zero.
>> >
>> > ** Changed in: dhis2
>> > Status: Opinion => Invalid
>> >
>> > --
>> > You received this bug notification because you are a member of DHIS 2
>> > developers, which is subscribed to DHIS.
>> > https://bugs.launchpad.net/bugs/1065014
>> >
>> > Title:
>> > Min/Max generation goes into negative
>> >
>> > Status in DHIS:
>> > Invalid
>> >
>> > Bug description:
>> > A very minor bug, but the min/max generation algorithm (which I assume
>> > is some std. dev) sometimes leads the minimum to be a negative number.
>> > Probably not an issue per se for data quality, as the alternative
>> > would be to set it to 0 (unless there is a reason why you would enter
>> > negative numbers), but the chart you get when you double-click a data
>> > entry field is then skewed and does not look very sensible. In extreme
>> > cases, with a few very high values and a few months with very low (as
>> > when you have campaigns or hand-outs), the minimum can be down to
>> > minus a lot.
>> >
>> > To manage notifications about this bug go to:
>> > https://bugs.launchpad.net/dhis2/+bug/1065014/+subscriptions
>> >
>> > _______________________________________________
>> > Mailing list: https://launchpad.net/~dhis2-devs
>> > Post to : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>> > Unsubscribe : https://launchpad.net/~dhis2-devs
>> > More help : https://help.launchpad.net/ListHelp
>> >
>>
>>
>> --
>> Jason P. Pickering
>> email: jason.p.pickering@xxxxxxxxx
>> tel:+46764147049
>>
>>
>
>
>
> --
>
> *******************************************
>
> Calle Hedberg
>
> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>
> Tel/fax (home): +27-21-685-6472
>
> Cell: +27-82-853-5352
>
> Iridium SatPhone: +8816-315-19119
>
> Email: calle.hedberg@xxxxxxxxx
>
> Skype: calle_hedberg
>
> *******************************************
>
>
>
>
--
Jason P. Pickering
email: jason.p.pickering@xxxxxxxxx
tel:+46764147049
Attachment:
skewness.png
Description: PNG image