dhis2-devs team mailing list archive
Message #39986
Re: [Bug 1065014] Re: Min/Max generation goes into negative
Jason,
"with all the truncation of data going on" - ??
Not sure what you mean by that, but users don't regard min-max values as
a kind of "hard" range - it has never been intended to be that, except (at
least in DHIS 1.4) where you specify a min-max to be ABSOLUTE. For
everything else, it is a simple method to highlight possible outliers - for
typing mistakes an easy fix, for collection/collation/transcribing mistakes
often a more involved process (query sent back to staff etc).
The complexity of correcting mistakes made during the manual data
collection and collation process - often made worse by people being
stubborn about not admitting mistakes - is one reason for moving electronic
data capture closer to the actual patient encounters. A typical example is
South Africa's move from capturing monthly data per facility to capturing
data into the DHIS on a daily basis per consulting room.
Regards
Calle
On 19 September 2015 at 17:26, Jason Pickering <jason.p.pickering@xxxxxxxxx>
wrote:
> Hi Calle,
> The problem, I would say, is that the premise on which this algorithm is
> based is flawed. There is really no reason to believe that the data is
> normally distributed, or should be, unless of course that has been shown
> to be a reliable and appropriate model. What we are seeking to do is to
> eliminate outliers based on a certain statistical model (i.e. the normal
> distribution). The problem is that the data is often not normally
> distributed. Just as a quick example, I prepared a density plot of the
> skewness of all OU/DE/COC combinations for a real database with a
> significant amount of data over time, which should be fairly
> representative of a "real" DHIS database. As a very trivial test of
> normality, we can examine the skewness and see that the tendency for the
> database is in fact towards positive skew, which is somewhat expected, as
> there are probably going to be fewer "high" values than "low" values for
> many data elements. A perfectly normal distribution would have zero
> skewness.
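As a minimal sketch of the skewness check described above, the following computes the population skewness (third central moment divided by the cubed standard deviation) for each series. The keys and values in `series` are made up for illustration and stand in for OU/DE/COC combinations; nothing here comes from DHIS itself. Note that zero skewness only indicates a symmetric distribution, so this can rule normality out but cannot confirm it.

```python
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: third central moment divided by sd cubed."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return float("nan")  # undefined for a constant series
    n = len(values)
    return sum((x - mu) ** 3 for x in values) / (n * sigma ** 3)


# Hypothetical series keyed by (org unit, data element, category option combo).
series = {
    ("ou_a", "de_1", "coc_x"): [1, 2, 3, 4, 5],           # symmetric
    ("ou_b", "de_1", "coc_x"): [0, 1, 0, 2, 1, 0, 15],    # one large value
}

skew_by_series = {key: skewness(vals) for key, vals in series.items()}
for key, value in skew_by_series.items():
    print(key, round(value, 2))
```

A density plot of the values in `skew_by_series` across a whole database would give the kind of picture described in the message.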
>
> I still think we need to carefully document what the min-max generation
> function is actually doing. If it works for people, great, but with all of
> the truncation of data going on, it may not really be clear to people how
> these values are actually generated, or what their limitations may be. We
> should also introduce an API endpoint for the min-max values, to allow
> people to generate these outside of the system, based on perhaps more
> appropriate models than the normal distribution.
>
> Regards,
> Jason
>
>
> On Fri, Sep 18, 2015 at 9:02 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
> wrote:
>
>> Hi,
>>
>> Ah - bugger, I completely forgot about the zero or positive type, which
>> provides the same effect (if set). My bad.
>>
>> Jason's point is correct, but in my opinion less important for most types
>> of routine data where the primary function of the min-max values is to
>> highlight likely data capturing mistakes.
>>
>> Regards
>> Calle
>>
>> On 18 September 2015 at 13:10, jason.p.pickering <
>> 1065014@xxxxxxxxxxxxxxxxxx> wrote:
>>
>>> Hi there. The current design is to take the mean and calculate n standard
>>> deviations away from the mean, for a given data element/orgunit/catcombo
>>> set of data values. If the data value is set to be a zero or positive
>>> integer, can never have a negative value, and does not follow a normal
>>> distribution, then flooring the projected min/max at zero makes little
>>> sense. Another distribution would be required to determine what the
>>> accepted min/max actually are (logistic, zero-inflated model, etc.).
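The calculation described above can be sketched roughly as follows: mean plus or minus n standard deviations, with an optional floor at zero for value types that cannot be negative. This is an illustration under stated assumptions, not the DHIS 2 implementation. The function name, the `n_std` and `floor_at_zero` parameters, the use of the population standard deviation, and the sample data are all hypothetical.

```python
from statistics import mean, pstdev


def min_max_range(values, n_std=2.0, floor_at_zero=False):
    """Return (lower, upper) as mean -/+ n_std standard deviations.

    Assumes the population standard deviation; the real routine may differ.
    """
    mu = mean(values)
    sigma = pstdev(values)
    lower = mu - n_std * sigma
    upper = mu + n_std * sigma
    if floor_at_zero:
        # Mimics a "zero or positive" value type: never report a negative minimum.
        lower = max(0.0, lower)
    return lower, upper


# Made-up monthly counts with one campaign month (40) among low values.
monthly = [2, 3, 1, 2, 40, 2, 3, 1, 2, 2, 3, 1]

print(min_max_range(monthly))                      # lower bound is negative
print(min_max_range(monthly, floor_at_zero=True))  # lower bound clipped to 0.0
```

With skewed data like this, the unfloored minimum goes negative, as in the bug report, and the floored version only hides that symptom: the upper bound is the same in both cases and still rests on the normality assumption.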
>>>
>>> But per the bug report, the application does what it is supposed to do,
>>> namely calculate the theoretical min/max based on a statistical routine,
>>> which itself may not be valid without confirming whether the distribution
>>> in question actually is normal.
>>>
>>> Regards,
>>> Jason
>>>
>>>
>>> On Fri, Sep 18, 2015 at 11:57 AM, Lars Helge Øverland <
>>> larshelge@xxxxxxxxx>
>>> wrote:
>>>
>>> > This is not a design flaw. It depends on the data element value type
>>> > property. The default value type is "number", for which negative values
>>> > are perfectly valid. One can set the value type to "Positive number"; in
>>> > this case the min-max values will never be less than zero.
>>> >
>>> > ** Changed in: dhis2
>>> > Status: Opinion => Invalid
>>> >
>>> > --
>>> > You received this bug notification because you are a member of DHIS 2
>>> > developers, which is subscribed to DHIS.
>>> > https://bugs.launchpad.net/bugs/1065014
>>> >
>>> > Title:
>>> > Min/Max generation goes into negative
>>> >
>>> > Status in DHIS:
>>> > Invalid
>>> >
>>> > Bug description:
>>> > A very minor bug, but the min/max generation algorithm (which I assume
>>> > is some std. dev) sometimes leads the minimum to be a negative number.
>>> > Probably not an issue per se for data quality, as the alternative
>>> > would be to set it to 0 (unless there is a reason why you would enter
>>> > negative numbers), but the chart you get when you double-click a data
>>> > entry field is then skewed and does not look very sensible. In extreme
>>> > cases, with a few very high values and a few months with very low (as
>>> > when you have campaigns or hand-outs), the minimum can be down to
>>> > minus a lot.
>>> >
>>> > To manage notifications about this bug go to:
>>> > https://bugs.launchpad.net/dhis2/+bug/1065014/+subscriptions
>>> >
>>> > _______________________________________________
>>> > Mailing list: https://launchpad.net/~dhis2-devs
>>> > Post to : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>> > Unsubscribe : https://launchpad.net/~dhis2-devs
>>> > More help : https://help.launchpad.net/ListHelp
>>> >
>>>
>>>
>>> --
>>> Jason P. Pickering
>>> email: jason.p.pickering@xxxxxxxxx
>>> tel:+46764147049
>>>
>>>
>>
>>
>>
>> --
>>
>> *******************************************
>>
>> Calle Hedberg
>>
>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>
>> Tel/fax (home): +27-21-685-6472
>>
>> Cell: +27-82-853-5352
>>
>> Iridium SatPhone: +8816-315-19119
>>
>> Email: calle.hedberg@xxxxxxxxx
>>
>> Skype: calle_hedberg
>>
>> *******************************************
>>
>>
>>
>
>
> --
> Jason P. Pickering
> email: jason.p.pickering@xxxxxxxxx
> tel:+46764147049
>
--
*******************************************
Calle Hedberg
46D Alma Road, 7700 Rosebank, SOUTH AFRICA
Tel/fax (home): +27-21-685-6472
Cell: +27-82-853-5352
Iridium SatPhone: +8816-315-19119
Email: calle.hedberg@xxxxxxxxx
Skype: calle_hedberg
*******************************************