dhis2-devs team mailing list archive

Re: The use of dimensions in data entry and data analysis (was: commit message for Rev 938)

 

Don't really have much time to contribute to this discussion right now, but
...

2009/10/30 Ola Hodne Titlestad <olatitle@xxxxxxxxx>

> 2009/10/30 Jason Pickering <jason.p.pickering@xxxxxxxxx>
>
>> Perhaps it is a bad example, but it raises a good point, and we should
>> perhaps move this to a new thread if it continues to balloon.
>>
> I changed the subject line. It might be too general, but it is still better
> than a reply to a commit message.
>
>
>>  My understanding was the category options would be used for data
>> entry.  This is not really an issue about 1.4, it is really an issue
>> about whether people will enter totals or not. There is nothing to
>> prevent people from defining a category, Gender, with three (or more)
>> options, "Male" "Female" and "Total", and it may be necessary. Let me
>> explain.  On the paper tools used here in Zambia, there is a separate
>> column "Total" which is the sum of three age groups (Under 1, 1-5 and
>> Over 5). If I was going to implement the multidimensional data
>> elements here, if I wanted to replicate the paper tool exactly, I
>> would need a separate column for totals. This is what we have now, and
>> it serves a good purpose, as the data entry personnel can see if the
>> totals provided by the facility actually match the calculated totals.
>>
>
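The check described above (comparing the total reported by the facility with the total calculated from the age groups) could look roughly like the following. This is only a sketch: the function name, the dictionary layout and the sample numbers are assumptions made for illustration and are not taken from DHIS2.

```python
def check_row_total(entered, total_key="Total"):
    """Return (calculated_total, matches) for one data entry row.

    `entered` maps column names to the values typed in, including the
    total copied from the paper form (hypothetical layout).
    """
    calculated = sum(v for k, v in entered.items() if k != total_key)
    return calculated, calculated == entered.get(total_key)


# Invented sample rows using the three age groups from the paper tool.
good_row = {"Under 1": 12, "1-5": 30, "Over 5": 58, "Total": 100}
bad_row = {"Under 1": 12, "1-5": 30, "Over 5": 58, "Total": 110}

print(check_row_total(good_row))
print(check_row_total(bad_row))
```

Here only the three age-group values would need to be stored. The "Total" column exists purely to flag a mismatch at data entry time.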
> This raises an interesting point related to the discussion we have had
> about the role of data sets and data entry forms. To me such a control
> column like "total" is simply a GUI feature and I don't think it should be
> reflected in the data model or persisted.
>
> It would be great if we could add this feature to our data entry module.
> What I see here is a need for an option to add a total column to each
> categorycombination and then to automatically populate this field as the
> other fields of the row get filled. This is not a new request as it has
> been mentioned several times (I remember a quite heated discussion about the
> use of calculated data elements a few years ago), but with a new take on the
> data set and form relation and a refined multidimensional model this might
> be a better time to look at this.
>
> And I agree with Bob: getting these totals into a report is a matter of
> adding to the GUI the ability to add total columns for data elements
> + category combos.
>
>
>
>
>> No idea if this is how the categories work in DHIS2. But from the
>> analysis standpoint, it would seem that you would need some calculated
>> data element as well that would calculate the total from the
>> multidimensional components of the data element, unless as you say,
>> you are going to rely on OLAP or PivotTables to always do this
>> aggregation for you.
>
>
> At least for categories and options there should be no need to go to OLAP
> to get this.
> And although more complicated, I would think it should be possible to also
> extract totals from a data element group set model with a similar logic to
> what I described earlier. I guess that is the point of the new dimension
> service, which abstracts away the difference between categories and group
> sets. Is that correct, Lars/Bob?
>

My (radical) idea on this is that a GroupSet should actually "BE" a
dataelement.  Reason comes down to the fact that values have dimensions.
And those dimensions can be different depending on the dataelement used.

eg (using shorthand)

Here's a datavalue in its "raw" form
<dv de="Immunization_Male_Under5"  Value="5"/>
Now let's say there are groups gender and age defined, of which the above is
a member, and a groupset Immunization.  Then here's the same datavalue:
<dv de="Immunization" gender="M" Age="<5" Value="5"/>
Now what about that same de, but without the dimensions:
<dv de="Immunization"  Value="105"/>

where I guess 105 would be the Total of all the underlying datavalues.
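
As a rough illustration of the idea, here is a minimal Python sketch in which the undimensioned value of a dataelement is derived by summing its dimensioned values. The `DataValue` class, its field names and every sample figure except the first row are made up for this example (chosen so that they add up to 105). None of it is DHIS2 code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataValue:
    """One value tagged with its dimensions (hypothetical model)."""
    de: str            # dataelement, e.g. the groupset name "Immunization"
    value: int
    dims: tuple = ()   # e.g. (("gender", "M"), ("age", "<5"))


def total(values, de, **filters):
    """Sum every value of `de` whose dimensions match all given filters."""
    wanted = set(filters.items())
    return sum(v.value for v in values
               if v.de == de and wanted <= set(v.dims))


# Invented sample data; only the first row comes from the example above.
values = [
    DataValue("Immunization", 5, (("gender", "M"), ("age", "<5"))),
    DataValue("Immunization", 10, (("gender", "F"), ("age", "<5"))),
    DataValue("Immunization", 40, (("gender", "M"), ("age", ">=5"))),
    DataValue("Immunization", 50, (("gender", "F"), ("age", ">=5"))),
]

print(total(values, "Immunization"))              # no dimensions: the total
print(total(values, "Immunization", gender="M"))  # one dimension fixed
```

Leaving out all filters gives the total, while fixing one dimension gives a subtotal, so no separate "Total" value has to be stored.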

In fact what would be very nice would be to do away with groups/groupsets
entirely.  Less is more.  Just have (calculated?) dataelements which can
form hierarchies (like orgunits).  We're not too far from here at the
moment.  Another little step and we'll be over the edge.
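
A hedged sketch of what "dataelements which can form hierarchies" might mean in practice: leaves hold captured values and every parent is calculated from its children, much like rolling values up an orgunit tree. The class, attribute names and numbers below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataElement:
    """A node in a hypothetical dataelement hierarchy."""
    name: str
    raw_value: Optional[int] = None   # set only on leaves with captured data
    children: List["DataElement"] = field(default_factory=list)

    def value(self) -> int:
        """Leaves return their captured value; parents are calculated."""
        if not self.children:
            return self.raw_value or 0
        return sum(child.value() for child in self.children)


# Invented tree: Immunization -> gender -> age group.
immunization = DataElement("Immunization", children=[
    DataElement("Immunization_Male", children=[
        DataElement("Immunization_Male_Under5", 5),
        DataElement("Immunization_Male_5AndOver", 40),
    ]),
    DataElement("Immunization_Female", children=[
        DataElement("Immunization_Female_Under5", 10),
        DataElement("Immunization_Female_5AndOver", 50),
    ]),
])

print(immunization.value())
```

One limitation of a plain tree is that it fixes the order of the dimensions (gender first, then age), so a subtotal such as "all under 5" would need a second hierarchy or a different structure.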

I'll think more about this later.  Right now in a rush to implement dxf2
parser ...

Cheers
Bob



>
>> I would think that actually having the ability to
>> persist and store the data value, as a calculated data element (Save
>> calculated) and assign it a Category option of "Total" (which might be
>> implicit anyway in the system) would make sense, since you might need
>> it directly in a report or something and do not want to have to revert
>> to OLAP or custom SQL to get this. But again, I am looking at this
>> from the perspective of a bunch of data elements which do not use
>> category options.
>>
>> You would get the totals as you state, but only by using OLAP. What
>> about if I want to create an Excel report with only Totals? Now if the
>> new model will automatically give me the totals from the component
>> dimensions, great, but I did not see this in the blueprint.
>
>
> You are right, getting totals from the group set/groups part of
> dimension/dimensionoptions was not covered, I think.
> We need to add this to the blueprint. The idea was to abstract away the
> difference between categories and group sets at the point of data analysis,
> e.g. when defining new report tables, so I guess this means more complexity
> to the dimension service Lars is working on.
>
> Ola
> ---------
>
>
>> I was assuming that I would need to explicitly define a separate,
>> calculated element for this.
>>
>> Regards,
>> Jason
>>
>>
>> On Fri, Oct 30, 2009 at 5:34 PM, Ola Hodne Titlestad <olatitle@xxxxxxxxx>
>> wrote:
>> > 2009/10/30 Jason Pickering <jason.p.pickering@xxxxxxxxx>
>> >>
>> >> OK, I took a walk around the block to think about this a bit more. I
>> >> think it does make sense, sort of. Let's look at "Total", which might
>> >> be defined as a calculated data element, say composed of different age
>> >> groups. But the "Total" in this category, would not be the same as the
>> >> "Total" that might be defined in a different category, or would it?
>> >>
>> >
>> > I thought the whole point of the category/categoryoption/categorycombo
>> > model was that the total would be the data element itself without any
>> > categoryoption? The "total" should then not be defined as one of the
>> > options, but always be derived from the sum of all the options.
>> >
>> > Your example, Jason, is from a 1.4 design point of view where you are not
>> > using this model, but normally need calculated data elements to get to a
>> > total (since the categoryoptions are part of the data element names).
>> > With the new data element group set model I guess you can derive the
>> > total for e.g. "Malaria new cases OPD" by filtering on the data element
>> > group "Malaria" in the group set "Diseases" plus the group called
>> > "New cases" in the group set "Patient status" and then simply summing up
>> > all the data elements in the two group sets "Gender" and "Morbidity age
>> > group". Wouldn't such an approach give you the totals you need?
>> >
>> > As for exactly how we could accommodate that within DHIS2, e.g. in a
>> > report table GUI, I am not sure. It seems complicated and something for
>> > an OLAP tool to take care of.
>> >
>> > Ola
>> > -----------
>> >
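The group set approach described above can be sketched as a filter followed by a sum. The data element names, memberships and values below are invented for illustration, and the `group_total` helper is an assumption rather than anything in DHIS2.

```python
# Hypothetical memberships: data element name -> {group set: group}.
MEMBERSHIPS = {
    "Malaria new male <5": {"Diseases": "Malaria", "Patient status": "New cases",
                            "Gender": "Male", "Morbidity age group": "Under 5"},
    "Malaria new female <5": {"Diseases": "Malaria", "Patient status": "New cases",
                              "Gender": "Female", "Morbidity age group": "Under 5"},
    "Malaria new male 5+": {"Diseases": "Malaria", "Patient status": "New cases",
                            "Gender": "Male", "Morbidity age group": "5 and over"},
    "Malaria new female 5+": {"Diseases": "Malaria", "Patient status": "New cases",
                              "Gender": "Female", "Morbidity age group": "5 and over"},
    "Malaria follow-up male <5": {"Diseases": "Malaria", "Patient status": "Follow-up",
                                  "Gender": "Male", "Morbidity age group": "Under 5"},
    "Diarrhoea new male <5": {"Diseases": "Diarrhoea", "Patient status": "New cases",
                              "Gender": "Male", "Morbidity age group": "Under 5"},
}

# Invented values for one orgunit and period.
VALUES = {
    "Malaria new male <5": 7,
    "Malaria new female <5": 9,
    "Malaria new male 5+": 20,
    "Malaria new female 5+": 14,
    "Malaria follow-up male <5": 3,
    "Diarrhoea new male <5": 11,
}


def group_total(memberships, values, fixed):
    """Sum the data elements that belong to every group named in `fixed`.

    Group sets not mentioned in `fixed` (here Gender and age group) are
    summed over.
    """
    return sum(values[name] for name, groups in memberships.items()
               if all(groups.get(gs) == g for gs, g in fixed.items()))


malaria_new = {"Diseases": "Malaria", "Patient status": "New cases"}
print(group_total(MEMBERSHIPS, VALUES, malaria_new))
```

Adding `"Gender": "Male"` to the filter would give the male subtotal with the same function.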
>> >>  Having a single categoryoption "Total" would allow one to slice out
>> >> particular groups of dimensional elements, which is a fairly common
>> >> operation as Ola mentions, with a single filter statement. Otherwise,
>> >> you would need to collect all of the "Total"s for different categories
>> >> through another table and perform an inner join, as opposed to a
>> >> filter. For multiple category options, I guess there would need to be
>> >> a decision made whether to perform an inner join or loop through a
>> >> filter, but I guess an inner join would actually be better for either
>> >> one or many category options (have not looked at the code). If the
>> >> uniqueness constraint is not there, the user would need a separate
>> >> step to select all "Total"s and then perform an inner join,
>> >> as there would be no intrinsic relationship between "Total" in the
>> >> "Age" category and the "Total" in the "Gender" category. This might be
>> >> very tedious if there are many categories to select from. Having
>> >> multiple category options with the same name does not make sense in
>> >> this case, and I think this is what everyone is saying?
>> >>
>> >>
>> >>
>> >> Obviously there should not be two category options called "Total"
>> >> within a single category/data element group set. However, I am not
>> >> sure I completely understand your point, Ola. To me, the use case you
>> >> describe is very typical. "Give me all data for the under 1 age
>> >> group", "Give me all data on inpatient discharges". Having to define
>> >> multiple "under 1" and "IPD" for each category seems to be very
>> >> inefficient, as well as painful.
>> >>
>> >> So, I guess maybe I am answering my own mail...I think.
>> >>
>> >>
>> >>
>> >>
>> >> 2009/10/30 Lars Helge Øverland <larshelge@xxxxxxxxx>:
>> >> >
>> >> >
>> >> > On Fri, Oct 30, 2009 at 2:43 PM, Jason Pickering
>> >> > <jason.p.pickering@xxxxxxxxx> wrote:
>> >> >>
>> >> >> Could someone remind me once again what the point of having a
>> >> >> category option in two separate categories is? Is there a use case
>> >> >> here? It does not seem totally obvious, but maybe I am missing
>> >> >> something.
>> >> >>
>> >> >
>> >> > It might be that there are none. This could be useful in the sense
>> >> > that if nobody asks for removing the constraint - we won't.
>> >> >
>> >> >
>> >> >
>> >>
>> >> _______________________________________________
>> >> Mailing list: https://launchpad.net/~dhis2-devs
>> >> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>> >> Unsubscribe : https://launchpad.net/~dhis2-devs
>> >> More help   : https://help.launchpad.net/ListHelp
>> >
>> >
>>
>
>
>
>
