dhis2-devs team mailing list archive

Re: Fwd: On categories and dimensions and zooks

 

Here comes my shot at this issue. I'm gonna use Ola's example as a basis.

<!-- start -->
The flat data element names:
"Malaria death <5 year"
"Malaria death >5 year"
"Malaria in OPD 1st attendance <5 year"
"Malaria in OPD 1st attendance >5 year"
"Malaria IP discharge <5 year"
"Malaria IP discharge >5 year"
"Typhoid death <5 year"
"Typhoid death >5 year"
etc.
(OPD is outpatient, meaning patients treated at the clinic; IP is inpatient,
meaning patients who were admitted to a hospital).

There are three dimensions in the data elements above, so I define three
data element group sets:
Disease, Patient Status, and Age.
I also define 7 new data element groups (Malaria, Typhoid, <5, >5, Death,
OPD, IP) and assign these groups to the group set they belong to:
Disease (Malaria, Typhoid)
Patient Status (Death, OPD, IP)
Age (<5, >5)

I then assign the data element groups to the data elements:
"Malaria death <5 year" assigned to "Malaria", "Death", and "<5".
etc.

All these groupings can exist completely independent of data entry and be
changed at any time.
From this I can generate a new resource table for my data analysis (similar
to the one we already have for orgunit group sets) that provides:
Data Element Group Set, Data Element Group, Data Element
"Disease", "Malaria", "Malaria death <5 year"
"Disease", "Typhoid", "Typhoid death <5 year"
"Patient Status", "Death", "Malaria death <5 year"
etc.

When joining the above table with an aggregated data value table you can
define a pivot table with your three data element group sets as columns
(pivot fields) and analyse the data across these three dimensions. The data
element name dimension can then be completely hidden in the analysis.

<!-- end -->
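
To make Ola's example concrete, here is a minimal Python sketch of the resource table and the pivot it enables. The function names (build_resource_table, pivot) and the data values are made up for illustration; this is not DHIS2 code, only one way the join could work under the assumption that each data element is assigned to at most one group per group set.

```python
from collections import defaultdict

# Group sets (dimensions) and the groups (dimension options) they contain.
GROUP_SETS = {
    "Disease": ["Malaria", "Typhoid"],
    "Patient Status": ["Death", "OPD", "IP"],
    "Age": ["<5", ">5"],
}

# Groups assigned to each flat data element (only a few elements shown).
ELEMENT_GROUPS = {
    "Malaria death <5 year": ["Malaria", "Death", "<5"],
    "Malaria death >5 year": ["Malaria", "Death", ">5"],
    "Typhoid death <5 year": ["Typhoid", "Death", "<5"],
    "Typhoid death >5 year": ["Typhoid", "Death", ">5"],
}

# Hypothetical aggregated data values, invented for this example.
VALUES = {
    "Malaria death <5 year": 4,
    "Malaria death >5 year": 2,
    "Typhoid death <5 year": 1,
    "Typhoid death >5 year": 3,
}


def build_resource_table(group_sets, element_groups):
    """Return rows of (group set, group, data element)."""
    group_to_set = {g: gs for gs, groups in group_sets.items() for g in groups}
    return [
        (group_to_set[group], group, element)
        for element, groups in element_groups.items()
        for group in groups
    ]


def pivot(resource_table, values, group_set):
    """Sum the aggregated values per group within one group set."""
    totals = defaultdict(int)
    for gs, group, element in resource_table:
        if gs == group_set:
            totals[group] += values.get(element, 0)
    return dict(totals)


RESOURCE_TABLE = build_resource_table(GROUP_SETS, ELEMENT_GROUPS)
```

Calling pivot(RESOURCE_TABLE, VALUES, "Disease") sums over the Disease dimension without the caller ever touching a data element name, which is the point of hiding the name dimension in the analysis.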


Some observations:


a) From this we can derive that a GroupSet corresponds to a Dimension and
that a Group corresponds to a DimensionOption.

Dimension = GroupSet
DimensionOption = Group


b) The current Category model and the suggested simplified version both
generate CategoryOptionCombos/DimensionElementCombinations which are linked
to DataValue and constitute all possible combinations of their associated
CategoryOptions/DimensionOptions. This means that once those
CategoryOptionCombos/ DimensionElementCombinations are generated and
DataValues are registered for them, they cannot change. Also, once a data
entry grid is defined, the underlying model cannot change. According to Ola
and Jason we must be able to assign "any dimension to a DataElement" at any
time. To me this rules out re-using the same dimensional attributes for data
entry and analysis - we must in any case have one set of dimensions for data
entry and one set of dimensions for analysis.
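
The rigidity described in b) can be illustrated with a small Python sketch. It assumes the combos are simply the Cartesian product of the options of each category; option_combos is a hypothetical helper, not the actual DHIS2 generation code.

```python
from itertools import product


def option_combos(categories):
    """All combinations that take one option from each category."""
    return list(product(*categories.values()))


# Combos generated when the data entry grid is first defined.
before = option_combos({
    "Patient Status": ["death", "OPD", "IP"],
    "Age": ["<5 year", ">5 year"],
})

# Combos after someone redefines the Age category.
after = option_combos({
    "Patient Status": ["death", "OPD", "IP"],
    "Age": ["<5 year", "5-14 year", ">14 year"],
})

# Combos that already have data values but no longer exist in the new model.
orphaned = [combo for combo in before if combo not in after]
```

Any data value registered against a combo in orphaned would lose its meaning, which is why the combos cannot change once data has been captured for them.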


c) Ola's suggested solution supports this. It is powerful in the ability to
assign "raw" DataElements to Dimensions/GroupSets through
DimensionOptions/Groups, completely independent of which Categories the
DataElement was assigned to for data entry. The weakness is that it is based
on flat data elements, not Categorized data elements, which we must include
if we are to justify the Categorized data entry.


d) The Category model is pretty good at what it currently does -
facilitating grid-based data entry and cutting down on the number of data
elements (as well as making the data element naming more elegant).


Based on this I suggest we do the following:

1) We continue to use the Category model as it is, but only for data entry,
not for analysis.

2) Taken from Bob's suggestion - we phase out the existing Group and replace
it with a new DimensionOption object. We introduce a new Dimension object
which will work similarly to a GroupSet. We use this model for analysis.

3) We go for Ola's mentioned suggestion for analysis, with one exception:
Rather than assigning DataElements to a Group/DimensionOption, we assign a
combination of DataElement and CategoryOptionCombo (We create a new object
for this for every assignment - and remove it for every de-assignment). If
we want to see the total, we can assign a DataElement with the "default"
CategoryOptionCombo, or create a DimensionOption where the elements make a
total when summarized.

4) We use the same thing for Indicators.


The resource table Ola mentions will then look like this:

Group Set - Group - Data Element - CategoryOptionCombo

"Disease" - "Malaria" - "Malaria" - "(death, <5 year)"
"Disease" - "Typhoid" - "Typhoid" - "(death, >5 year)"


This way we can assign dimensions as we like without losing the fine
granularity of the captured categorized data. We can improve the report
table functionality in order to utilize this. This will be feasible with the
time and resource constraints we are operating with. It also alleviates the
challenge regarding Indicators and SDMX.
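
A rough Python sketch of suggestion 3) follows. The Assignment class is a hypothetical stand-in for the new link object, the data values are invented, and nothing here reflects existing DHIS2 classes; it only shows how assigning (DataElement, CategoryOptionCombo) pairs to a DimensionOption keeps the captured granularity available for analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    """Hypothetical link: one (data element, combo) pair in one dimension option."""
    dimension: str
    dimension_option: str
    data_element: str
    category_option_combo: tuple


# Created on assignment, removed on de-assignment.
ASSIGNMENTS = [
    Assignment("Disease", "Malaria", "Malaria", ("death", "<5 year")),
    Assignment("Disease", "Malaria", "Malaria", ("death", ">5 year")),
    Assignment("Disease", "Typhoid", "Typhoid", ("death", ">5 year")),
    Assignment("Age", "<5", "Malaria", ("death", "<5 year")),
]

# Invented data values keyed by (data element, category option combo).
DATA_VALUES = {
    ("Malaria", ("death", "<5 year")): 4,
    ("Malaria", ("death", ">5 year")): 2,
    ("Typhoid", ("death", ">5 year")): 3,
}


def total_for(dimension, option, assignments, data_values):
    """Sum the values of every pair assigned to one dimension option."""
    return sum(
        data_values.get((a.data_element, a.category_option_combo), 0)
        for a in assignments
        if a.dimension == dimension and a.dimension_option == option
    )
```

Because assignments live outside the data entry model, adding or removing one changes only the analysis view and never touches the stored data values.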


Additionally, one could expand the equation from a) to:

Dimension = GroupSet = Category
DimensionOption = Group = CategoryOption

which means there is potential in merging those objects/making them
implement a common interface. But I don't see the value if b) is valid.


Waiting for your replies/slaughter.


Lars
