← Back to team overview

dhis2-devs team mailing list archive

Re: On categories and dimensions and zooks

 

Hi,

Thanks for the explanations Jason. The multidimensional model is quite
complicated, is poorly documented, and as you say is DHIS-centric in the way
that it is built around the DHIS notion of a Data Element.

After the discussion with Jason yesterday, and also from the discussions we
all had here in this thread it became clear to me that we need to provide
multi-dimensionality to data elements (and also to indicators I think) that
are completely independent of data entry and data value storage.

Our current model of data element categories is really designed for
simplifying data entry and storage of large amounts of data elements with
the same dimensions, it grew out of the ICD-based forms in Ethiopia
remember. I still think there is a lot to gain by this model when it comes
to simplifying creation of large datasets, generating grid based data entry
forms, and in more effective persistence of data elements and values, but I
now I am also quite sure that this model alone is not enough to provide
flexible multidimensional data analysis. There are lots of use cases where
you would want to analyse data on dimensions that cut across data entry
forms and the typical dimensional data sets used for data collection, and in
stead of complicating the data entry and storage by adding more flexibility
on this model, I think we need to have this functionality independent of how
we store data values.

Data element groups provide some of this functionality and is meant to help
group, filter, and to some extent provide dimensionality to your data during
data analysis. Groups are completely independent of the data values and can
be modified at any time without changing the raw data in DataValue. A name
change to a category option or regrouping of options within a category
however, directly modifies the data values as well, so that model is more
fixed and targeting data collection and storage.

*Data Element Group Sets as dimensions and Data Elements as dimensional
elements
*Data element groups are currently a flat grouping structure ala assigning a
category option to a data element (without having the category). Just like
we have for orgunit groups (as Jason pointed out some days ago) we need
group sets for data elements as well. It is not a revolutionary thought, and
it has been on the wish list for many years, but the usage and real need for
it has not been clear (which is why it never was introduced in 1.4 either,
although also there it was on the design table at some point). See below for
an example of how this would play out.

Jason's requirements from Zambia, and these are not unique in any way, show
a real need for this and with a growing number of 1.4 + 2 hybrid set ups I
think this functionality will be more and more important. When applying DHIS
2 as a national or provincial web based system on top of many stand alone
1.4 installations, the category functionality, which is not supported in
1.4, becomes useless as data elements as data will be imported from 1.4. In
such hybrid setups the role of  DHIS 2 as a web based data warehouse + data
analysis tool (incl. GIS) is even stronger than in DHIS2 only setups, and we
need to provide multidimensional data analysis to this scenario.

That said, and I think Jason already has made a strong case for this, also
in a 100% DHIS2 scenario you will need more flexibility in defining
dimensions to your data than what categories can provide. Being able to
define data dimensions independent of data collection is powerful and should
be supported in a better way than what data element groups provide today.
Given that we already have the orgunit group set code in place I would
assume that adding group sets to data elements could be a relatively
straight forward thing to do (but then again, I am not the programmer...).

To add to my argument of separating data entry from analysis: the data
elements listed in my example below, although they look very streamlined
would come from at least 2 different data entry forms as outpatients  (OPD)
and in patients (IP) are treated in separate locations. The IP data elements
would most likely exist in a data set with other dimensions to it (like
admission, discharge, death) not applicable to OPD elements, and the OPD
data elements would most likely have (1st attendance, repeated visit) that
are not used in IP. The patient status dimension is not used for data entry
(as all data elements in one form would be either IP or OPD), and cuts
across data elements from different data sets and forms, while Age is reused
in data collection across different datasets.

In a DHIS 2 only setup where data element categories are used, a pivot table
could then pull the Age dimension from a data element category, but the
patient status would have to be a data element group set as it is not part
of any data set in particular (but reflects types of data across many of
them). The disease dimension is even more tricky as it would normally be
part of the data element name, but since DHIS collects and uses data
elements for more than just disease names, a separation out into a data
element goup set called "Disease" would simplify the data analysis.

This shows how complex the use cases are and I believe build an argument to
provide both categories and data element group sets.

I will put this into a blueprint (I believe the old request for this got
lost when Trac sunk a year ago...).
Sorry for the very long email, but to summarise:

*  DHIS 2 needs data element group set functionality as soon as possible, as
this will
- add more flexibility to dimensional data analysis in DHIS by separating it
from data entry and data value
- help all 1.4 + 2 set ups
- make Jason very happy

Ola
-------------

*And here is the example:
*The flat data element names:
"Malaria death <5 year"
"Malaria death >5 year"
"Malaria in OPD 1st attendance <5 year"
"Malaria in OPD 1st attendance >5 year"
"Malaria IP discharge <5 year"
"Malaria IP discharge >5 year"
"Typhoid death <5 year"
"Typhoid death >5 year"
etc.
(OPD is outpatient, patients treated at the clinic, IP is inpatient meaning
patients that was admitted to a hospital).

There are three dimensions in the data elements above, so I define three
data element group sets:
Disease, Patient Status, and Age.
I also define 7 new data element groups (Malaria, Typhoid, <5, >5, Death,
OPD, IP) and assign these groups to the group set they belong to:
Disease (Malaria, Typhoid)
Patient Status (Death, OPD, IP)
Age (<5, >5)

I then assign the data element groups to the data elements
"Malaria death <5 year" assigned to "Malaria", "Death", and "<5".
etc.

All these groupings can exist completely independent of data entry and be
changed at any time.
>From this I can generate a new resource table for my data analysis (similar
to the one we already have for orgunit group sets) that provides:
Data Element Group Set, Data Element Group, Data Element
"Disease", "Malaria", "Malaria death <5 year",
"Disease", "Typhoid", "Typhoid death <5 year"
"Patient Status", "Death", "Malaria death <5 year"
etc.

When joining the above table with an aggregated data value table you can
define a pivot table with your three data element group sets as columns
(pivot fields) and analyse the data across these three dimensions. The data
element name dimension can then be completely hidden in the analysis.

Ola
-----------

2009/9/29 Jason Pickering <jason.p.pickering@xxxxxxxxx>

> I think Ola is going to write up something on this as well, but I
> wanted to pre-empt him and offer some explanation and perhaps an
> apology, as I started all of this. Ola and I had a long and productive
> chat yesterday afternoon, where we went thought in detail the issue we
> have been discussing over the past few days. We have somewhat of a
> hybrid set-up here in Zambia, with DHIS 1.4 being used as the primary
> means of data collection in the districts. Data is then imported on a
> regular basis into DHIS 2, which contains additional data sets that
> are not part of the routine data collected with DHIS 1.4. As Ola has
> explained in detail to me now, the category options/combos
> (multidimensional data elements) that have been implemented in DHiS2
> are for a specific case when there is a "master" data element, such as
> Malaria cases, and multiple categories and options (Age, Patient
> status, etc) that make the data element multi-dimensional. I did not
> realize this when this discussion began, thinking that somehow, I
> could assign dimensionality to a plain-old data element (PODE??) with
> the categories tables.
>
> Our data elements have been created and imported directly from 1.4.
> DHIS 1.4 has a very flat model. Each data element must be defined
> separately, for each level of disaggregation (which have been included
> as examples I have provided through my mails).  The important thing
> that I realized today during our conversation is that I need to assign
> some dimensionality to my data elements, that is completely
> independent of cateogries/cateogry options. I have been trying to use
> the category combos/options to do this, but it was apparently not the
> intended use.
>
> I am going to continue to experiment with the categorycombos/options
> as there are still a few issues I am not comfortable with, but it the
> meantime, I guess we need to find another solution of how to assign
> dimensionality to non-multidimensional data elements (i.e. those
> without categories).
>
> Regards,
> Jason
>
>
>
>
> On Mon, Sep 28, 2009 at 3:00 PM, Jason Pickering
> <jason.p.pickering@xxxxxxxxx> wrote:
> > The link that you sent seems reasonable, but this is not what has been
> > implemented in DHIS 2 at this point as far as I can tell. In the
> > current model (at least through the UI), I have created two categories
> > (Age and Patient status).  I then created a categorycombo
> > Age_Patientstatus (Age and Patient status). I do this because
> > conceptually, Age is a dimension, and each of the member of the
> > category should correspond to a dimensional element (Under 1, 1-5 and
> > Over 5 for Age). This is necessary because in the SQL view, I need to
> > have a single column for each dimension, populated with the
> > appropriate dimensional elements.
> >
> > query did not work for me but looking at the source of the query, I
> > assume this is what is supposed to happen.
> >
> > If  I then assign these categorycombos to my data elements, I already
> > know it is not going to work, because I have no idea which one of the
> > categorycombooptions is applicable to a particular data element. I
> > suppose this is why I would need to create a categorycombo with
> > exactly one option in each category, which again, is not desired. Each
> > category should be able to have multiple options.
> >
> >
> > Now, I do see some light.
> >
> > Now, looking at the current database, when I generate the resource
> > table, this is what i get back in categoryoptioncombo name
> > 25270;25260;"(Over 5,IPD,)"
> > 25271;25261;"(Over 5,Deaths,)"
> > 25272;25262;"(Over 5,OPD,)"
> > 25273;25263;"(Under 1,IPD,)"
> > 25274;25264;"(Under 1,Deaths,)"
> > 25275;25265;"(Under 1,OPD,)"
> > 25277;25267;"(Age 1-5,Deaths,)"
> > 25278;25268;"(Age 1-5,OPD,)"
> > ;;""
> > 25276;25266;"(Age 1-5,IPD,)"
> >
> > Now, this looks very much like what I need in my PivotTable source
> > query, which I think is what the query that Johan just sent is
> > supposed to provide. (The query did not work, but I assume this is
> > what it is meant to happen).
> >
> > The problem is now, I have no idea (at least directly) that the first
> > set of values corresponds to Age and the second set of values
> > corresponds to Patient status.
> >
> >
> > If I could assign a data element a categoryoptioncomboid (the second
> > number in that result set above) instead of a categorycomboid (as is
> > the case now) I think I would be able to produce the result set that I
> > actually want. However, by assigning a data element the
> > categorycomboid , I can only tell which dimensions the data element
> > has, but not which particular dimensional elements it possesses.
> >
> > So, perhaps you are right that there is no need for any changes to the
> > data model, but rather the assignment that I mention above.
> >
> > Johan, thanks for the query. I will see if I can get it to work.
> >
> > Best regards,
> > Jason
> >
> >
> >
> >
> > On Fri, Sep 25, 2009 at 12:07 PM, Knut Staring <knutst@xxxxxxxxx> wrote:
> >>
> http://208.76.222.114/confluence/display/RandD/General+multi-dimensional+model
> >>
> >> On Fri, Sep 25, 2009 at 11:55 AM, Abyot Gizaw <abyota@xxxxxxxxx> wrote:
> >>>
> >>> The one-to-one relationship mentioned between dataelement and
> >>> categorycombo is not correct !
> >>>
> >>> The realtionship is one-to-many. A categorycombo can be assigned for
> many
> >>> dataelements. But a dataelement can have only one categorycombo.
> >>>
> >>> Thank you
> >>> Abyot.
> >>>
> >>> On Fri, Sep 25, 2009 at 11:44 AM, Jason Pickering
> >>> <jason.p.pickering@xxxxxxxxx> wrote:
> >>>>
> >>>> Hi there.
> >>>>
> >>>> My basic issue with the category/category combo is that it appears to
> be
> >>>> a one-to-one relationship with data elements. If I look at the data
> model,
> >>>> there is a one-to-one relationship between dataelement and
> categorycomboid.
> >>>> For a given category combo, you can have multiple options. So, you can
> >>>> establish a relationship for a given data element and a group of
> category
> >>>> options.
> >>>>
> >>>> Let me try and describe the issue. We have a set of data elements
> related
> >>>> to malaria for this example. We would like to be able to pivot the
> data on
> >>>> other dimensions dimensions (Data element, age, disease, patient
> status).
> >>>> Obviously there are other dimensions that are pivotable (orgunit,
> period,
> >>>> dataset)
> >>>>
> >>>> The data elements look like this. I have put the dimensions in square
> >>>> brackets, and the dimensional elements into curly brackets.
> >>>>
> >>>> [Data element, Age, Disease, Patient status]
> >>>> Deaths Confirmed Malaria total (composed of) {All ages, Malaria Cases,
> >>>> Deaths}
> >>>> Deaths Confirmed Malaria 1 to Under 5 Years  {1-5, Malaria Cases,
> Deaths}
> >>>> Deaths Confirmed Malaria Over 5 Years {Over 5, Malaria Cases, Deaths}
> >>>> Deaths Confirmed Malaria Under 1 Year {Under 1, Malaria Cases, Deaths}
> >>>> IP Discharge Confirmed Malaria total (composed of) {All ages, Malaria
> >>>> Cases, IP}
> >>>> IP Discharge Confirmed Malaria 1 to Under 5 Years {1-5, Malaria Cases,
> >>>> IP}
> >>>> IP Discharge Confirmed Malaria Over 5 Years {Over 5, Malaria Cases,
> >>>> Deaths}
> >>>> IP Discharge Confirmed Malaria Under 1 Year {Under 1, Malaria Cases,
> >>>> Deaths}
> >>>> OPD 1st Attendance Confirmed Malaria total (composed of) {All ages,
> >>>> Malaria Cases, OPD}
> >>>> OPD 1st Attendance Confirmed Malaria 1 to Under 5 Years {1-5, Malaria
> >>>> Cases, OPD}
> >>>> OPD 1st Attendance Confirmed Malaria Over 5 Years {Over 5, Malaria
> Cases,
> >>>> OPD}
> >>>> OPD 1st Attendance Confirmed Malaria Under 1 Year {Under 1, Malaria
> >>>> Cases, OPD}
> >>>>
> >>>> OK,  I hope this is pretty clear. Obviously, there are more data
> elements
> >>>> (Typhoid, Yellow fever, etc). I might want to know how many Under 1
> deaths I
> >>>> have had for all diseases, or how many OPD cases I have had for each
> >>>> disease. How can I do this with the existing data model? It is not
> obvious
> >>>> to me because there is no relationship between dimensional elements
> >>>> (categoryoptions) to each other. Category options can be related
> through a
> >>>> cateogry combination, but since data elements can only be assigned a
> single
> >>>> category option, the dimensionality is broken once it gets time to
> pull the
> >>>> data into a pivot table.
> >>>>
> >>>>  In the incomplete example that I gave yesterday, I established a
> >>>> one-to-many relationship between a data element and a dimension. If I
> >>>> understand the current data model, I would have to create a separate
> >>>> categorycombo  for each of these data elements, and assign this
> >>>> categorycombo to the data element. Now, I might be able to unfold the
> >>>> dimensions using the categories and categorycombos. I it is not
> apparent how
> >>>> the dimensional elements correspond themselves to a particular
> dimension, as
> >>>> there is no relation for this in the database as I can see it.
> >>>>
> >>>>  As (Johan pointed out a few mails ago, if I understand him correctly)
> >>>> is  different categorycombo's can be created for individual data
> elements,
> >>>> and assigned to these elements. However, this seems to be 1)
> incredibly
> >>>> inefficient and 2) does not establish any relationship between
> dimensional
> >>>> elements and dimensions. Perhaps it is there, and maybe it has been
> done in
> >>>> SL, but the SQL is not apparent to me at all.
> >>>>
> >>>>  It would appear to me, looking from an SQL perspective, that a
> >>>> one-to-many relationship between a data element, a dimension
> (category) and
> >>>> dimensional element (category combo) would be much more effieicnet,
> and
> >>>> highly usable from an SQL perspective. As I mentioned in my mail, I am
> not
> >>>> sure how easy this would be to implement in a procedural language like
> Java,
> >>>> but I assume it should be possible to either do it this way, or
> rewrite my
> >>>> Postgres proprietary query in standard SQL (which there are ways to do
> with
> >>>> ANSI SQL).  This would require modification to the data model (similar
> to
> >>>> the table I provided yesterday) and modification to the UI to allow
> users to
> >>>> 1) select a dimension (category) 2) Select a dimensional element for
> the
> >>>> given dimension.  This would populate the table with a dataelementid,
> a
> >>>> dimensionid (categoryid) and a dimensional element (cateogryoptionid).
> >>>>
> >>>> My gut feeling this is exactly the same functionality as has currently
> >>>> been implemented for organizational units. Users can define a
> hierarchy for
> >>>> organizational units, and then assign them to
> >>>> categories/dimension/organizational group sets, decide whether the
> groups
> >>>> are compulsory and exlusive, and then assign a particular
> organizational
> >>>> unit to a particular group (which is analogous to a dimensional
> element).
> >>>> Organizational group sets define the dimension, and one-to-one
> assignment of
> >>>> an organizational unit to a particular organizational group defines
> which
> >>>> dimensional element the organizational unit is a member of. These
> dimensions
> >>>> can then be used in PivotTable analyses, where the orgunitgroupsets
> become
> >>>> dimensions, and orgunitgroups become dimensional elements..
> >>>>
> >>>> I beleive that data elements are no different than organizational
> units.
> >>>> They should be able to be grouped into some sort of hierarchy and
> pivoted on
> >>>> any dimension.  Data elements groups establish a one-to-many
> relationship
> >>>> between data elements and a data element group, but there is no
> concept of
> >>>> how data element groups relate to each other.
> >>>> I think this is perhaps the same concept you mention, ReportSet.
> >>>>
> >>>> I suspect we would need to potentially rethink the entire concept of
> >>>> multidimensionality if we really wanted to get it right. It would see
> to me
> >>>> that the DHIS datamodel and associated aggregation methods have been
> >>>> hardwired into aggregation across time (period) and geography
> (orgunit).
> >>>> What we can do with PivotTables and (and OLAP) is to aggregate across
> any
> >>>> possible dimension, slicing as you mention ,on any dimension . I am
> not sure
> >>>> this will be so simple to implement but I think there is a way to do
> it,
> >>>> without major modifications.
> >>>>
> >>>> I am not sure it solves the SDMX issue. There are potential issues
> >>>> related to "ragged" dimensions and how these get handled. Some data
> elements
> >>>> might have three dimensions, while others may have more. I have not
> thought
> >>>> about this in detail, but know it is an issue with cross-tab queries
> in SQL.
> >>>> You normally have to know how many dimensions you are working with in
> order
> >>>> to perform a cross-tab, but there are dynamic solutions. Perhaps this
> could
> >>>> be dealt with somehow in SDMX.
> >>>>
> >>>> Anyway ,I am rambling. Hope this mail helps though to push my point
> >>>> further. Once I get the SQL from SL, I will see if perhaps it has been
> done
> >>>> already, and that I am just writing long emails for nothing. :)
> >>>>
> >>>> Regards,
> >>>> Jason
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Sep 25, 2009 at 10:44 AM, Bob Jolliffe <bobjolliffe@xxxxxxxxx
> >
> >>>> wrote:
> >>>> > Hi Jason and Johan
> >>>> >
> >>>> > I'm really pleased to see you having this discussion as I have been
> >>>> > grappling with a similar issue which involves unravelling
> categories,
> >>>> > category options and combos into something more familiar.  I have
> >>>> > reached
> >>>> > similar conclusions regarding nomenclature:
> >>>> >
> >>>> > category = dimension
> >>>> > categorycombo - I have been calling a dimension set (it bears a
> strong,
> >>>> > and
> >>>> > useful, resemblance to xslt:attribute-set)
> >>>> > category option - I like your suggestion of DimensionalElement.  I
> am
> >>>> > going
> >>>> > to start calling it that too.
> >>>> >
> >>>> > In my case I need to export (and import data) into a standard format
> >>>> > called
> >>>> > sdmx.  So whereas in the DHIS2 native DXF we export datavalues with
> >>>> > effectively three dimensions (source, period, categorycombooption)
> the
> >>>> > last
> >>>> > dimension is a sort of uber-dimension.  Like a peppercorn or a
> cardamon
> >>>> > seed, when you break it open it explodes its rich complexity of
> >>>> > dimensions.
> >>>> >
> >>>> > In sdmx we need the dimensions exploded.  So data values look like:
> >>>> >
> >>>> > <dataset>
> >>>> >   <datavalue name="TB test given" uid="44344 ...44" gender="Male"
> >>>> > age="0-5"
> >>>> > value="32" />
> >>>> >    <datavalue name="TB test given" uid="44344 ...44" gender="Female"
> >>>> > age="0-5" value="38" />
> >>>> > ..
> >>>> > </dataset>
> >>>> >
> >>>> > My approach to unpicking the dimensions from the dxf file is to
> >>>> > transform it
> >>>> > with an xslt transformation which is still incomplete but seems to
> work
> >>>> > well.
> >>>> >
> >>>> > One other nomenclatures issue which has surfaced as a result is what
> we
> >>>> > call
> >>>> > a "dataset".  In DHIS2, if I understand correctly, a dataset
> >>>> > corresponds
> >>>> > roughly to all the dataelements which might occur on a
> datacollection
> >>>> > form.
> >>>> > If we view all dataelements as having just the three "dimensions"
> then
> >>>> > all
> >>>> > is well, but if we explode the actual dimensions then we have an
> >>>> > issue.  In
> >>>> > the sdmx model a dataset consists only of dataelements with the same
> >>>> > dimensionset.  After discussing this with Ola we have reached the
> >>>> > conclusion
> >>>> > that we need another level of grouping, primarily for the UI - eg
> >>>> > FormSet or
> >>>> > ReportSet which allows us to group related datasets.  But that is an
> >>>> > aside
> >>>> > from what you are talking about.
> >>>> >
> >>>> > I know that you guys can do magic with sql, but it seems that we
> should
> >>>> > try
> >>>> > to capture some of this and place it down in the datamodel API.  It
> >>>> > occurs
> >>>> > to me that for a multidimensional dataelement we might benefit from
> >>>> > some
> >>>> > utility methods to retrieve slices and dices which might assist in
> >>>> > constructing the pivot tables around dimensions.  Does this sound
> like
> >>>> > the
> >>>> > right thing to do.
> >>>> >
> >>>> > Regards
> >>>> > Bob
> >>>> >
> >>>> > 2009/9/24 <johansa@xxxxxxxxxx>
> >>>> >>
> >>>> >> Jason,
> >>>> >> I will leave to others to comment the code, but I have a few
> >>>> >> comments...
> >>>> >>
> >>>> >> > I have done a bit more thinking on this, and would like to offer
> >>>> >> > some
> >>>> >> > more examples up for discussion.
> >>>> >> >
> >>>> >> > Basically, we have a lot of data elements that are somehow
> related
> >>>> >> > to
> >>>> >> > each other, similar to my kooky example in my original mail. I
> >>>> >> > assume
> >>>> >> > this is fairly common throughout other HMIS systems. Here,
>  malaria
> >>>> >> > attendance is broken down into various dimensions/category by
> >>>> >> > patient
> >>>> >> > type (outpatient, inpatient, and deaths) and by age (under 1 ,1-5
> >>>> >> > and
> >>>> >> > over 5). But say you want to be able to pivot to look at
> outpatient,
> >>>> >> > inpatient and deaths totals (i.e. summed up by age). Well, you
> could
> >>>> >> > create a separate data element for this, but it sure would be
> nice
> >>>> >> > to
> >>>> >> > be able to Pivot the data somehow.
> >>>> >>
> >>>> >> In the Sierra Leone db, Edem and Romain set up views that pulled
> the
> >>>> >> categories through into a "Category" pivot field, which you can
> then
> >>>> >> use
> >>>> >> to get what you want. Simply tick the categories (see below) you
> want
> >>>> >> to
> >>>> >> see, and group them together in excel. Maybe Edem and Romain can
> help
> >>>> >> further here.
> >>>> >>
> >>>> >>
> >>>> >> > Dimension ? Category
> >>>> >> > Dimensional element ? Category option ? Category combo ( I think)
> >>>> >>
> >>>> >> The right symbol disappeared from my reply-mail here, but some
> >>>> >> clarification:
> >>>> >>
> >>>> >> Crosstab Dimension (age AND gender) = Category combo
> >>>> >> Dimension (age, gender) = Category
> >>>> >> Dimensional element (inpatient, outpatient, death, under1, 1-5, and
> >>>> >> over
> >>>> >> 5) = Category option
> >>>> >>
> >>>> >> So by assigning a DE the category combo of "gender_age", you get 9
> >>>> >> dimensional elements, 3 category options (in category age) by 3
> >>>> >> category
> >>>> >> options (in category gender)
> >>>> >>
> >>>> >> Johan
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> > Anyway, here is the helper table I created.
> >>>> >> >
> >>>> >> > CREATE TABLE test_dataelementcategorycombo
> >>>> >> > (
> >>>> >> >   test_dataelementid integer NOT NULL,
> >>>> >> >   test_dataelementcategoryid integer NOT NULL,
> >>>> >> >   test_dataelementcategorycomboid integer NOT NULL,
> >>>> >> >   CONSTRAINT pk_testdataelementcategory PRIMARY KEY
> >>>> >> > (test_dataelementid, test_dataelementcategoryid,
> >>>> >> > test_dataelementcategorycomboid)
> >>>> >> > )
> >>>> >> > WITH (OIDS=FALSE);
> >>>> >> >
> >>>> >> > So this is  a real simple table which references a data element,
> a
> >>>> >> > data element category, and a data element combo. The reference to
> a
> >>>> >> > data element category may be redundant, but anyway, lets leave it
> in
> >>>> >> > for now.
> >>>> >> >
> >>>> >> > I populated the table with some data, which will be used to
> assign
> >>>> >> > dimensions to data elements. It looks like this in my DB, which
> >>>> >> > looks
> >>>> >> > like this.
> >>>> >> >
> >>>> >> > 309;25250;25251
> >>>> >> > 309;25257;25255
> >>>> >> > 348;25250;25252
> >>>> >> > 348;25257;25255
> >>>> >> > 455;25250;25253
> >>>> >> > 455;25257;25255
> >>>> >> >
> >>>> >> > but of course this is meaningless to you. What do these values
> >>>> >> > correspond
> >>>> >> > to?
> >>>> >> >
> >>>> >> > "OPD 1st Attendance Clinical Case of Malaria Under 1
> >>>> >> > Year";"Age";"Under
> >>>> >> > 1"
> >>>> >> > "OPD 1st Attendance Clinical Case of Malaria 1 to Under 5
> >>>> >> > Years";"Age";"Age 1-5"
> >>>> >> > "OPD 1st Attendance Clinical Case of Malaria Over 5
> >>>> >> > Years";"Age";"Over
> >>>> >> > 5"
> >>>> >> > "OPD 1st Attendance Clinical Case of Malaria Under 1
> Year";"Patient
> >>>> >> > status";"OPD"
> >>>> >> > "OPD 1st Attendance Clinical Case of Malaria 1 to Under 5
> >>>> >> > Years";"Patient status";"OPD"
> >>>> >> > "OPD 1st Attendance Clinical Case of Malaria Over 5
> Years";"Patient
> >>>> >> > status";"OPD"
> >>>> >> >
> >>>> >> > which can be produced by the following view.
> >>>> >> >
> >>>> >> > CREATE OR REPLACE VIEW vw_dataelements_dimensions AS
> >>>> >> >  SELECT dataelement.name, dataelementcategory.name AS dimension,
> >>>> >> > dataelementcategoryoption.name AS dimension_element
> >>>> >> >    FROM dataelement
> >>>> >> >    JOIN test_dataelementcategorycombo ON
> >>>> >> > test_dataelementcategorycombo.test_dataelementid =
> >>>> >> > dataelement.dataelementid
> >>>> >> >    JOIN dataelementcategory ON dataelementcategory.categoryid =
> >>>> >> > test_dataelementcategorycombo.test_dataelementcategoryid
> >>>> >> >    JOIN dataelementcategoryoption ON
> >>>> >> > test_dataelementcategorycombo.test_dataelementcategorycomboid =
> >>>> >> > dataelementcategoryoption.categoryoptionid;
> >>>> >> >
> >>>> >> > So, that view just provides a human readable view of those
> integers
> >>>> >> > that I populated in the the test_dataelementcategorycombo table I
> >>>> >> > created above.  This table just assigns particular data elements
> to
> >>>> >> > different category options (dimensional elements).
> >>>> >> >
> >>>> >> > OK, so far so good, but the problem now is, how to use this with
> the
> >>>> >> > aggregatedatavalue table? If we try and join this table directly,
> we
> >>>> >> > will have issues with duplicates in the pivot table, so we need
> to
> >>>> >> > transform the data slightly.
> >>>> >> >
> >>>> >> > This should do the trick.
> >>>> >> >
> >>>> >> > SELECT * FROM crosstab
> >>>> >> >       (
> >>>> >> > 'SELECT name, dimension, dimension_element FROM
> >>>> >> > vw_dataelements_dimensions ORDER BY 1,2,3',
> >>>> >> > 'SELECT DISTINCT dimension from vw_dataelements_dimensions ORDER
> BY
> >>>> >> > 1
> >>>> >> > ASC'
> >>>> >> >       )
> >>>> >> > as
> >>>> >> > (
> >>>> >> > name character varying(230),
> >>>> >> > age character varying(160),
> >>>> >> > status character varying(160)
> >>>> >> > );
> >>>> >> >
> >>>> >> >
> >>>> >> > which returns this record set
> >>>> >> >
> >>>> >> > "OPD 1st Attendance Clinical Case of Malaria 1 to Under 5
> >>>> >> > Years";"Age
> >>>> >> > 1-5";"OPD"
> >>>> >> > "OPD 1st Attendance Clinical Case of Malaria Over 5 Years";"Over
> >>>> >> > 5";"OPD"
> >>>> >> > "OPD 1st Attendance Clinical Case of Malaria Under 1 Year";"Under
> >>>> >> > 1";"OPD"
> >>>> >> >
> >>>> >> >
> >>>> >> > OK, admittedly, I cheated a bit and used the crosstab function of
> >>>> >> > Postgresql, but I assume that this query could be rewritten with
> a
> >>>> >> > few
> >>>> >> > more lines of code in standard SQL or some procedural language
> like
> >>>> >> > Java. Now, this record set looks like something that I can almost
> >>>> >> > use
> >>>> >> > with the aggregateddatavalue table simply by joining up the table
> on
> >>>> >> > the appropriate dataelementid and pulling everything into a pivot
> >>>> >> > table.  I would not have any duplicated values and would have
> >>>> >> > columns
> >>>> >> > like data element name, period, orgunit, age, patient status and
> of
> >>>> >> > course the value of the data element.  I hope that part is pretty
> >>>> >> > clear. Just join up that table to the aggregateddata table, and
> you
> >>>> >> > have pretty much what is needed to pull the data directly into a
> >>>> >> > PivotTable for further analysis.
> >>>> >> >
> >>>> >> > This is not a complete example, but it is very close to what I
> need
> >>>> >> > here ,and I think this type of functionality would be much more
> >>>> >> > useful
> >>>> >> > than the current data element categories functionality.
> Basically,
> >>>> >> > all
> >>>> >> > that would be required, at least initially, would be another user
> >>>> >> > interface screen to allow the definition of which category(ies)
> and
> >>>> >> > category options a data element is a member of. The rest could
> ,in
> >>>> >> > the
> >>>> >> > first instance be executed with custom SQL (obviously, I am
> partial
> >>>> >> > to
> >>>> >> > this language and hobbled by the fact that I do not know Java),
> but
> >>>> >> > eventually this would need to be implemented somehow in Java.
> >>>> >> >
> >>>> >> >  I am not sure if this really solves all of the issues
> surrounding
> >>>> >> > multidimensional analysis of data elements, but it seems to solve
> >>>> >> > the
> >>>> >> > issues that I am having by trying to assign some sort of
> dimensional
> >>>> >> > hierarchy to data elements (similar to the exclusive/compulsory
> >>>> >> > functionality of orgunits). Any thoughts on this?
> >>>> >> >
> >>>> >> > Best regards,
> >>>> >> > Jason
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> > On Wed, Sep 16, 2009 at 10:28 PM, Jason Pickering
> >>>> >> > <jason.p.pickering@xxxxxxxxx> wrote:
> >>>> >> >>
> >>>> >> >>
> >>>> >> >> On Wed, Sep 16, 2009 at 10:13 PM, <johansa@xxxxxxxxxx> wrote:
> >>>> >> >>>
> >>>> >> >>> >> However, there does seem to be the ability to assign
> >>>> >> >>> >> dimensions,
> >>>> >> >>> there
> >>>> >> >>> >> does
> >>>> >> >>> >> not seem to be the ability to assign particular elements
> within
> >>>> >> >>> those
> >>>> >> >>> >> dimensions to a particular DHIS data element.
> >>>> >> >>>
> >>>> >> >>>
> >>>> >> >>> Just some more clarification here: you can make category combos
> >>>> >> >>> which
> >>>> >> >>> you
> >>>> >> >>> assign to data elements. However, it is not possible to assign
> >>>> >> >>> just
> >>>> >> >>> specific parts of a category combo (only some of the category
> >>>> >> >>> options)
> >>>> >> >>> to
> >>>> >> >>> a data element.
> >>>> >> >>
> >>>> >> >> Yes, this was exactly what I wanted. Assigning different
> categories
> >>>> >> >> would
> >>>> >> >> seem to break the dimensionality.
> >>>> >> >>
> >>>> >> >>>
> >>>> >> >>> Then you must make a specific category (as the only one in
> >>>> >> >>> or part of a new category combo) with just those options. It
> can
> >>>> >> >>> be
> >>>> >> >>> hell;
> >>>> >> >>> in Tajikistan there were way over 20 categories I think, at
> least
> >>>> >> >>> 10
> >>>> >> >>> just
> >>>> >> >>> on various age groups.
> >>>> >> >>>
> >>>> >> >>> Johan
> >>>> >> >>>
> >>>> >> >>
> >>>> >> >> This was my fear.
> >>>> >> >>
> >>>> >> >> I will need to do some testing and see. I still fear it is not
> >>>> >> >> exactly
> >>>> >> >> the
> >>>> >> >> intended functionality.
> >>>> >> >>
> >>>> >> >> Basically, I think I need something akin to the
> >>>> >> >> exclusive/compulsory
> >>>> >> >> groups
> >>>> >> >> that are in place for organizational units, but instead, for
> >>>> >> >> arbitrary
> >>>> >> >> dimensions. I will give a try and see what happens.
> >>>> >> >>
> >>>> >> >> Thanks,
> >>>> >> >> Jason
> >>>> >> >>
> >>>> >> >>
> >>>> >> >>
> >>>> >> >>
> >>>> >> >>
> >>>> >> >
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> _______________________________________________
> >>>> >> Mailing list: https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
> >>>> >> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> >>>> >> Unsubscribe : https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
> >>>> >> More help   : https://help.launchpad.net/ListHelp
> >>>> >
> >>>> >
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Mailing list: https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
> >>>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> >>>> Unsubscribe : https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
> >>>> More help   : https://help.launchpad.net/ListHelp
> >>>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Mailing list: https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
> >>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> >>> Unsubscribe : https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
> >>> More help   : https://help.launchpad.net/ListHelp
> >>>
> >>
> >>
> >>
> >> --
> >> Cheers,
> >> Knut Staring
> >>
> >
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
> More help   : https://help.launchpad.net/ListHelp
>

Follow ups

References