dhis2-devs team mailing list archive

Re: [Dhis2-users] Creation of CategoryOptionCombinations

 

Hi Uwe,


Cat option combo names: currently the max length is 255 characters, which I
guess can be a problem when you have a high number of dimensions. I have
changed it to text type (unlimited length) now in trunk and 2.23. Please
try again now. Note that to produce 50m objects you will need significant
memory for the JVM.


I am not sure if removing the name will make it less heavy or help, and
people have been asking for stable names for option combos for a long time,
so I think we will leave them where they are.

I do understand the problem with a high number of disaggs, and 10
dimensions is not unreasonable. I see you are producing the data from a SQL
group by on your "case" / transactional data, so it is aggregate
data. That said, the fact table ("data value" solution) in DHIS 2 is not
really meant to cater for an extremely high number of disaggregations of the
same data elements, as we predefine possible disaggregations through the
category option combos. So going with the event model (as Jason suggests)
could be more appropriate for this type of data which is aggregated but
still is very fine-grained. That means you can use the Event reports /
Event visualizer apps to analyze your data (and use the
/api/analytics/event resource from an API perspective).
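
To illustrate the idea behind the event model, here is a minimal Python sketch
(not DHIS 2 code). It assumes each event is a plain record, borrows the field
names Gender, JobGroup and FeesPaid from the fields Jason suggests in the quoted
message, and uses made-up sample values. The aggregate helper is hypothetical.
The point is that only combinations that actually occur in the data show up in
the result, and nothing has to be predefined.

```python
from collections import defaultdict

# Hypothetical sample events; the values are invented for illustration only.
events = [
    {"Gender": "Female", "JobGroup": "Farmer", "FeesPaid": 12.0},
    {"Gender": "Male", "JobGroup": "Farmer", "FeesPaid": 8.0},
    {"Gender": "Female", "JobGroup": "Farmer", "FeesPaid": 5.0},
]


def aggregate(events, dimensions, value_field):
    """Sum value_field over events, grouped by the chosen dimensions.

    Only combinations that occur in the events appear in the result.
    """
    totals = defaultdict(float)
    for event in events:
        key = tuple(event[d] for d in dimensions)
        totals[key] += event[value_field]
    return dict(totals)


# The dimensions are chosen at analysis time, not when metadata is defined.
fees_by_gender = aggregate(events, ["Gender"], "FeesPaid")
fees_by_gender_and_job = aggregate(events, ["Gender", "JobGroup"], "FeesPaid")
```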

As a work-around you could in fact define two category combinations, as
data values are linked to both data elements and data sets (category option
combination and attribute option combination). So you take 5 of your
dimensions and create one (data element) category combo, then the other 5 of
your dimensions and create one (data set) category combo. You now need to link
your data values to two different option combos which is a bit
inconvenient, but it will probably solve your immediate need, as the number
of option combos goes from (opt = number of category options per category)

opt ^ 10

to

(opt ^ 5) * 2

That said, this is not a scalable solution, and using the event model might
be more appropriate.
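
As a rough check of the arithmetic above, here is a small Python sketch. It
assumes every category has the same number of options, and the value opt = 5 is
only an example, not taken from your data. The function names are made up for
illustration.

```python
def combos_single(opt: int, categories: int) -> int:
    """Option combos when all categories sit in one category combo."""
    return opt ** categories


def combos_split(opt: int, first: int, second: int) -> int:
    """Option combos when the categories are split over two category combos
    (one on the data element, one on the data set)."""
    return opt ** first + opt ** second


opt = 5  # assumed number of options per category
single = combos_single(opt, 10)   # opt ^ 10
split = combos_split(opt, 5, 5)   # (opt ^ 5) * 2

print(single)  # 9765625
print(split)   # 6250
```

Note that this only reduces the number of predefined option combo objects. The
number of possible pairs a data value can be linked to is still opt ^ 10.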

regards,

Lars





On Wed, Jun 8, 2016 at 5:22 PM, Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote:

> I am not talking about tracker, but rather anonymous events. So, again, I
> have no idea what your data looks like, but I will take a stab.
>
> Age: As an integer or, if you have it, the date of birth
> Gender: As an option set (Male/Female)
> JobGroup: As an option set
> Insurance scheme: As an option set
> Weight: As an integer, I guess...
> Size: ??
> FeesPaid: As numeric
>
> The advantage of representing this as events is that Age, Gender, Job
> Group, Insurance scheme can be used to aggregate "FeesPaid" in the event
> reports, but without explicitly defining the dimensions. Thus you only
> create the dimensions (and database index size) you actually need, and
> don't end up with many empty cat option combos, but rather can simply
> count the events across those dimensions in the event reports.
>
> Again, no idea what your data looks like, it just seems that maybe you are
> choosing a difficult way to represent the data, especially if you are
> going to end up with a lot of cat option combos which don't have any data.
>
> Regards,
> jason
>
>
