
dhis2-devs team mailing list archive

Re: On categories and dimensions and zooks

 

Hi,

2009/10/4 Lars Helge Øverland <larshelge@xxxxxxxxx>

>
> Big thanks to all for illuminating the pros and cons of the current
> multidimensional model. It was designed in 2006 basically to support the ICD
> based dataentry, and we must admit that Bob is at least partially right when
> saying that output could have been given better thought. Anyway it is not
> working out too bad either it seems.
>
> I like Bob's suggestion for simplifying the model and it would apparently
> make querying easier and improve the user interface. I have a few concerns:
>
> - Feasibility. The Category-related model is integrated into 9 out of 11
> service projects in DHIS 2. Re-factoring and testing all this would take
> months.
>
> - Backwards compatibility. Lots of databases and data-entry forms exist in
> the field. Conversion must be managed.
>

I reached the same conclusion :-(   I think there is still some small
rationalisation that can be done, but the model is already deeply coupled
with many parts of the system.  Having said that, I have a suggestion
related to the refactoring of dimensions and dataelementgroups below.


> - Suitability for the data-entry module. It seems likely that the
> CategoryCombo class can be "emulated" through the API.
>

Not sure exactly what you mean by this, but I guess probably.  I
suspect the work that most needs to be done on the CategoryCombo class in
the API is to provide "unpicking" methods to be able to conveniently access
the underlying categories (dimensions).
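
A minimal sketch of what such an "unpicking" method could look like. The
class ComboSketch and its methods with() and unpick() are hypothetical
stand-ins invented for illustration; they are not the DHIS 2 CategoryCombo
API, and categories and options are reduced to plain strings.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a category combination.
// Not the real DHIS 2 CategoryCombo class.
final class ComboSketch {
    // Category name -> names of the options belonging to that category.
    private final Map<String, List<String>> categories = new LinkedHashMap<>();

    ComboSketch with(String category, List<String> options) {
        categories.put(category, options);
        return this;
    }

    // "Unpick" a combined option such as [Female, <5] into
    // category -> option, so each underlying dimension can be read separately.
    Map<String, String> unpick(List<String> optionCombo) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String option : optionCombo) {
            for (Map.Entry<String, List<String>> entry : categories.entrySet()) {
                if (entry.getValue().contains(option)) {
                    result.put(entry.getKey(), option);
                }
            }
        }
        return result;
    }
}
```

In this sketch an option that belongs to more than one category is
attributed to every category that lists it, so a real implementation would
need a rule for that case.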


> - Does it cut tables to change from m-n to 1-n? Using join tables to
> represent 1-n associations is preferred by many as it keeps the domain model
> cleaner.
>

My proposal improved the situation by making a 1-n relation of category to
categoryOptions.  This would certainly be more efficient but doesn't meet
the use case where a categoryOption might participate in different
categories.


>
> If people say we can live with the current model I'd say we do just that.
> Anyway Bob's suggestion should be documented and looked at again later. I
> think the point about "input without output is statistical m..." is valid.
> At least we will need to focus more on how to make "the goodness float up".
>

I think we can only know whether we can live with the current model once the
API methods which seem theoretically possible are implemented.  My concern
is that if we provide an alternative to multidimensional (MD) analysis
through extending the
groupset idea then we have no justification in recommending that
implementors implement MD dataelements.  Convenience of UI is not enough if
in the process we enter data which we can't unpack.  What will happen is
that implementors with an eye on analysis will ignore the MD notion entirely
because it creates difficulties for them and they have a ready analysis
solution with groups and groupsets.

>
> Re the data element / indicator group set I think this is something we can
> do without risk. It won't change the existing model and won't break anything
> and wouldn't take too long to implement. Will start on it on Wednesday. A
> minor comment here is that I believe we should keep the exclusiveness and
> compulsory-ness of the group set optional (..eh) like we have it for
> organisation unit group sets today.
>

Lars I think this is the correct response to what is clearly a very real
need.  But I want to suggest that we approach it as follows:

- We create two new abstract classes, Dimension and DimensionOption.
- DataElement should be extended with methods to retrieve Dimensions -
fold/unfold whatever the gathered requirements are.  These are the methods
which would be used in reportable design.
- Both Category and Group should in some way implement Dimension.  In both
cases I think the underlying structures, however imperfect, allow for this
symmetry.  If this is difficult for Categories initially we can throw
unImplemented() for now but we will have provided the structural guidance
towards harmonising the two.
- We might need a DimensionSet class or perhaps just a Set<Dimension>
getDimensions() member function of DataElement.
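
The list above can be sketched roughly as follows. Only the names
Dimension, DimensionOption and getDimensions() come from the proposal;
every method body, plus GroupDimension, CategoryDimension, SimpleOption and
DataElementSketch, is an assumption made for illustration.
UnsupportedOperationException stands in for the "unImplemented()"
placeholder.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// One selectable value within a dimension.
abstract class DimensionOption {
    private final String name;
    protected DimensionOption(String name) { this.name = name; }
    public String getName() { return name; }
}

// Common view of anything a data element can be analysed by.
abstract class Dimension {
    private final String name;
    protected Dimension(String name) { this.name = name; }
    public String getName() { return name; }
    public abstract Set<DimensionOption> getOptions();
}

// Hypothetical concrete option used by the sketches below.
class SimpleOption extends DimensionOption {
    SimpleOption(String name) { super(name); }
}

// Hypothetical group-backed dimension: fully implemented.
class GroupDimension extends Dimension {
    private final Set<DimensionOption> options = new LinkedHashSet<>();
    GroupDimension(String name, String... optionNames) {
        super(name);
        for (String optionName : optionNames) {
            options.add(new SimpleOption(optionName));
        }
    }
    @Override public Set<DimensionOption> getOptions() {
        return Collections.unmodifiableSet(options);
    }
}

// Hypothetical category-backed dimension: structure only for now.
class CategoryDimension extends Dimension {
    CategoryDimension(String name) { super(name); }
    @Override public Set<DimensionOption> getOptions() {
        throw new UnsupportedOperationException("not implemented yet");
    }
}

// Hypothetical stand-in for DataElement, exposing its dimensions.
class DataElementSketch {
    private final Set<Dimension> dimensions = new LinkedHashSet<>();
    void addDimension(Dimension dimension) { dimensions.add(dimension); }
    Set<Dimension> getDimensions() {
        return Collections.unmodifiableSet(dimensions);
    }
}
```

Callers only ever see Dimension, so a category-backed implementation can be
filled in later without changing reporting code.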

The point here is that if we have dimensions to a dataelement then from the
reporting/analysis perspective it can be made invisible how those dimensions
are implemented.  Instinctively I feel it should simply be possible to
retrieve datavalues from a dimension or crosstabs of dimensions.
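
As a rough illustration of that idea, the sketch below totals values by one
dimension and cross-tabulates two, without knowing whether a dimension
comes from a category or a group. ValueSketch, CrosstabSketch, totalsBy()
and crosstab() are hypothetical names, and a data value is reduced to an
integer tagged with one option per dimension.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical data value: a number tagged with one option per dimension.
final class ValueSketch {
    final Map<String, String> options = new LinkedHashMap<>();
    final int value;

    ValueSketch(int value) { this.value = value; }

    ValueSketch with(String dimension, String option) {
        options.put(dimension, option);
        return this;
    }
}

final class CrosstabSketch {
    // Sum values per option of a single dimension.
    static Map<String, Integer> totalsBy(List<ValueSketch> values, String dimension) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (ValueSketch v : values) {
            String option = v.options.get(dimension);
            if (option != null) {
                totals.merge(option, v.value, Integer::sum);
            }
        }
        return totals;
    }

    // Sum values per (row option, column option) pair of two dimensions.
    static Map<String, Map<String, Integer>> crosstab(
            List<ValueSketch> values, String rowDimension, String columnDimension) {
        Map<String, Map<String, Integer>> table = new LinkedHashMap<>();
        for (ValueSketch v : values) {
            String row = v.options.get(rowDimension);
            String column = v.options.get(columnDimension);
            if (row != null && column != null) {
                table.computeIfAbsent(row, k -> new LinkedHashMap<>())
                     .merge(column, v.value, Integer::sum);
            }
        }
        return table;
    }
}
```

Values that lack an option for a requested dimension are skipped here; a
real API would have to decide whether to skip, reject or bucket them.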

One missing piece of the puzzle (or required symmetry) is that I don't think
currently we name a dataelement which has *beneath* it a dataElementGroup or
set of groups.  But I suspect this could be implemented relatively easily.

Whereas the above might look like it is complicating the picture I think in
fact it can considerably simplify it in the long run.  The correct starting
point is to gather the requirements of what methods a Dimension should
have.  If there were to be a Dimension class and we knew nothing of
implementation details, what would Jason and Ola and others really require
of that class?  Then we do the dirty work in the concrete implementations.
Otherwise known as the sweep-it-under-the-carpet pattern :-)  Or what others
might call encapsulation.

Regards
Bob


>
>
> Finally I hope people who are troubled about the lack of documentation
> would use Jason's instructions and convert some of this newly discovered
> wisdom into... documentation.
>
>
> cheers
>
> Lars
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-devs
> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-devs
> More help   : https://help.launchpad.net/ListHelp
>
>
