
dhis2-devs team mailing list archive

Re: [Dhis-dev] DataElement -> PeriodType association

 

On 20 May 2010 18:39, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:

> On 20 May 2010 15:56, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
> > 2010/5/20 Ola Hodne Titlestad <olatitle@xxxxxxxxx>:
> >>
> >> 2010/5/20 Lars Helge Øverland <larshelge@xxxxxxxxx>
> >>>
> >>> Data elements derive their period type from the data sets they are
> >>> members of.
> >
> > Restated (what I just sent Lars only by mistake):  a datavalue derives
> > its period type from the data set of
> > which its data element is a member  :-)
> >
> >>
> >> And when they are members of two datasets with different period types
> >> they have multiple period types, right?
> >
> > It's important to remain aware that it is values ultimately which have
> > periods (and hence period types).
> >
> > And when you look at a value you can derive its period type in one of
> > two ways - via dataset or via period.  Potentially these could
> > disagree.  The one which derives from its period should be considered
> > authoritative, i.e. if the period is 2009-Jan then regardless of what
> > the dataset might say this really must be monthly.  Of course we hope
> > these always agree.  Incidentally the lookup from
> > dataelement-to-dataset-to-period looks more complex than
> > the lookup from period->periodType.
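The two lookup routes described above can be sketched roughly as follows. This is a minimal illustration only: the types `Period`, `DataSet` and `DataValue` and the methods `periodTypeOf` and `conflictingDataSetType` are hypothetical simplifications, not the DHIS 2 model or API.

```java
import java.util.Optional;

// Hypothetical, simplified model: names are illustrative, not the DHIS 2 API.
class PeriodTypeResolution {
    enum PeriodType { MONTHLY, QUARTERLY, YEARLY }

    record Period(String isoName, PeriodType type) {}
    record DataSet(String name, PeriodType type) {}
    record DataValue(String dataElement, Period period, DataSet enteredVia, double value) {}

    // The period is treated as authoritative: a value for 2009-Jan is monthly
    // no matter what the data set claims.
    static PeriodType periodTypeOf(DataValue v) {
        return v.period().type();
    }

    // Returns the data set's period type only when it disagrees with the period.
    static Optional<PeriodType> conflictingDataSetType(DataValue v) {
        PeriodType viaDataSet = v.enteredVia().type();
        return viaDataSet == periodTypeOf(v) ? Optional.empty() : Optional.of(viaDataSet);
    }
}
```

A non-empty result from `conflictingDataSetType` would flag exactly the disagreement case that one hopes never occurs.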
> >
> >>
> >> The key thing to look out for in data entry and data import is to avoid
> >> overlaps in data values that will cause duplication when aggregating
> >> data periods.
> >> E.g. if the SAME ORGUNIT registers values for the same data element for
> >> two different period types that have overlapping periods, e.g. Jan-10
> >> and Q1-10, then the aggregate values for Q1-10, Jan-June 2010, and 2010
> >> will all show an incorrect value since the value for Jan-10 is counted
> >> twice.
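To make the double counting concrete, here is a small sketch with made-up numbers. The `RawValue` type and the naive `aggregate` method are assumptions for illustration and do not reflect how DHIS 2 aggregates internally.

```java
import java.time.LocalDate;
import java.util.List;

// Illustrative only: a naive aggregation over hypothetical raw values.
class OverlapDoubleCount {
    // A raw value covering the inclusive date range [start, end].
    record RawValue(String orgUnit, String dataElement,
                    LocalDate start, LocalDate end, double value) {}

    // Naive aggregation: sum every raw value whose period lies inside the target range.
    static double aggregate(List<RawValue> values, LocalDate from, LocalDate to) {
        return values.stream()
                .filter(v -> !v.start().isBefore(from) && !v.end().isAfter(to))
                .mapToDouble(RawValue::value)
                .sum();
    }

    // Two periods overlap when each starts no later than the other ends.
    static boolean overlaps(RawValue a, RawValue b) {
        return !a.start().isAfter(b.end()) && !b.start().isAfter(a.end());
    }
}
```

If one orgunit stores 10 for Jan-10 and 30 for Q1-10 (a figure that already includes January), the naive Q1 aggregate comes out as 40 rather than 30.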
> >
> > OK.  That's a good concrete constraint to have.
> >
> >>
> >> One way to enforce this constraint is to monitor which datasets an
> >> orgunit is assigned to, and not allow orgunits to be assigned to two
> >> datasets that have the same data element AND different period types.
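A check along these lines might look like the sketch below. The `DataSet` record and the `conflicts` and `canAssign` methods are hypothetical names chosen for illustration. As noted further down, nothing like this is enforced today.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical assignment check; data elements are identified by name for brevity.
class DataSetAssignmentCheck {
    enum PeriodType { MONTHLY, QUARTERLY, YEARLY }

    record DataSet(String name, PeriodType periodType, Set<String> dataElements) {}

    // True when the two data sets share at least one data element
    // but collect at different frequencies.
    static boolean conflicts(DataSet a, DataSet b) {
        return a.periodType() != b.periodType()
                && !Collections.disjoint(a.dataElements(), b.dataElements());
    }

    // An orgunit may take on the candidate only if it conflicts with
    // none of the data sets already assigned to that orgunit.
    static boolean canAssign(List<DataSet> alreadyAssigned, DataSet candidate) {
        return alreadyAssigned.stream().noneMatch(existing -> conflicts(existing, candidate));
    }
}
```

Because the check is per orgunit, different provinces could still use the same data elements at different frequencies.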
> >
> > Agreed, though this constraint should probably be imposed on forms
> > rather than datasets.
> >
> >> As far as I am aware, we are not checking for this today. During data
> >> import it could be checked on data element level by looking up the
> >> period type the way Bob has shown, but that sounds like a lot of
> >> lookups and time-consuming validation, or?
> >
> > On data import we don't really validate at all, beyond whatever
> > constraints the db imposes. For efficiency we simply pop the values in
> > with multiple insert statements.  So this validation would have to
> > happen as a stage before the actual import or would have to be
> > constrained within the db.  In fact it can't be validated easily
> > before the import as it is dependent on existing values within the db.
> >
> >>
> >> A relatively normal use case that we probably have to find a way to
> >> support, and that I think they are struggling with in Vietnam, is that
> >> different provinces can use different period types for the same data
> >> elements (even for complete data sets). E.g. the national data flow
> >> policy says to report on immunisation data every quarter, so that
> >> becomes the minimum requirement for all provinces. Then some of the
> >> provinces decide that all their facilities have to collect this data
> >> monthly anyway, and then at the province level they simply send the
> >> quarterly aggregates to national level (in the paper-based or Excel
> >> world). At the same time other provinces just collect quarterly data at
> >> the facility level as in the minimum national requirement.
> >> At the national level there is a need to consolidate all this data,
> >> even data at the facility level, so ideally a national DHIS database
> >> should be able to store both monthly and quarterly raw data values for
> >> the same data elements, but for different orgunits. The national
> >> information users can then easily generate quarterly reports on
> >> immunisation for all provinces, while in some provinces they can do
> >> monthly data analysis if they want to collect data using that
> >> frequency.
> >>
> >> We support the above scenario by allowing the same data elements to be
> >> assigned to different data sets with different period types, but we
> >> don't control for misuse of this flexibility which can lead to
> >> duplication and inconsistent aggregated data values as pointed out
> >> above.
> >
> > Thinking further ... I really think the problem arises because we
> > have a dataset concept which represents a form and is also used to
> > constrain periodtypes on dataelements.  Thinking of the use case you
> > have just described, it should be the case that one can have a paper
> > form which national level expects to collect quarterly, and the same
> > form be used at a lower level to collect data monthly.  If we wanted
> > to mirror that use case electronically we would have to divorce the
> > form from the periodtype - i.e. a form would collect datavalues of a
> > certain period, but the same form could be used in different orgunits
> > for collecting data at a different frequency.
> >
> > So (leaving dataset aside for the moment) if we can't assign a
> > periodtype to a form, and we can't assign one to a dataelement, and
> > it's too inefficient to validate on a one-by-one datavalue basis, what
> > is a girl to do?
> >
> > I suspect the correct answer is to refactor datavalue and create a
> > datavalueset type - note: a set of datavalues rather than a set of
> > dataelements.  Designing out loud, a datavalueset would have the
> > following fields/attributes:
> >
> > 1.  a formid - the collection instrument used - roughly corresponds to
> > current dataset
> > 2.  an orgunitid - where the datavalues come from
> > 3.  a periodid - the period of all the datavalues
> > plus a couple of other useful attributes I can think of
> >
> > Datavalue now becomes slightly simpler (which is always a good thing).
> >  It only has:
> > value, dataelementid, categorycombooption, datasetid
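As a rough picture of the proposed shape, the sketch below uses hypothetical record and field names that follow the lists above. It is not an existing DHIS 2 schema. Here the values are held by their set, so no back-reference field is modelled on `DataValue`, which is a simplifying assumption.

```java
import java.util.List;

// Hypothetical field names following the lists above; not the DHIS 2 schema.
class DataValueSetSketch {
    // A single value no longer carries its own orgunit or period.
    record DataValue(String value, int dataElementId, int categoryOptionComboId) {}

    // One submission of a form: all contained values share form, orgunit and period.
    record DataValueSet(int formId, int orgUnitId, int periodId, List<DataValue> values) {
        DataValueSet {
            values = List.copyOf(values); // defensive, immutable copy
        }
    }
}
```

With this layout the period, and hence the period type, of every value is read from its owning set, so it cannot differ between values captured together.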
>
> Afterthought:
> At the risk of adding complexity to what is otherwise a
> simplification, my life could become even simpler if datavalueset also
> had a categorycombo attribute, which would imply that a dataset was
> linked to a formsectionid rather than a formid.
>
> So a form has sections.  sections have dataelements.  And sections
> have a datavalueset as a model - which implies a uniform categorycombo
> within the section.
>
> There isn't really a need for dataelements to have a categorycombo.
> And in lots of ways it's good that they don't. Then I am reducing
> complexity rather than adding to it :-)
>
> Consider one orgunit has collected malaria deaths disaggregated by
> age.  Another has collected values for the same dataelement, but
> not disaggregated by age.  The datavalues will come from a
> datavalueset so will have a categorycombo.  It is possible to
> aggregate or compare these datavalues, from different datavaluesets,
> but using the lowest common denominator of categorycombo, i.e. in both
> cases you have access to malaria deaths - in the one case you have to
> "roll-up" the categorycombo which does of course assume that the sum
> of category options make a sensible whole, but Ola has mentioned this
> one many times.
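The roll-up to a lowest common denominator could be sketched as below. The `rollUp` helper and the age-group keys used with it are illustrative assumptions, and the sum is only meaningful if the category options partition the whole.

```java
import java.util.Map;

// Illustrative roll-up of disaggregated values to a plain total.
class CategoryRollUp {
    // Values for one data element keyed by category option, e.g. age group.
    // Summing them yields the undisaggregated figure, assuming the options
    // together make a sensible whole.
    static double rollUp(Map<String, Double> byCategoryOption) {
        return byCategoryOption.values().stream()
                .mapToDouble(Double::doubleValue)
                .sum();
    }
}
```

An orgunit reporting malaria deaths by age group can then be compared with one that reports only a total, by comparing the rolled-up sum against that total.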
>
>
Some really interesting ideas you are bringing up here Bob. I like the kind
of flexibility and yet structure this would bring to the data model.

One quick question though:
How would this fit with the use of data elements and categorycombooptions in
metadata expressions like indicators and validation rules that are (and
should be) completely independent from data collection structures? E.g.
which categories and options should be available for a given data element
when setting up an indicator formula? All?

Ola
--------





> Regards
> Bob
>
> >
> > We can relatively efficiently validate that a dataset object is not
> > persisted which has the same formid, orgunitid and an overlapping
> > period.
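A sketch of that validation follows, reading "dataset object" here as the proposed datavalueset, which is an assumption. The `Key` record and `canPersist` method are hypothetical, and periods are modelled as inclusive date ranges.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical pre-persist check for the proposed datavalueset.
class DataValueSetOverlapCheck {
    // Identifying attributes of a datavalueset, with the period as a date range.
    record Key(int formId, int orgUnitId, LocalDate start, LocalDate end) {}

    // Inclusive ranges overlap when each starts no later than the other ends.
    static boolean overlaps(Key a, Key b) {
        return !a.start().isAfter(b.end()) && !b.start().isAfter(a.end());
    }

    // Reject a candidate that shares form and orgunit with an existing set
    // whose period overlaps the candidate's period.
    static boolean canPersist(List<Key> existing, Key candidate) {
        return existing.stream().noneMatch(e ->
                e.formId() == candidate.formId()
                        && e.orgUnitId() == candidate.orgUnitId()
                        && overlaps(e, candidate));
    }
}
```

The check runs once per set rather than once per value, which is where the efficiency gain would come from.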
> >
> > There is no longer any ambiguity about periodtype of a datavalue.
> >
> > stored_by, timestamp, comment might go either way.  Probably they need
> > to stay on datavalue.  I notice comment is rarely used but it's really
> > useful to have a comment on datavalueset for import purposes.
> >
> > 'nuff designing out loud. Got to go.
> >
> > Regards
> > Bob
> >
> >>
> >>
> >> Ola
> >> ---------
> >>
> >>>
> >>> On Thu, May 20, 2010 at 11:44 AM, Ola Hodne Titlestad <olatitle@xxxxxxxxx>
> >>> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> After Kim Anh's email about the use of the same data elements with
> >>>> different period types I dug up this old discussion from March 2009.
> >>>>
> >>>> What is the status on this work, or did we not conclude this?
> >>>>
> >>>> Ola
> >>>> ----------
> >>>>
> >>>> 2009/3/20 Bob Jolliffe <bobjolliffe@xxxxxxxxx>
> >>>>>
> >>>>> 2009/3/20 Lars Helge Øverland <larshelge@xxxxxxxxx>:
> >>>>> >
> >>>>> >>
> >>>>> >> Yes this is true.  But what do you think of the idea to enforce
> >>>>> >> DataSet membership having a default DataSet for all the
> >>>>> >> delinquents?  I'm not sure if it can be enforced by the schema,
> >>>>> >> but at least by the application.
> >>>>> >
> >>>>> > OK but what does this give us in terms of PeriodType-determining
> >>>>> > if this default DataSet has a null PeriodType?
> >>>>>
> >>>>> Nothing really.  The only effect would be you have an index on the
> >>>>> unassigned DataElements for what it's worth.  Mainly it would be
> >>>>> useful for determining easily the available DataElements which can
> >>>>> be added to a DataSet.  Maybe it's a nonsense idea - I was just
> >>>>> trying to think of ways to make editing DataSets reasonably
> >>>>> straightforward.
> >>>>>
> >>>>> >
> >>>>> >>
> >>>>> >> I don't know if it's about right or wrong.  There are pros and
> >>>>> >> cons of both approaches.  What you gain on the swings you lose
> >>>>> >> on the roundabouts :-)
> >>>>> >>
> >>>>> >> In the explicit case the application will have to enforce that
> >>>>> >> DataSet members all have the same periodType.
> >>>>> >>
> >>>>> >> In the implicit case the application will have to enforce that
> >>>>> >> DataElements can only be members of multiple groups if these
> >>>>> >> share the same PeriodType.
> >>>>> >>
> >>>>> >> The net result as far as the Data API is concerned can and must
> >>>>> >> be the same.  Perhaps we should define exactly what extra methods
> >>>>> >> we want in the API first.  We have already identified a few.
> >>>>> >> Then decide whether a database change is necessitated by these.
> >>>>> >
> >>>>> > Yes. We need at least a service method:
> >>>>> >
> >>>>> > Collection<DataElement> getDataElementsByPeriodType( PeriodType )
> >>>>> >
> >>>>> > and getter on the DataElement object:
> >>>>> >
> >>>>> > PeriodType getPeriodType()
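For the implicit case, the two methods might be derived from DataSet membership roughly as follows. This is only a sketch under assumptions: data elements are represented by name, the data sets are passed in explicitly, and the signatures therefore differ from the ones proposed above.

```java
import java.util.Collection;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical sketch of the implicit approach; not the DHIS 2 service API.
class PeriodTypeApiSketch {
    enum PeriodType { MONTHLY, QUARTERLY, YEARLY }

    record DataSet(String name, PeriodType periodType, Set<String> dataElements) {}

    // A data element's period type is the one shared by all data sets it belongs to.
    // Empty when it is in no data set; membership with mixed types is rejected.
    static Optional<PeriodType> getPeriodType(String dataElement, Collection<DataSet> dataSets) {
        Set<PeriodType> types = dataSets.stream()
                .filter(ds -> ds.dataElements().contains(dataElement))
                .map(DataSet::periodType)
                .collect(Collectors.toSet());
        if (types.size() > 1) {
            throw new IllegalStateException(dataElement + " has several period types: " + types);
        }
        return types.stream().findFirst();
    }

    // All data elements that are members of some data set with the given period type.
    static Collection<String> getDataElementsByPeriodType(PeriodType type, Collection<DataSet> dataSets) {
        return dataSets.stream()
                .filter(ds -> ds.periodType() == type)
                .flatMap(ds -> ds.dataElements().stream())
                .collect(Collectors.toCollection(TreeSet::new));
    }
}
```

In the explicit case `getPeriodType` would simply return a stored field, and the consistency check would move to the point where a DataSet's members are edited.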
> >>>>> >
> >>>>> >
> >>>>> > I guess we could make a branch, start coding and see how it works
> >>>>> > out.
> >>>>>
> >>>>> Sure.  So long as we are adding methods we won't be breaking anything
> >>>>> in terms of backward compatibility.  Just enforcing application level
> >>>>> constraints.  Then we can really encourage (enforce?) upper layers to
> >>>>> strictly interact with the data via the API.  Even if this might
> >>>>> occasionally mean making some lightweight API methods which bypass
> >>>>> the ORM.
> >>>>>
> >>>>> >
> >>>>> > Another issue would arise in the (exotic) situation where someone
> >>>>> > assigns a DataElement to a DataSet, enters data for it, then
> >>>>> > removes it from the DataElement. The data is there, but how do we
> >>>>> > deal with it in regard to the mentioned required functionality
> >>>>> > (trend analysis, datamart)?
> >>>>> >
> >>>>>
> >>>>> Yes this gets a bit weird (I presume you mean removes it from the
> >>>>> DataSet).  I'm guessing you haven't lost the data because the
> >>>>> dataValues each have a PeriodID which in turn is linked to a
> >>>>> PeriodType.  I suppose that (in such an exotic headspace)
> >>>>> DataElements can in fact change their PeriodTypes over time, though
> >>>>> I imagine it's not a great idea.
> >>>>>
> >>>>> The effect would be the same in the explicit relationship case, if
> >>>>> someone assigns a DataElement to a DataSet, enters data for it, then
> >>>>> changes the PeriodType of the DataElement ...
> >>>>>
> >>>>> Cheers
> >>>>> Bob
> >>>>>
> >>>>
> >>>
> >>
> >>
> >
>
