dhis2-users team mailing list archive
Message #00971
Re: Importing data from external system?
Thanks for all of these great ideas.
The web-api sounds most interesting now. I'll have to spend some time with
it. It may be a good way to ease the difficulty of correlating all of the
ids when importing external data.
Mark
On Sat, Mar 17, 2012 at 1:47 AM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
> On 17 March 2012 09:15, Saptarshi Purkayastha <sunbiz@xxxxxxxxx> wrote:
> > Just to add to that list of places, we are doing some integration of data
> > coming from Baobab's BART systems into DHIS2 here in Malawi. We discussed
> > many different methods of data import into DHIS2 and reached conclusions
> > on which solutions might be appropriate to which context when exchanging
> > data between systems.
> >
> > In Malawi, the datasets are fairly stable now and there is a central
> > DHIS2 system.
>
> Hi Saptarshi and all
>
> I think that's a really critical point. The early stages of implementation
> tend to see more extreme fluctuations as the codes, datasets and
> orgunit structures stabilize. These really do need to be stabilized to a
> certain extent before trying to link up various systems, to avoid
> reimplementing solutions over and over.
>
> Then there are distinct but related problems of (i) sharing structural
> metadata and (ii) sharing data between systems. In the simplest
> case structural metadata is just a dataset description, as you describe
> in your Baobab scenario. For that I am sure you are right - the web
> api is really well suited. And I suspect it will meet 80% of common
> use cases, i.e. systems reporting datasets into DHIS. Though I know
> Morten is at pains to point out that this API too is very recent and
> will be subject to some change, though probably not too fundamental.
>
> It does start to get more complex when you want to synchronize entire
> hierarchies, groupsets etc between systems. These problems are not
> yet really solved out-of-the-box and generally still require some
> innovative scripting of custom solutions. Some of these problems are
> being addressed in the ongoing design process of the web api.
>
> Then there is the question of communicating data. On the xml side
> there are currently two ways in, and a variety of formats supported or
> potentially supported. This needs to be both rationalised and better
> documented, but there are quite a few processes happening
> simultaneously:
>
> (i) the web api. Simple to use. Supports xml (and json?)
> datavaluesets. Uses uid identifiers. Datavalueset defined in the dxf2
> namespace. (See the sketch after this list.)
>
> (ii) the stream based dxf import in the legacy import-export module.
> Supports the dxf1 xml format which dhis2 currently produces on
> export, as well as a dxf2 datavalueset which still has minor
> differences from the format used in the web api. The most important of
> these is the ability to use codes (which might be externally
> assigned) as well as DHIS custom uids (a plus). It also currently only
> supports default category dataelements (a minus). For the moment, data
> import which uses different disaggregations cannot be done directly via
> this route.
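>
> For illustration, a rough sketch of route (i) in Python (using the
> requests library) might look something like the following. The base
> url, credentials, uids and period are placeholders, and the exact
> attributes may differ a little between versions:
>
> import requests
>
> # Hypothetical instance and credentials - replace with your own.
> BASE_URL = "https://dhis.example.org/api"
> AUTH = ("admin", "district")
>
> # A minimal dxf2 dataValueSet; the dataSet/orgUnit/dataElement uids
> # are placeholders and must exist in the receiving DHIS2 instance.
> DATA_VALUE_SET = """<?xml version="1.0" encoding="UTF-8"?>
> <dataValueSet xmlns="http://dhis2.org/schema/dxf/2.0"
>               dataSet="pBOMPrpg1QX" period="201203" orgUnit="DiszpKrYNg8">
>   <dataValue dataElement="f7n9E0hX8qk" value="12"/>
>   <dataValue dataElement="Ix2HsbDMLea" value="14"/>
> </dataValueSet>"""
>
> # POST the document to the dataValueSets resource and show the import
> # summary returned by the server.
> response = requests.post(
>     BASE_URL + "/dataValueSets",
>     data=DATA_VALUE_SET,
>     headers={"Content-Type": "application/xml"},
>     auth=AUTH,
> )
> print(response.status_code)
> print(response.text)
>
> Being plain http, this is easy to drive from a cron job or from the
> exporting system itself.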
>
> The other functionality of the stream based import is the ability to
> load a custom xslt transform for incoming xml, to transform it to
> either dxf1 or dxf2. This is the way, for example, that an sdmx-hd
> cross sectional dataset is imported as a dxf2 datavalueset, and it
> works well for that. In fact the basic schema of a dxf2 datavalueset
> is strongly (and not accidentally!) inspired by the sdmx-hd schema.
>
> In principle this does mean that any datavalueset in an xml format
> where the codes are somehow mappable can be imported.
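>
> To give a flavour of that route, here is a small sketch of applying a
> custom stylesheet to an incoming document before import, using Python
> with lxml (which is not part of DHIS2 itself). The file names, and the
> stylesheet they refer to, are made up for illustration:
>
> from lxml import etree
>
> # Hypothetical files: an export from a third-party system, and a custom
> # stylesheet that maps it onto a dxf2 datavalueset.
> source_doc = etree.parse("baobab_monthly_report.xml")
> stylesheet = etree.parse("baobab_to_dxf2.xsl")
>
> # Compile and apply the transform; the result should be a dxf2
> # datavalueset that the importer (or the web api) can accept.
> transform = etree.XSLT(stylesheet)
> dxf2_doc = transform(source_doc)
>
> # Serialise the transformed document ready for import.
> with open("datavalueset_dxf2.xml", "wb") as out:
>     out.write(etree.tostring(dxf2_doc, xml_declaration=True,
>                              encoding="UTF-8", pretty_print=True))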
>
> Outstanding issues which need to be solved (or solved better) as I see
> it in no particular order are:
> (i) harmonising of the xml in the web api and the import module
> (ii) better support for disaggregated data (without 3rd party systems
> having to 'understand' categoryoptioncombo)
> (iii) enhanced support for synching metadata between systems
> (iv) stabilization and documentation of APIs and schemas
>
> At the moment most interoperability problems are solvable but require
> navigating an overly complex labyrinth of undocumented and
> inconsistent functionality. To be fair, this has also been due to a
> lack of concrete use cases. I have been involved in a number of
> "synthetic" scenarios over the past few years where it has turned out
> that either the 3rd party system didn't really exist or the apparent
> use case wasn't really required at all :-)
>
> The situation overall has greatly improved over the past year with the
> introduction of uids, the possibility of using codes to map against 3rd
> party systems, and the beginnings of the web-api.
>
> I also have some useful meat now from working with Randy and team in
> Rwanda. And there is also a growing interest in interoperating with
> national facility registry software which may well become reality in
> some countries.
>
> I think it would be really, really useful to start collecting some of
> these existing use cases - particularly concrete ones such as those
> described by the contributors to this thread - in some more detail,
> including those which are straightforward, those which are doable but
> difficult, and those which seem to elude us at present.
>
> Regards
> Bob
>
> > The Baobab system also has a common set of reports that need
> > to be sent monthly. Hence the Baobab system uses DHIS2's web-api
> > dataValueSets resource to send data into DHIS. This is a simple XML
> > report of datavalues that have been aggregated monthly and are
> > reported anyway by the Baobab system.
> > - One needs to initially do a GET on the organization unit
> > - Then GET on the selected dataset (ANC Monthly in our case)
> > - Then GET to check the ids of the data elements in a dataset
> > - Then create a dataValueSets representation and POST this
> >
> > We are still testing this out for continuous integration, but it seems
> > easy and low-hanging fruit.
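> >
> > As a rough sketch of those four steps (Python with the requests
> > library against a hypothetical instance): the resource names follow
> > the web api (organisationUnits, dataSets, dataValueSets), but paging
> > behaviour and field names may differ between DHIS2 versions, and all
> > names and ids here are placeholders.
> >
> > import requests
> >
> > BASE_URL = "https://dhis.example.org/api"   # hypothetical instance
> > AUTH = ("admin", "district")                # hypothetical credentials
> >
> > def get_json(resource):
> >     """GET a web-api resource and return the parsed JSON."""
> >     r = requests.get(BASE_URL + "/" + resource + ".json", auth=AUTH)
> >     r.raise_for_status()
> >     return r.json()
> >
> > # 1. GET the organisation unit that will report (matching by name
> > #    client-side to avoid relying on a particular query syntax).
> > org_units = get_json("organisationUnits")["organisationUnits"]
> > org_unit = next(ou for ou in org_units if ou["name"] == "Example Clinic")
> >
> > # 2. GET the dataset being reported ("ANC Monthly" in this case).
> > data_sets = get_json("dataSets")["dataSets"]
> > anc_monthly = next(ds for ds in data_sets if ds["name"] == "ANC Monthly")
> >
> > # 3. GET the dataset detail to check the ids of its data elements
> > #    (the exact field holding them varies between versions).
> > anc_detail = get_json("dataSets/" + anc_monthly["id"])
> >
> > # 4. Build a dataValueSet from the monthly aggregates and POST it to
> > #    BASE_URL + "/dataValueSets" with Content-Type application/xml.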
> >
> > ---
> > Regards,
> > Saptarshi PURKAYASTHA
> >
> > My Tech Blog: http://sunnytalkstech.blogspot.com
> > You Live by CHOICE, Not by CHANCE
> >
> >
> > On 17 March 2012 09:05, Wilson,Randy <rwilson@xxxxxxx> wrote:
> >>
> >> Hi Mark,
> >>
> >>
> >>
> >> This is exactly what we’re doing in Rwanda. We’ve set up one instance of
> >> DHIS-2 as our HMIS (for routine data entry by health facilities across
> >> the country) and a second instance as a national data warehouse/dashboard
> >> – more intended for program managers, implementing partners and donors.
> >> Bob Jolliffe has been here helping us put together scripts to
> >> automatically synchronize sub-sets of the data between the two instances
> >> as new data is entered in the HMIS (I created a special dataset called
> >> datawarehouse in HMIS that gets pushed across). We’re also going to use
> >> the extended attributes for dataelements and indicators in the data
> >> warehouse instance to maintain our metadata dictionary with additional
> >> fields such as: primary data source, precise definition, intended use,
> >> staff responsible for collection, etc.
> >>
> >>
> >>
> >> Bringing data in from other systems is still not easy – though now that
> >> many of our other data sources are web enabled it is practical. As you
> >> note, you need to use the code field in each of the major data entities
> >> (dataelement, indicator, orgunit) that all systems share. It is not
> >> difficult to create a view of the period table that can be used to
> >> translate periodids when importing data – for example here is the sql
> >> that gives you the year, month and quarter for all periods in your
> >> period table:
> >>
> >>
> >>
> >> SELECT periodtype.name AS periodtype,
> >>        period.periodid,
> >>        period.startdate,
> >>        period.enddate,
> >>        date_part('Year'::text, period.startdate) AS periodyear,
> >>        date_part('month'::text, period.startdate) AS periodmonth,
> >>        CASE
> >>            WHEN date_part('month'::text, period.startdate) = ANY (ARRAY[1, 2, 3]) THEN 1
> >>            WHEN date_part('month'::text, period.startdate) = ANY (ARRAY[4, 5, 6]) THEN 2
> >>            WHEN date_part('month'::text, period.startdate) = ANY (ARRAY[7, 8, 9]) THEN 3
> >>            ELSE 4
> >>        END AS periodquarter
> >> FROM period, periodtype
> >> WHERE period.periodtypeid = periodtype.periodtypeid;
> >>
> >>
> >>
> >> Bob relies on DXF or similar XML import mechanisms – partly because of
> >> Postgres’ requirement to assign a unique id to each record across all
> >> tables, whose current value is maintained in the hibernate_sequence
> >> object – and it is definitely the safest way to go. I’ve found it is
> >> also relatively easy to do with a combination of Excel and a visual
> >> query designer like Access – linked to the Postgres tables – as long as
> >> I check and increment the current value before and after imports (and
> >> nobody else is working with the database)! Of course it depends upon how
> >> similar in structure your source data is to DHIS – otherwise you may
> >> need to do multiple transformations of the data beforehand. If you are
> >> using a lot of category combinations (age/gender, etc.) as opposed to
> >> just the default categorycombo, it is also more difficult, because they
> >> also need to be mapped to the categorycomboids.
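> >>
> >> As a minimal sketch of that check-and-increment step (Python with
> >> psycopg2; the connection details and the increment are placeholders,
> >> and it assumes the ids live in a Postgres sequence named
> >> hibernate_sequence as described above):
> >>
> >> import psycopg2
> >>
> >> # Hypothetical connection details for the DHIS2 database.
> >> conn = psycopg2.connect(dbname="dhis2", user="dhis",
> >>                         password="secret", host="localhost")
> >> cur = conn.cursor()
> >>
> >> # Check the current value of the shared id sequence before loading.
> >> cur.execute("SELECT last_value FROM hibernate_sequence;")
> >> (before,) = cur.fetchone()
> >> print("hibernate_sequence before import:", before)
> >>
> >> # ... perform the manual inserts here, assigning ids above `before` ...
> >>
> >> # Afterwards, move the sequence past the highest id assigned manually,
> >> # so the application does not hand out a duplicate id later.
> >> highest_manual_id = before + 1000   # placeholder: the real maximum used
> >> cur.execute("SELECT setval('hibernate_sequence', %s);",
> >>             (highest_manual_id,))
> >> conn.commit()
> >> cur.close()
> >> conn.close()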
> >>
> >>
> >>
> >> A drag and drop interface would be great… but we’re far from it now.
> >>
> >>
> >>
> >> Randy
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> From: dhis2-users-bounces+rwilson=msh.org@xxxxxxxxxxxxxxxxxxx
> >> [mailto:dhis2-users-bounces+rwilson=msh.org@xxxxxxxxxxxxxxxxxxx] On
> Behalf
> >> Of Mark Spohr
> >> Sent: Saturday, March 17, 2012 1:25 AM
> >> To: dhis2-users@xxxxxxxxxxxxxxxxxxx
> >> Subject: [Dhis2-users] Importing data from external system?
> >>
> >>
> >>
> >> DHIS seems to do a good job of importing data from another DHIS system.
> >> However, I would like to use DHIS as a data warehouse to suck up data
> >> from other systems in the country (vertical programs).
> >> I've spent some time looking at the xml format and it looks like it
> >> could be emulated by another system, but it will need to have the id
> >> codes for periods, facilities, data elements, etc., so it will be a bit
> >> tedious.
> >> Has anyone done work on this problem? I'm thinking of some tool to map
> >> the external data to the DHIS dataset which would allow a "drag and
> >> drop" match.
> >>
> >>
> >> Regards,
> >> Mark
> >>
> >> --
> >> Mark Spohr, MD
> >>
> >>
> >
> >
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-users
> Post to : dhis2-users@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-users
> More help : https://help.launchpad.net/ListHelp
>
--
Mark Spohr, MD
mhspohr@xxxxxxxxx
+1 530 554 2230