dhis2-users team mailing list archive

Re: Importing data from external system?


Yes, we use SDMX-HD to import data from an OpenMRS-based hospital system in
India.  We use it very simplistically (and therefore effectively), reading
SDMX-HD cross-sectional data messages without bothering to exchange the DSD.
The matching of codes is done manually at present because they are few,
but this needs to be enhanced to read the codelists.  There is an online demo
site of this, but I am not sure if it is still being used to develop
datasets.  If it is complete (I think it should be just about), I'll share
the URL and you can take a look.  It uses this module, which is not
cosmetically complete or "nice", but it produces valid SDMX-HD messages
which we can consume: https://github.com/hispindia/SDMXHDataExport

We have also used SDMX-HD to import data from iHRIS
(http://www.ihris.org/wiki/SDMX-HD_Data_Export_--_Kenya) and to import data
from OpenMRS using the module developed by Jembi.  This uses an older
approach which is not so flexible but has the advantage of supporting
categorycombo data.  Fixing up multidimensional data import for dxf2 is
still to be done.

I am going to complete the export of Rwanda PBF data in SDMX-HD this week
as well (hopefully), and will share that information when it's done.  As
there is no real API to that system, I am using a very simple approach:
pulling the data from the database and formatting it as an SDMX message.

There is a lot in the SDMX-HD standard which is excessive and over-complex;
it can be safely ignored while still remaining conformant to the standard.
If you look inside the DHIS source code in the import/export module you
will see that there is a very simple XSLT transform which is applied to the
incoming SDMX-HD cross-sectional message to produce a dxf2 datavalueset.
This is triggered automatically (by recognizing the root element in the
XML), so you just import the SDMX file in the same way you import any dxf.

On 19 March 2012 14:10, David Smith <dsmith11@xxxxxxxxxxx> wrote:

> Has anybody been able to leverage SDMX-HD to import data from other
> systems?
> On Sat, Mar 17, 2012 at 10:22 PM, Mark Spohr <mhspohr@xxxxxxxxx> wrote:
>> Thanks for all of these great ideas.
>> The web-api sounds most interesting now.  I'll have to spend some time
>> with it.  This may be a good way to ease the difficulty of correlating all
>> of the ids with the import of external data.
>> .Mark
>> On Sat, Mar 17, 2012 at 1:47 AM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>>> On 17 March 2012 09:15, Saptarshi Purkayastha <sunbiz@xxxxxxxxx> wrote:
>>> > Just to add to that list of places, we are doing some integration of
>>> > data coming from Baobab's BART systems into DHIS2 here in Malawi. We
>>> > discussed many different methods of data import into DHIS2 and reached
>>> > conclusions on which solutions might be appropriate in which context
>>> > when exchanging data between systems.
>>> >
>>> > In Malawi, the datasets are fairly stable now and there is a central
>>> > DHIS2 system.
>>> Hi Saptarshi and all
>>> I think that's a really critical point.  Early stage of implementation
>>> tends to see more extreme fluctuations as the codes and datasets and
>>> orgunit structures stabilize.  It really is a requirement to have
>>> these stabilized to a certain extent before trying to link up various
>>> systems to avoid reimplementing solutions over and over.
>>> Then there are distinct but related problems of (i) sharing structural
>>> metadata and (ii) sharing data between systems.  In the simplest
>>> case structural metadata is just a dataset description as you describe
>>> in your Baobab scenario.  For that I am sure you are right - the web
>>> api is really well suited.  And I suspect it will meet 80% of common
>>> use cases, i.e. systems reporting datasets into DHIS.  Though I know
>>> Morten is at pains to point out that this API too is very recent and
>>> will be subject to some change, though probably not too fundamental.
>>> It does start to get more complex when you want to synchronize entire
>>> hierarchies, groupsets etc. between systems.  These problems are not
>>> yet really solved out-of-the-box and generally still require some
>>> innovative scripting of custom solutions.  Some of these problems are
>>> being addressed in the ongoing design process of the web api.
>>> Then there is the question of communicating data.  On the xml side
>>> there are currently 2 ways in and a variety of formats supported or
>>> potentially supported.  This needs to be both rationalised and better
>>> documented but there are quite a few processes happening
>>> simultaneously:
>>> (i) the web api.  Simple to use.  Supports xml (and json?)
>>> datavaluesets.  Uses uid identifiers.  Datavalueset defined in dxf2
>>> namespace.
>>> (ii) the stream based dxf import in the legacy import-export module.
>>> Supports the dxf1 xml format which is currently produced on dhis2
>>> output as well as a dxf2 datavalueset which still has minor
>>> differences with the format used in web api.  The most important of these
>>> is the ability to use either codes (which might be externally
>>> assigned) or DHIS custom uids (a plus).  Also currently only
>>> supports default category dataelements (a minus).  For the moment data
>>> import which uses different disaggregations cannot be done directly via
>>> this route.
>>> The other functionality of the stream based import is the ability to
>>> load a custom xslt transform for incoming xml to transform it to
>>> either dxf1 or dxf2.  This is the way, for example, that an sdmx-hd
>>> cross sectional dataset is imported as a dxf2 datavalueset and it
>>> works well for that.  In fact the basic schema of a dxf2 datavalueset
>>> is strongly inspired (and not accidentally!) by the sdmx hd schema.
>>> In principle this does mean that any datavalueset in an xml format
>>> where the codes are somehow mappable can be imported.
>>> Outstanding issues which need to be solved (or solved better) as I see
>>> it in no particular order are:
>>> (i)  harmonising of the xml in the web api and the import module
>>> (ii) better support for disaggregated data (without 3rd party systems
>>> having to 'understand' categoryoptioncombo)
>>> (iii) enhanced support for synching metadata between systems
>>> (iv) stabilization and documentation of APIs and schemas
>>> At the moment most interoperability problems are solvable but require
>>> navigation of an over complex labyrinth of undocumented and
>>> inconsistent functionality.  To be fair, this has also been due to a
>>> lack of concrete use cases.  I have been involved in a number of
>>> "synthetic" scenarios over the past few years where it has turned out
>>> that either the 3rd party system didn't really exist or the apparent
>>> use case wasn't really required at all :-)
>>> The situation overall is greatly improved over the past year with the
>>> introduction of uids, the possibility to use codes to map against 3rd
>>> party systems and the beginnings of the web-api.
>>> I also have some useful meat now from working with Randy and team in
>>> Rwanda. And there is also a growing interest in interoperating with
>>> national facility registry software which may well become reality in
>>> some countries.
>>> I think it would be really, really useful to start collecting some of
>>> these existing use cases - particularly concrete ones such as
>>> described by the contributors to this thread - in some more detail.
>>> Including those which are straightforward, those which are doable but
>>> difficult and those which seem to elude us at present.
>>> Regards
>>> Bob
>>> > The Baobab system also has a common set of reports that need
>>> > to be sent monthly. Hence the Baobab system uses DHIS2's web-api
>>> > dataValueSets resource to send data into DHIS. This is a simple XML
>>> > report of datavalues that have been aggregated monthly and are reported
>>> > anyway by the Baobab system.
>>> >  - One needs to initially do a GET on the organization unit
>>> >  - Then GET on the selected dataset (ANC Monthly in our case)
>>> >  - Then GET to check the ids of the data elements in a dataset
>>> >  - Then create a dataValueSets representation and POST this
>>> >
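
The last of the steps above (building the dataValueSets representation) can be
sketched roughly as follows in Python.  This is only an illustration: the
namespace and the attribute names (dataSet, period, orgUnit, dataElement,
value) are assumptions about the dxf2 datavalueset format, the ids are
placeholders standing in for what the GET requests would return, and the POST
itself is left out.

```python
import xml.etree.ElementTree as ET

# Assumed dxf2 namespace; check it against your DHIS2 version.
DXF2_NS = "http://dhis2.org/schema/dxf/2.0"


def build_data_value_set(data_set, period, org_unit, values):
    """Build a dataValueSet XML string from {data element id: value} pairs."""
    root = ET.Element("dataValueSet", {
        "xmlns": DXF2_NS,
        "dataSet": data_set,
        "period": period,
        "orgUnit": org_unit,
    })
    for data_element, value in values.items():
        # One dataValue element per reported data element.
        ET.SubElement(root, "dataValue", {
            "dataElement": data_element,
            "value": str(value),
        })
    return ET.tostring(root, encoding="unicode")


# Placeholder ids; real ones would come from the GET requests described above.
payload = build_data_value_set(
    "DATASET_UID", "201203", "ORGUNIT_UID",
    {"DE_UID_1": 12, "DE_UID_2": 7},
)
```

The resulting string is what would be sent as the body of the POST to the
dataValueSets resource.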
>>> > We are still testing this out for continuous integration, but it seems
>>> > easy and low-hanging fruit.
>>> >
>>> > ---
>>> > Regards,
>>> > Saptarshi PURKAYASTHA
>>> >
>>> > My Tech Blog:  http://sunnytalkstech.blogspot.com
>>> > You Live by CHOICE, Not by CHANCE
>>> >
>>> >
>>> > On 17 March 2012 09:05, Wilson,Randy <rwilson@xxxxxxx> wrote:
>>> >>
>>> >> Hi Mark,
>>> >>
>>> >>
>>> >>
>>> >> This is exactly what we’re doing in Rwanda.  We’ve set up one
>>> >> instance of DHIS-2 as our HMIS (for routine data entry by health
>>> >> facilities across the country) and a second instance as a national
>>> >> data warehouse/dashboard – more intended for program managers,
>>> >> implementing partners and donors.  Bob Jolliffe has been here helping
>>> >> us put together scripts to automatically synchronize sub-sets of the
>>> >> data between the two instances as new data is entered in the HMIS (I
>>> >> created a special dataset called datawarehouse in HMIS that gets
>>> >> pushed across).  We’re also going to use the extended attributes for
>>> >> dataelements and indicators in the data warehouse instance to maintain
>>> >> our metadata dictionary with additional fields such as: primary data
>>> >> source, precise definition, intended use, staff responsible for
>>> >> collection, etc.
>>> >>
>>> >>
>>> >>
>>> >> Bringing data in from other systems is still not easy – though now
>>> >> that many of our other data sources are web enabled it is practical.
>>> >> As you note, you need to use the code field in each of the major data
>>> >> entities (dataelement, indicator, orgunit) that all systems share.  It
>>> >> is not difficult to create a view of the period table that can be used
>>> >> to translate periodids when importing data – for example here is the
>>> >> SQL that gives you the year, month and quarter for all periods in your
>>> >> period table:
>>> >>
>>> >>
>>> >>
>>> >> SELECT periodtype.name AS periodtype,
>>> >>        period.periodid,
>>> >>        period.startdate,
>>> >>        period.enddate,
>>> >>        date_part('Year'::text, period.startdate) AS periodyear,
>>> >>        date_part('month'::text, period.startdate) AS periodmonth,
>>> >>        -- quarter derived from the month of the period start date
>>> >>        CASE
>>> >>            WHEN date_part('month'::text, period.startdate) = ANY (ARRAY[1, 2, 3]) THEN 1
>>> >>            WHEN date_part('month'::text, period.startdate) = ANY (ARRAY[4, 5, 6]) THEN 2
>>> >>            WHEN date_part('month'::text, period.startdate) = ANY (ARRAY[7, 8, 9]) THEN 3
>>> >>            ELSE 4
>>> >>        END AS periodquarter
>>> >>   FROM period, periodtype
>>> >>  WHERE period.periodtypeid = periodtype.periodtypeid;
>>> >>
>>> >>
>>> >>
>>> >> Bob relies on DXF or similar XML import mechanisms – partly because of
>>> >> Postgres’ requirement to assign a unique id to each record across all
>>> >> tables, whose current value is maintained in the hibernate_sequence
>>> >> object, and it is definitely the safest way to go.  I’ve found it is
>>> >> also relatively easy to do with a combination of Excel and a visual
>>> >> query designer like Access - linked to the Postgres tables - as long as
>>> >> I check and increment the current value before and after imports (and
>>> >> nobody else is working with the database)!  Of course it depends upon
>>> >> how similar in structure your source data is to DHIS – otherwise you
>>> >> may need to do multiple transformations of the data beforehand.  If
>>> >> you are using a lot of category combinations (age/gender, etc.) as
>>> >> opposed to just the default categorycombo, it is also more difficult,
>>> >> because they also need to be mapped to the categorycomboids.
>>> >>
>>> >>
>>> >>
>>> >> A drag and drop interface would be great… but we’re far from it now.
>>> >>
>>> >>
>>> >>
>>> >> Randy
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> From: dhis2-users-bounces+rwilson=msh.org@xxxxxxxxxxxxxxxxxxx
>>> >> [mailto:dhis2-users-bounces+rwilson=msh.org@xxxxxxxxxxxxxxxxxxx] On
>>> >> Behalf Of Mark Spohr
>>> >> Sent: Saturday, March 17, 2012 1:25 AM
>>> >> To: dhis2-users@xxxxxxxxxxxxxxxxxxx
>>> >> Subject: [Dhis2-users] Importing data from external system?
>>> >>
>>> >>
>>> >>
>>> >> DHIS seems to do a good job of importing data from another DHIS
>>> >> system.  However, I would like to use the DHIS as a data warehouse to
>>> >> suck up data from other systems in the country (vertical programs).
>>> >> I've spent some time looking at the xml format and it looks like it
>>> >> could be emulated by another system but will need to have the id codes
>>> >> for periods, facilities, data elements, etc., so it will be a bit
>>> >> tedious.
>>> >> Has anyone done work on this problem?  I'm thinking of some tool to
>>> >> map the external data to the DHIS dataset which would allow a "drag
>>> >> and drop" match.
>>> >>
>>> >>
>>> >> Regards,
>>> >> Mark
>>> >>
>>> >> --
>>> >> Mark Spohr, MD
>>> >>
>>> >>
>>> >> _______________________________________________
>>> >> Mailing list: https://launchpad.net/~dhis2-users
>>> >> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>> >> Unsubscribe : https://launchpad.net/~dhis2-users
>>> >> More help   : https://help.launchpad.net/ListHelp
>>> >>
>>> >
>>> >
>> --
>> Mark Spohr, MD
>> mhspohr@xxxxxxxxx
>> +1 530 554 2230
