dhis2-devs team mailing list archive

Re: [Dhis2-users] Importing data from external system?

 

On 17 March 2012 09:15, Saptarshi Purkayastha <sunbiz@xxxxxxxxx> wrote:
> Just to add to that list of places, we are doing some integration of data
> coming from Baobab's BART systems into DHIS2 here in Malawi. We discussed
> many different methods of data import into DHIS2 and reached conclusions
> on which solutions might be appropriate in which context when exchanging data
> between systems.
>
> In Malawi, the datasets are fairly stable now and there is a central
> DHIS2 system.

Hi Saptarshi and all

I think that's a really critical point.  The early stages of an
implementation tend to see more extreme fluctuations as the codes,
datasets and orgunit structures stabilize.  It really is a requirement
to have these stabilized to a certain extent before trying to link up
various systems, to avoid reimplementing solutions over and over.

Then there are distinct but related problems of (i) sharing structural
metadata and (ii) sharing data between systems.  In the simplest
case structural metadata is just a dataset description as you describe
in your Baobab scenario.  For that I am sure you are right - the web
api is really well suited.  And I suspect it will meet 80% of common
use cases, i.e. systems reporting datasets into dhis.  Though I know
Morten is at pains to point out that this API too is very recent and
will be subject to some change, though probably nothing too fundamental.

It does start to get more complex when you want to synchronize entire
hierarchies, groupsets etc between systems.  These problems are not
yet really solved out-of-the-box and generally still require some
innovative scripting of custom solutions.  Some of these problems are
being addressed in the ongoing design process of the web api.

Then there is the question of communicating data.  On the xml side
there are currently 2 ways in and a variety of formats supported or
potentially supported.  This needs to be both rationalised and better
documented but there are quite a few processes happening
simultaneously:

(i) the web api.  Simple to use.  Supports xml (and json?)
datavaluesets.  Uses uid identifiers.  Datavalueset defined in dxf2
namespace.

(ii) the stream based dxf import in the legacy import-export module.
Supports the dxf1 xml format which is currently produced on dhis2
output as well as a dxf2 datavalueset which still has minor
differences from the format used in the web api.  The most important
of these is the ability to use either codes (which might be externally
assigned) or DHIS custom uids (a plus).  It also currently only
supports default category dataelements (a minus).  For the moment data
import which uses different disaggregations cannot be done directly by
this route.
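
As a rough illustration of what a datavalueset for route (i) might
look like, here is a minimal Python sketch.  The namespace URI and the
attribute names (dataSet, period, orgUnit, dataElement, value) are
assumptions about the dxf2 format and should be checked against the
DHIS version in use; the function name and all identifiers passed in
are placeholders.

```python
import xml.etree.ElementTree as ET

# Assumed dxf2 namespace URI; verify against your DHIS version.
DXF2_NS = "http://dhis2.org/schema/dxf/2.0"


def build_data_value_set(data_set, period, org_unit, values):
    """Return a dataValueSet document as an XML string.

    `values` maps a data element identifier to its reported value.
    """
    # Serialize the dxf2 namespace as the default (unprefixed) namespace.
    ET.register_namespace("", DXF2_NS)
    root = ET.Element(
        f"{{{DXF2_NS}}}dataValueSet",
        {"dataSet": data_set, "period": period, "orgUnit": org_unit},
    )
    for data_element, value in values.items():
        ET.SubElement(
            root,
            f"{{{DXF2_NS}}}dataValue",
            {"dataElement": data_element, "value": str(value)},
        )
    return ET.tostring(root, encoding="unicode")
```

Something like build_data_value_set("dsUid", "201203", "ouUid",
{"deUid1": 12}) would then give a document ready to send.  Values with
a non-default disaggregation would need an extra attribute per
datavalue, which this sketch leaves out.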

The other functionality of the stream based import is the ability to
load a custom xslt transform for incoming xml to transform it to
either dxf1 or dxf2.  This is the way, for example, that an sdmx-hd
cross sectional dataset is imported as a dxf2 datavalueset and it
works well for that.  In fact the basic schema of a dxf2 datavalueset
is strongly inspired (and not accidentally!) by the sdmx-hd schema.

In principle this does mean that any datavalueset in an xml format
where the codes are somehow mappable can be imported.
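
To make the idea of "mappable codes" concrete, here is a small Python
sketch.  It is not the xslt mechanism of the import module, just the
same kind of transformation done with the standard library.  The
<report>/<item> source format, the code tables and the function name
are all invented for illustration, and the dxf2 namespace is omitted
for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical code maps: external system code -> code registered in DHIS.
DATA_ELEMENT_CODES = {"ANC1": "ANC_1ST_VISIT", "ANC4": "ANC_4TH_VISIT"}
ORG_UNIT_CODES = {"F001": "MW_LILONGWE_HC"}


def transform_report(source_xml):
    """Convert the invented <report> format into a dataValueSet element.

    Returns the new element and a list of source codes with no mapping.
    """
    report = ET.fromstring(source_xml)
    out = ET.Element("dataValueSet", {
        "orgUnit": ORG_UNIT_CODES[report.get("facility")],
        # The source month "2012-03" becomes the period "201203".
        "period": report.get("month").replace("-", ""),
    })
    unmapped = []
    for item in report.findall("item"):
        code = item.get("code")
        if code not in DATA_ELEMENT_CODES:
            unmapped.append(code)  # report it rather than silently drop it
            continue
        ET.SubElement(out, "dataValue", {
            "dataElement": DATA_ELEMENT_CODES[code],
            "value": item.get("count"),
        })
    return out, unmapped
```

Collecting the unmapped codes rather than failing on the first one
makes it easier to see how complete a code mapping really is before
anything is imported.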

Outstanding issues which need to be solved (or solved better), as I see
it and in no particular order, are:
(i)  harmonising the xml in the web api and the import module
(ii) better support for disaggregated data (without 3rd party systems
having to 'understand' categoryoptioncombo)
(iii) enhanced support for syncing metadata between systems
(iv) stabilization and documentation of APIs and schemas

At the moment most interoperability problems are solvable but require
navigation of an overly complex labyrinth of undocumented and
inconsistent functionality.  To be fair, this has also been due to a
lack of concrete use cases.  I have been involved in a number of
"synthetic" scenarios over the past few years where it has turned out
that either the 3rd party system didn't really exist or the apparent
use case wasn't really required at all :-)

The situation overall has greatly improved over the past year with the
introduction of uids, the possibility to use codes to map against 3rd
party systems and the beginnings of the web-api.

I also have some useful meat now from working with Randy and team in
Rwanda. And there is also a growing interest in interoperating with
national facility registry software which may well become reality in
some countries.

I think it would be really, really useful to start collecting some of
these existing use cases - particularly concrete ones such as
described by the contributors to this thread - in some more detail,
including those which are straightforward, those which are doable but
difficult and those which seem to elude us at present.

Regards
Bob

> The Baobab system also has a common set of reports that need
> to be sent monthly. Hence the Baobab system uses DHIS2's web-api
> dataValueSets resource to send data into DHIS. This is a simple XML report
> of datavalues that have been aggregated monthly and are reported anyway by
> the Baobab system.
>  - One needs to initially do a GET on the organization unit
>  - Then GET on the selected dataset (ANC Monthly in our case)
>  - Then GET to check the ids of the data elements in a dataset
>  - Then create a dataValueSets representation and POST this
>
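
The four steps above could be sketched in Python roughly as follows.
The base URL, the exact resource paths and the function name are
assumptions for illustration only (check them against the web api of
the DHIS version in use), authentication is left out, and the requests
are only constructed here, not sent.

```python
import urllib.request

BASE_URL = "http://localhost:8080/api"  # hypothetical DHIS2 instance


def build_requests(org_unit_id, data_set_id, payload_xml):
    """Return the three lookups and the final POST, in order, unsent."""
    xml_accept = {"Accept": "application/xml"}
    lookups = [
        # 1. GET the organisation unit.
        urllib.request.Request(
            f"{BASE_URL}/organisationUnits/{org_unit_id}", headers=xml_accept),
        # 2. GET the selected dataset.
        urllib.request.Request(
            f"{BASE_URL}/dataSets/{data_set_id}", headers=xml_accept),
        # 3. GET the data elements to check their ids (assumed path).
        urllib.request.Request(
            f"{BASE_URL}/dataElements", headers=xml_accept),
    ]
    # 4. POST the dataValueSets representation.
    post = urllib.request.Request(
        f"{BASE_URL}/dataValueSets",
        data=payload_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )
    return lookups + [post]
```

Each request could then be passed to urllib.request.urlopen once
credentials have been added.
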
> We are still testing this out for continuous integration, but it seems
> easy, low-hanging fruit.
>
> ---
> Regards,
> Saptarshi PURKAYASTHA
>
> My Tech Blog:  http://sunnytalkstech.blogspot.com
> You Live by CHOICE, Not by CHANCE
>
>
> On 17 March 2012 09:05, Wilson,Randy <rwilson@xxxxxxx> wrote:
>>
>> Hi Mark,
>>
>>
>>
>> This is exactly what we’re doing in Rwanda.  We’ve set up one instance of
>> DHIS-2 as our HMIS (for routine data entry by health facilities across the
>> country) and a second instance as a national data warehouse/dashboard – more
>> intended for program managers, implementing partners and donors.  Bob
>> Jolliffe has been here helping us put together scripts to automatically
>> synchronize sub-sets of the data between the two instances as new data is
>> entered in the HMIS (I created a special dataset called datawarehouse in
>> HMIS that gets pushed across).  We’re also going to use the extended
>> attributes for dataelements and indicators in the data warehouse instance to
>> maintain our metadata dictionary with additional fields such as: primary
>> data source, precise definition, intended use, staff responsible for
>> collection, etc….
>>
>>
>>
>> Bringing data in from other systems is still not easy – though now that
>> many of our other data sources are web enabled it is practical.  As you
>> note, you need to use the code field in each of the major data entities
>> (dataelement, indicator, orgunit) that all systems share.  It is not
>> difficult to create a view of the period table that can be used to translate
>> periodids when importing data – for example here is the sql that gives you
>> the year, month and quarter for all periods in your period table:
>>
>>
>>
>> SELECT periodtype.name AS periodtype,
>>        period.periodid,
>>        period.startdate,
>>        period.enddate,
>>        date_part('Year'::text, period.startdate) AS periodyear,
>>        date_part('month'::text, period.startdate) AS periodmonth,
>>        CASE
>>            WHEN date_part('month'::text, period.startdate) = ANY (ARRAY[1, 2, 3]) THEN 1
>>            WHEN date_part('month'::text, period.startdate) = ANY (ARRAY[4, 5, 6]) THEN 2
>>            WHEN date_part('month'::text, period.startdate) = ANY (ARRAY[7, 8, 9]) THEN 3
>>            ELSE 4
>>        END AS periodquarter
>>   FROM period, periodtype
>>  WHERE period.periodtypeid = periodtype.periodtypeid;
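
On the sending side, the same year/month/quarter breakdown can be
computed from a period start date without touching the database.  A
minimal Python sketch follows.  The function name is made up, and the
"201203" and "2012Q1" strings assume the usual DHIS monthly and
quarterly period identifiers, which should be verified against the
target system.

```python
from datetime import date


def period_parts(start):
    """Break a period start date into year, month and quarter."""
    # Months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4 (same as the CASE).
    quarter = (start.month - 1) // 3 + 1
    return {
        "year": start.year,
        "month": start.month,
        "quarter": quarter,
        "monthly": f"{start.year}{start.month:02d}",   # e.g. 201203
        "quarterly": f"{start.year}Q{quarter}",        # e.g. 2012Q1
    }
```

For example, period_parts(date(2012, 3, 1)) gives quarter 1 with the
monthly identifier "201203".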
>>
>>
>>
>> Bob relies on DXF or similar XML import mechanisms – partly because of
>> the requirement to assign a unique id to each record across all tables
>> (the current value is maintained in the Postgres hibernate_sequence
>> object) – and it is definitely the safest way to go.   I’ve found it is
>> also relatively easy to do with a combination of Excel and a visual
>> query designer like Access – linked to the Postgres tables – as long as
>> I check and increment the current value before and after imports (and
>> nobody else is working with the database)!  Of course it depends upon
>> how similar in structure your source data is to DHIS – otherwise you may
>> need to do multiple transformations of the data beforehand.  If you are
>> using a lot of category combinations (age/gender, etc…) as opposed to
>> just the default categorycombo, it is more difficult, because they also
>> need to be mapped to the categorycomboids.
>>
>>
>>
>> A drag and drop interface would be great… but we’re far from it now.
>>
>>
>>
>> Randy
>>
>>
>>
>>
>>
>>
>>
>> From: dhis2-users-bounces+rwilson=msh.org@xxxxxxxxxxxxxxxxxxx
>> [mailto:dhis2-users-bounces+rwilson=msh.org@xxxxxxxxxxxxxxxxxxx] On Behalf
>> Of Mark Spohr
>> Sent: Saturday, March 17, 2012 1:25 AM
>> To: dhis2-users@xxxxxxxxxxxxxxxxxxx
>> Subject: [Dhis2-users] Importing data from external system?
>>
>>
>>
>> DHIS seems to do a good job of importing data from another DHIS system.
>> However, I would like to use the DHIS as a data warehouse to suck up data
>> from other systems in the country (vertical programs).
>> I've spent some time looking at the xml format and it looks like it could
>> be emulated by another system but will need to have the id codes for
>> periods, facilities, data elements, etc.  so it will be a bit tedious.
>> Has anyone done work on this problem? I'm thinking of some tool to map
>> the external data to the DHIS dataset which would allow a "drag and drop"
>> match.
>>
>>
>> Regards,
>> Mark
>>
>> --
>> Mark Spohr, MD
>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~dhis2-users
>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~dhis2-users
>> More help   : https://help.launchpad.net/ListHelp
>>
>
>
>

