
dhis2-users team mailing list archive

Re: Data Import tools for DHIS2

 

Hello Roger,

Thank you for sharing your experience; it will help us a lot.
We have defined all OrganisationUnit data manually except for the facility
level, and I have started entering the facility-level OrganisationUnit data
by hand. I need a more sophisticated approach than manual input, because we
receive Excel data from another system and expect regular updates once we
go into production.
Also, please refer to the e-mail sent to me by Samuel Cheburet (my
colleague); he attached a sample of the health information data to be
imported.

Thank you very much,
     Shinichi Suzuki

-----Original Message-----
From: Friedman, Roger (CDC/CGH/DGHA) (CTR) [mailto:rdf4@xxxxxxx]
Sent: Wednesday, November 10, 2010 3:20 AM
To: Shinichi Suzuki; Knut Staring
Cc: dhis2-users@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Dhis2-users] Data Import tools for DHIS2

Shinichi --
	I am migrating a legacy Access system into DHIS.
	For the org units, I am staging them in Access using a somewhat
extended version of the OrganisationUnit table, whose structure was
imported (install the Postgres or MySQL ODBC driver).  I am doing a fair
amount of data cleaning, the biggest deal being making the names unique
as the old system had separate tables at each level of the hierarchy, so
could handle a facility named Kaneshi in a subdistrict named Kaneshi in
a district named Kaneshi, while another district named Kaneshie existed
in a different region.  There is also the matter of determining what
facilities are still (or were ever) active, which I am doing by record
counts from the legacy system by year by facility by form/data table
(which correspond to each other in this system).  The only useful org
unit attributes are type and owner, for which I have built an org unit
group set/org unit group table.  Some other data like latitude and
longitude I am getting from an old site survey (I have hijacked GeoCode
to represent town); at the cost of matching the two sources of
organizational data, this other data allows me to validate some of the
data in the legacy system as well.  Once the data is clean, I use Access
procedures to write SQL scripts loading the org unit groups and group
sets, then loading source, org unit and group set member.   Then I go
into DHIS and do maintenance tasks to get the internal tables in synch
with the uploaded data.
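	As a rough illustration, the same script generation could be done
in Python instead of Access.  This is only a sketch; the staging file and
the table/column names (organisationunit, organisationunitid, name,
shortname, parentid) are my assumptions, so check them against the actual
schema before running anything it produces:

# Sketch: generate an SQL load script for org units from a staging CSV.
# Table and column names below are assumptions; verify against the schema.
import csv

def sql_quote(value):
    """Escape single quotes so the value is safe inside an SQL string literal."""
    return "'" + str(value).replace("'", "''") + "'"

with open("orgunits_staging.csv", newline="") as src, \
     open("load_orgunits.sql", "w") as out:
    for row in csv.DictReader(src):
        out.write(
            "INSERT INTO organisationunit "
            "(organisationunitid, name, shortname, parentid) "
            "VALUES (%s, %s, %s, %s);\n"
            % (row["id"], sql_quote(row["name"]),
               sql_quote(row["shortname"]), row["parentid"] or "NULL")
        )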
	Org unit Level and period I do manually via a command line or
visual database tool.
	The next step is to define category options and option combos,
data elements, computed variables and datasets.  This I do in DHIS2.
Then I link the Access staging DB to these tables to get the ids I need
to fill in data values and completedatasetregistration.  Again this is
done with generated SQL scripts, one for each form.  Some aggregation
takes place at this step due to differences in the organizational model
between these levels.  I do a custom form for the dataset and run the
dataset report which I can compare to the reports from the legacy
system.  Expect to run these import processes several times, so
eliminate old data for the same dataset and period before loading new.
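	For the re-run case, a small helper that emits the cleanup
statements ahead of the inserts keeps the generated script safe to run
repeatedly.  Again only a sketch, with the table names (datavalue,
datasetmembers, completedatasetregistration) and their columns assumed
rather than checked against the schema:

# Sketch: emit cleanup SQL so a re-run replaces rather than duplicates data.
# Table and column names are assumptions; check them before using this.
def cleanup_sql(dataset_id, period_id):
    return (
        "DELETE FROM datavalue WHERE periodid = %d AND dataelementid IN "
        "(SELECT dataelementid FROM datasetmembers WHERE datasetid = %d);\n"
        "DELETE FROM completedatasetregistration "
        "WHERE datasetid = %d AND periodid = %d;\n"
        % (period_id, dataset_id, dataset_id, period_id)
    )

# Example: prepend this to the generated insert script for one dataset/period.
print(cleanup_sql(12, 345))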
	This is all quite tedious, but I do a good bit of quality
control.  I am aware of Pentaho, but apart from the learning curve, which
people much better than I am have failed to scale, there is just no
substitute for exploratory data analysis to make sure your data is clean
before you start.  And by the way, don't try to write to the database
from Access via ODBC, it's dog slow and gets slower as you add more data
until it dies altogether in midstream.
	I am willing to send you or post some of this stuff, but it's so
specific I'm afraid it will not be worth the effort; just learn the data
model and hack away.
Good luck, Roger

-----Original Message-----
From: dhis2-users-bounces+rdf4=cdc.gov@xxxxxxxxxxxxxxxxxxx
[mailto:dhis2-users-bounces+rdf4=cdc.gov@xxxxxxxxxxxxxxxxxxx] On Behalf
Of Shinichi Suzuki
Sent: Tuesday, November 09, 2010 8:36 AM
To: 'Knut Staring'
Cc: dhis2-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Dhis2-users] Data Import tools for DHIS2

Hi Knut,

Thanks a lot; I am encouraged by your mail.
I will try to read the two home pages, but I feel more familiar with your
Access-based tools.
I understand your concern about modifying the database directly.
I think DHIS2 should have an API to do this and open it up; then your
concern would disappear.

Best regards,
    Shinichi Suzuki

-----Original Message-----
From: Knut Staring [mailto:knutst@xxxxxxxxx]
Sent: Monday, November 08, 2010 11:28 AM
To: Shinichi Suzuki
Cc: larshelge@xxxxxxxxx; dhis2-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: Data Import tools for DHIS2

Hi Shinichi,

You are completely right that it is important to be able to easily
import the orgunit hierarchy and historical data into DHIS2. The
general term for such operations is Extract, Transform and Load (ETL),
for which there are a number of powerful general tools available. Here
are two:
http://kettle.pentaho.com/
http://www.talend.com/index.php
I have personally also used ODBC connections in Access to transform
and load Excel data into Postgres directly.
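In Python, the equivalent of that ODBC step looks roughly like the sketch
below; the workbook layout, the staging table name and the connection
string are all made up for the example (xlrd and psycopg2 are the usual
libraries for .xls files and Postgres):

# Rough sketch: read rows from an Excel sheet and load them into a Postgres
# staging table. File name, sheet layout and table name are assumptions.
import xlrd          # reads legacy .xls workbooks
import psycopg2      # Postgres driver

book = xlrd.open_workbook("district_report.xls")
sheet = book.sheet_by_index(0)

conn = psycopg2.connect("dbname=dhis2 user=dhis")
cur = conn.cursor()
for i in range(1, sheet.nrows):               # skip the header row
    orgunit, element, period, value = sheet.row_values(i)[:4]
    cur.execute(
        "INSERT INTO datavalue_staging (orgunit, dataelement, period, value) "
        "VALUES (%s, %s, %s, %s)",
        (orgunit, element, period, value),
    )
conn.commit()
conn.close()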

A disadvantage of going directly into the database is that one loses
the validations that the DHIS2 import mechanism can perform (in
addition to the database constraints), and of course it is burdensome
to learn to use the above tools. DHIS2 has the capability to import
XML files, and thus also modern Excel files (.xlsx), and this should
probably be the common way. However, given the many different types of
data and vast range of potential data sources, we will probably never
have a simple wizard that does it all (and there is no sense in trying
to replicate Kettle or Talend).

Still, we have been thinking about defining a suitable standard format
for such import, as well as providing the user with some assistance in
transforming the data into shape for loading. We have also done some
work on extraction of data automatically from a large number of Excel
files (e.g. one or more per district) using Python. Work remains
before it reaches a stage where it is robust and generic
enough - but I think working with the Kenyan data could help move this
process forward. I will refresh my memory on the status of the Python
work and get back to you.
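
To give an idea of the approach, the extraction boils down to something
like the sketch below: walk over a folder of district workbooks and pull a
few fixed cells into one combined CSV. The cell positions and file layout
here are invented purely for illustration:

# Sketch of batch extraction from many Excel files into a single CSV.
# The cell positions and folder layout are invented for this example.
import csv, glob, xlrd

CELLS = {"opd_visits": (4, 2), "deliveries": (5, 2)}   # (row, col), assumed layout

with open("extracted.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["file"] + list(CELLS))
    for path in sorted(glob.glob("districts/*.xls")):
        sheet = xlrd.open_workbook(path).sheet_by_index(0)
        writer.writerow([path] + [sheet.cell_value(r, c) for r, c in CELLS.values()])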

Knut


On Mon, Nov 8, 2010 at 8:42 AM, Shinichi Suzuki <shin461@xxxxxxxxx>
wrote:
> Hello, Knut and dhis2_devs, users
>
> Do you have a data import tool written in Python and available for use?
> I need such a tool for the following areas.
> 1. Importing existing HIS Excel data into DHIS2. We want to load several
> years of historical data which has been collected into Excel files.
> 2. Importing from the Master Facility List (a web-based facility master
> maintenance system) into DHIS2 for facility data synchronization. The MFL
> can export its data to an Excel file, but it does not have a "Short name"
> field, so that needs to be added manually before the data can be imported.
>
> If you do not have such a tool, or it is not available for use, please
> advise me if possible.
> I believe this import capability is important for new users of DHIS, and
> it would also make inter-system connections more widely usable.
>
> Best regards,
>  Shinichi Suzuki
> -----------------------------------------------------
> Shinichi Suzuki
> MIS Division MOPHS: LG37 AFYA House, Nairobi, KENYA
> E-Mail: shin461@xxxxxxxxx
> Phone: 0712-754-963
> (JICA Senior Volunteer 21-4)
> -----------------------------------------------------
>
>
>
>



--
Cheers,
Knut Staring



_______________________________________________
Mailing list: https://launchpad.net/~dhis2-users
Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~dhis2-users
More help   : https://help.launchpad.net/ListHelp





