
dhis2-users team mailing list archive

Re: Importing DHS survey data in DHIS

 

OK, so ADX might even end up a bit slower if the conversion step outweighs any reduction in data volume. I might just stick with JSON.

@Jason: I also briefly thought about direct SQL injection, but I am wary of internal changes to the data model, which I would have to understand fully in the first place. Of course the APIs also change more than I expected, but at least that is announced :-D

Uwe
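
As an illustration of sticking with JSON on api/dataValueSets, here is a minimal Python sketch that groups values into a few large payloads instead of sending one request per value (as Alex suggests further down the thread). The function names, the batch size and the sample identifiers are made up for this example; only the top-level "dataValues" list and its field names follow the dataValueSets JSON format. Actually POSTing each payload (HTTP client, authentication, base URL) is left out, since that depends on the setup.

```python
import json
from itertools import islice
from typing import Dict, Iterable, Iterator, List

DataValue = Dict[str, str]


def chunked(rows: Iterable[DataValue], size: int) -> Iterator[List[DataValue]]:
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def build_payloads(rows: Iterable[DataValue], batch_size: int = 10000) -> Iterator[str]:
    """Serialize rows into dataValueSets-style JSON documents, one per batch.

    batch_size is an arbitrary illustrative default, not a DHIS2 recommendation.
    """
    for batch in chunked(rows, batch_size):
        yield json.dumps({"dataValues": batch})


if __name__ == "__main__":
    # Hypothetical identifiers; real ones would be DHIS2 UIDs or codes.
    sample = [
        {"dataElement": "de1", "period": "201601", "orgUnit": "ou1",
         "categoryOptionCombo": "coc1", "value": str(i)}
        for i in range(5)
    ]
    for payload in build_payloads(sample, batch_size=2):
        print(len(json.loads(payload)["dataValues"]), "values in this request body")
```

Each yielded string would be the body of a single POST, so the number of requests is governed by the batch size rather than by the number of values.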

---


On 02.02.2016 at 19:49, Lars Helge Øverland wrote:
Hi Uwe,

ADX will not be faster than DXF: an ADX stream is first converted into DXF and then passed on to the regular importer.

Lars

On Tue, Feb 2, 2016 at 5:33 PM, Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote:

    This was a very trivial lab test, so not really conclusive at all. I
    would just give it a try and see. If you see differences, please
    let the devs know.

    Given the scale of what you are attempting, have you considered
    using direct SQL injection? Not that I am recommending that route
    as there are many pitfalls, but it might be an option if
    implemented properly, especially considering your reported
    architecture.

    Regards
    Jason


    On Tue, Feb 2, 2016, 17:04 Uwe Wahser <uwe@xxxxxxxxx> wrote:

        Hi Jason,

        thanks for sharing the links. From a quick glance, I see you are
        also experimenting with the ADX API. Did you observe any
        significant performance differences between the ADX and
        dataValueSets APIs?

        Regards,

        Uwe

        > On 2 February 2016 at 18:21, Jason Pickering
        > <jason.p.pickering@xxxxxxxxx> wrote:
        >
        >
        > Hi Olav,
        > I have not worked with the DHS API per se, but have imported lots
        > of data using the same approach which they outline here (
        > http://api.dhsprogram.com/#/samples-r.cfm)
        >
        > I have written up a walkthrough of getting data out of one DHIS
        > instance and into another one, and I think the basic principles
        > would be the same (
        > http://rpubs.com/jason_p_pickering/139589)
        >
        > Metadata needs to be mapped (or created), the data needs to be
        > reshaped, and correctly formatted.
        >
        > It should not be too difficult. I used R, but there are other
        > examples with Python and JavaScript on their examples page.
        >
        > Regards,
        > Jason
        >
        >
        > On Tue, Feb 2, 2016 at 3:31 PM, Alex Tumwesigye
        > <atumwesigye@xxxxxxxxx> wrote:
        >
        > > Dear Uwe,
        > >
        > > Have you tried sending the data via the api/dataValueSets
        > > endpoint? It may be faster. Just stage your data and push it once.
        > >
        > > http://dhis2.github.io/dhis2-docs/master/en/developer/html/ch01s13.html#d5e1372
        > >
        > > Also note how you send it: I have seen curl take ages to submit
        > > individual values via the API. You need to send it as one file in
        > > one request, or implement concurrency.
        > >
        > > Alex
        > >
        > > On Tue, Feb 2, 2016 at 5:13 PM, Olav Poppe <olav.poppe@xxxxxx> wrote:
        > >
        > >> Hi Randy and Uwe,
        > >> thanks, interesting to hear your experiences. Uwe, what you are
        > >> working on sounds quite a bit more complicated, and not least
        > >> involves far more data. I imagine that with household surveys, it
        > >> would be a matter of < 100 indicators for < 200 orgunits for 2-3
        > >> periods, i.e. a fraction of what you are dealing with!
        > >>
        > >> Olav
        > >>
        > >>
        > >>
        > >>
        > >>
        > >>
        > >> On 31 Jan 2016 at 09:29, uwe wahser <uwe@xxxxxxxxx> wrote:
        > >>
        > >> Hi Olav & Randy,
        > >>
        > >> I am currently banging on kettle (aka Pentaho DI) to extract data
        > >> from a source system (SQL-ERP in our case) into DHIS2 dataSets in
        > >> JSON format. In our current test scenario (2 dataElements in a
        > >> dataSet with a categoryCombination of 5 categories) we are
        > >> updating approx. 4 million dataValues every night in a
        > >> pseudo-delta mode: reading all data from the source, comparing it
        > >> to what is already in DHIS2, then pushing only the records for
        > >> creating, updating or deleting dataValues into the API (approx.
        > >> 150k per night in 1 hour; the initial load took 7 hours). We
        > >> still have to prove that this is feasible when setting up the
        > >> first real-life dataSet, where there will be more categories and
        > >> more dataElements, thus exploding the number of dataValues.
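
The pseudo-delta comparison described above can be sketched roughly as follows in Python. This is only an illustration, not the kettle transformation itself: it assumes both sides have already been loaded into dictionaries keyed by a hypothetical (dataElement, period, orgUnit, categoryOptionCombo) tuple, and the function name and sample identifiers are made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical key identifying one dataValue:
# (dataElement, period, orgUnit, categoryOptionCombo)
Key = Tuple[str, str, str, str]


def compute_delta(
    source: Dict[Key, str], target: Dict[Key, str]
) -> Tuple[Dict[Key, str], Dict[Key, str], List[Key]]:
    """Compare a full source extract with what the target already holds.

    Returns (creates, updates, deletes):
      creates - keys present only in the source
      updates - keys present in both, but with a different value
      deletes - keys present only in the target
    """
    creates = {k: v for k, v in source.items() if k not in target}
    updates = {k: v for k, v in source.items() if k in target and target[k] != v}
    deletes = [k for k in target if k not in source]
    return creates, updates, deletes


if __name__ == "__main__":
    source = {
        ("de1", "201601", "ou1", "coc1"): "5",   # changed value
        ("de1", "201601", "ou2", "coc1"): "7",   # new in source
    }
    target = {
        ("de1", "201601", "ou1", "coc1"): "4",
        ("de1", "201601", "ou3", "coc1"): "9",   # gone from source
    }
    creates, updates, deletes = compute_delta(source, target)
    print(len(creates), "to create,", len(updates), "to update,", len(deletes), "to delete")
```

Only the three resulting sets would then need to be sent to the API, which is what keeps the nightly run small compared with the initial load.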
        > >>
        > >> Getting there was a bit painful, but now it seems to work. I
        > >> chose kettle instead of Talend ETL (both open source) as it
        > >> seemed easier to get used to. However, from a data warehouse
        > >> perspective I'd prefer DHIS2 to offer some sort of integrated ETL
        > >> landscape in the long run, which would also allow aggregating
        > >> data from tracker into dataSets, tracker to tracker, dataSets to
        > >> dataSets etc.
        > >>
        > >> The current version of our kettle transformations and jobs was
        > >> designed to be generic (not for a specific dataSet, but you have
        > >> to design your own extractor, which could be a simple CSV reader
        > >> or maybe a DHS API call). If you are interested, I will share
        > >> them. Just be aware that they are currently in a very early and
        > >> rough state and not documented. You'd have to bring along the
        > >> willingness to dig into kettle and be pain resistant to a
        > >> certain degree :-)
        > >>
        > >> I'd be interested to hear about others' experiences ...
        > >>
        > >> Have a nice Sunday,
        > >>
        > >> Uwe
        > >>
        > >> ---
        > >>
        > >> On 29.01.2016 at 17:31, Wilson, Randy wrote:
        > >>
        > >> Not here unfortunately... just doing CSV imports from DHS Excel
        > >> files. Would be useful for our data warehouse.
        > >> Randy
        > >> On Jan 29, 2016 2:59 PM, "Olav Poppe" <olav.poppe@xxxxxx> wrote:
        > >>
        > >>> Hi all,
        > >>> I wanted to hear if anyone has any experience with the DHS API (
        > >>> http://api.dhsprogram.com/#/index.html), and using it to import
        > >>> survey results into DHIS?
        > >>>
        > >>> Olav
        > >>>
        > >>> _______________________________________________
        > >>> Mailing list: https://launchpad.net/~dhis2-users
        > >>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
        > >>> Unsubscribe : https://launchpad.net/~dhis2-users
        > >>> More help   : https://help.launchpad.net/ListHelp
        > >>>
        > >>>
        > >
        > >
        > > --
        > > Alex Tumwesigye
        > >
        > > Technical Advisor - DHIS2 (Consultant),
        > > Ministry of Health/AFENET
        > > Kampala
        > > Uganda
        > >
        > > IT Consultant - BarefootPower Uganda Ltd, SmartSolar, Kenya
        > >
        > > IT Specialist (Servers, Networks and Security, Health Information
        > > Systems - DHIS2 ) & Solar Consultant
        > >
        > > +256 774149 775, + 256 759 800161
        > >
        > > "I don't want to be anything other than what I have been - one
        > > tree hill "
        > >
        >
        >
        > --
        > Jason P. Pickering
        > email: jason.p.pickering@xxxxxxxxx
        > tel:+46764147049




--
Lars Helge Øverland
Lead developer, DHIS 2
University of Oslo
Skype: larshelgeoverland
http://www.dhis2.org


