← Back to team overview

dhis2-users team mailing list archive

Re: Cross database querying for updating data

 

Hi there. I think the big challenge here is what is not there, as opposed
to what needs to be added.   DHIS2 has no functionality for "soft deletes",
meaning when you mark a record with a column like "deletedon" with a
timestamp for deletion, but you actually do not remove it from the
database. This is a common problem with DXF syncronizations, as the
upstream system does not transmit DELETES. It only transmits INSERTS and
UPDATES.

At first glance, in your case, it would seem to be pretty easy, since you
can use the "lastupdated" of the data value table to parse out values which
have been added since your last synchronization. The problem however is to
determine what has been DELETED from the data value table.Well, that is
more tricky, and would require you diff both data values tables to
determine what changes you need to transmit to your downstream system.

There are tools out there, to help determine the "diff" between two
databases, which I think in your case would require you to have copy of
your remote DB and then diff it against your local DB. You would then
prepare a diff, make the update to your local copy and then you remote
copy.

I think that would really be the only real way of doing this, using a tool
like perhaps these (
http://stackoverflow.com/questions/4804779/how-to-check-difference-between-two-databases-in-p)
other other commercial alternatives to diff the two databases, and prepare
a series of INSERT and DELETE statements. Never attempted it myself, but it
might work.

Does not seem really that trivial. :(

Regards,
Jason


On Mon, Apr 20, 2015 at 6:59 PM Knut Staring <knutst@xxxxxxxxx> wrote:

> I think I would trust UIDs or Codes more than database IDs
>
> On Mon, Apr 20, 2015 at 10:48 PM, Muyepa A. <muyepaa@xxxxxxxxx> wrote:
>
>> Indeed Database A has extra data not available in B while  B has data
>> missing in A.  No changes have been made to orgunits, dataelements,
>> periods or users. The delta is in data-values. I would really want to get
>> views on whether the operation described (all other parameters being equal)
>> will bring the missing data into A, and whether it may affect report
>> summary rates.
>> I cannot easily download the big databases. Something
>> on command-line would be very suitable for this circumstance  and
>> transparent to users.
>>
>> On Mon, Apr 20, 2015 at 5:08 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx>
>> wrote:
>>
>>> If database B is restored off a backup of database A then it will
>>> probably work with some caveats.  Clearly they are not exactly the
>>> same because you've got extra data in A which you are trying to push
>>> into B.  So it depends really on what else might have changed in A.
>>>
>>> If you have made any changes to orgunits, datelements or periods then
>>> that will cause at best a database integrity violation when you try to
>>> import that way.  At worst cause a very horrible mess.
>>>
>>> Even though the process is fragile, there is a related use case where
>>> one might need to do such a thing or something similar.  If a country
>>> system is hosted somewhere in the cloud (because they have poor or
>>> expensive local bandwidth) then there is a real problem to solve about
>>> getting data backups from the server back into the country.
>>>
>>> In particular we'd like to be able to pull not all the datavalues but
>>> only those which have been changed ie. do an incremental backup of the
>>> datavalues table.  The same could apply to events.
>>>
>>> One possible approach might be to first do a metadata backup and
>>> restore.  And then pull the changed datavalues - though you would have
>>> to do this through the api with uids or codes as the metadata backup
>>> and restore would have ignored the database identifiers.
>>>
>>> Yet another approach would be to take a reduced postgres backup
>>> excluding the datavalues table, restore that and then try and pull in
>>> the delta of datavalues table something like you describe.
>>>
>>> Its a tough problem.  Jason, what do you suggest?  Downloading the
>>> full postgres backup (with ephemeral tables like analytics excluded)
>>> can be a serious challenge for many countries to do with regularity
>>> and reliability.  So how best to go about incremental backup and
>>> restore of data?
>>>
>>> On 20 April 2015 at 15:36, Muyepa A. <muyepaa@xxxxxxxxx> wrote:
>>> > I had thought of  data import/export functionality, however this will
>>> > require two instances of tomcat to be running. (or one instance with
>>> one
>>> > database at a time, however there will be an associated down time).
>>> > Since upgrading to 2.18, when exported, the resultant download zip
>>> file says
>>> > invalid for all formats (xml, json, excel) and is only 2kb in size
>>> > regardless of the hierarchy I choose.
>>> >
>>> > Since one database is copy of the other, i thought I could easily
>>> update
>>> > from the other as described.
>>> >
>>> > On Mon, Apr 20, 2015 at 3:48 PM, Jason Pickering
>>> > <jason.p.pickering@xxxxxxxxx> wrote:
>>> >>
>>> >> It may work, but in general, it will not because the internal IDs of
>>> >> different DHIS2 databases are not the same. So, it might work, but it
>>> >> requires both databases to be in essentially the same state. Better
>>> to use
>>> >> the data import/export functionality if you can.
>>> >>
>>> >> Regards,
>>> >> Jason
>>> >>
>>> >>
>>> >> On Mon, Apr 20, 2015 at 3:28 PM Muyepa A. <muyepaa@xxxxxxxxx> wrote:
>>> >>>
>>> >>> Can I safely update DHIS2 database A with data from DHIS database B
>>> both
>>> >>> on the same PostgreSQL server by cross query. B is backup of A
>>> >>>
>>> >>> From database B, I can retrieve and save the data that is required
>>> using:
>>> >>>
>>> >>> psql B -c "\copy (select * from datavalue where  periodid='xxxx') TO
>>> >>> STDOUT" > /tmp/data.tsv;
>>> >>>
>>> >>>
>>> >>> And to restore:
>>> >>> psql A -c "\copy datavaue (x, y, z, ...) FROM /tmp/data.tsv"
>>> >>>
>>> >>>
>>> >>> Is this recommended, and what else should be cross  transferred.
>>> >>>
>>> >>>
>>> >>>
>>> >>> _______________________________________________
>>> >>> Mailing list: https://launchpad.net/~dhis2-users
>>> >>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>> >>> Unsubscribe : https://launchpad.net/~dhis2-users
>>> >>> More help   : https://help.launchpad.net/ListHelp
>>> >
>>> >
>>> >
>>> > _______________________________________________
>>> > Mailing list: https://launchpad.net/~dhis2-users
>>> > Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>> > Unsubscribe : https://launchpad.net/~dhis2-users
>>> > More help   : https://help.launchpad.net/ListHelp
>>> >
>>>
>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~dhis2-users
>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~dhis2-users
>> More help   : https://help.launchpad.net/ListHelp
>>
>>
>
>
> --
> Knut Staring
> Dept. of Informatics, University of Oslo
> Norway: +4791880522
> Skype: knutstar
> http://dhis2.org
>  _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-users
> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-users
> More help   : https://help.launchpad.net/ListHelp
>

Follow ups

References