
dhis2-devs team mailing list archive

Re: [Bug 1597724] Re: Import of GML file update lastupdated date of all objects with co-ordinates

 

Elmarie,

It is a weakness of ANY import process in DHIS2 that there is no
differentiation between actual updates - i.e. where an imported value is
different from the existing one - and imported data records that match
existing data records with identical values. All imported records matched
to existing records are regarded as "updates", and these are only
distinguished from valid records with no match ("new" records), where
"valid" means having a valid primary key etc.

DHIS 1.4 provides far more fine-grained control over any import process,
because imported data records are automatically identified as one of:
- New data records
- Existing data records with different value and where the lastupdated is
newer (i.e. source data value has been updated more recently than the
destination database value)
- Existing data records with different value and where the lastupdated is
older (i.e. the destination database value has been updated more recently
than the source database value)
- Existing data records with identical value (lastupdated value is in that
case not relevant).
During the import process, the user can review these different categories
of imported data and decide whether to accept or reject them. That is not
possible with the DHIS2 import model - everything in the imported file that
fits existing primary keys will be imported/updated automatically.
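
To make the four 1.4 categories concrete, here is a minimal sketch in Python. It is not taken from DHIS 1.4 or DHIS2 code; the names `Record`, `Category` and `classify` are hypothetical, and treating equal timestamps as "source not newer" is an assumption made for illustration only.

```python
from datetime import datetime
from enum import Enum
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """Hypothetical data record: a value plus its lastupdated timestamp."""
    value: str
    lastupdated: datetime


class Category(Enum):
    NEW = "new"
    CHANGED_SOURCE_NEWER = "different value, source updated more recently"
    CHANGED_SOURCE_OLDER = "different value, destination updated more recently"
    IDENTICAL = "identical value"


def classify(source: Record, existing: Optional[Record]) -> Category:
    """Place one imported record into one of the four categories."""
    if existing is None:
        # No record with a matching primary key in the destination.
        return Category.NEW
    if source.value == existing.value:
        # lastupdated is not relevant when the values are identical.
        return Category.IDENTICAL
    if source.lastupdated > existing.lastupdated:
        return Category.CHANGED_SOURCE_NEWER
    # Assumption: equal timestamps are treated as "source not newer".
    return Category.CHANGED_SOURCE_OLDER
```

With something like this, an importer could present each category to the user for review, and touch lastupdated only for records that end up in one of the two "different value" categories or in the "new" category.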

I'm not 100% sure why the core team opted for this very "basic" import
methodology, but I suspect it's largely related to the assumption that most
DHIS2 instances will be national single instances without much need for
import or export of data. I suspect that changing the import methodology to
give users the same high-granularity control as in 1.4 would require a
major effort, but changing only the treatment of the lastupdated field
should be a lot simpler - presumably.

Halvdan's suggestion that "if you wish to avoid this you can give the
importer a GML file which only contains the orgunits you actually wish to
update" is not a very practical option, regrettably. If, as in our typical
case, you have somewhere between 5,000 and 40,000 coordinates with
additions and corrections taking place regularly through various external
spatial databases, your primary option will be to access the database
instance directly and run queries that will identify any new or updated
co-ordinates by comparing the current external data set with whatever is
currently stored in the instance. After identifying the sub-set of new or
updated co-ordinates that way, you can then generate a new GML file OR you
might rather use the same queries to update the organisationunit table
directly.
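
The comparison step can be sketched roughly as follows, in Python rather than SQL. This is only an illustration: it assumes both data sets have already been loaded into dictionaries keyed by an orgunit identifier, and the names `diff_coordinates` and `tolerance` are hypothetical, not part of DHIS2.

```python
from typing import Dict, Optional, Tuple

Coord = Tuple[float, float]  # (longitude, latitude)


def diff_coordinates(
    external: Dict[str, Coord],
    stored: Dict[str, Optional[Coord]],
    tolerance: float = 1e-6,
) -> Tuple[Dict[str, Coord], Dict[str, Coord]]:
    """Return (new, changed) coordinates from the external data set.

    'new' holds orgunits with no coordinate currently stored;
    'changed' holds orgunits whose stored coordinate differs from the
    external one by more than the tolerance on either axis.
    """
    new: Dict[str, Coord] = {}
    changed: Dict[str, Coord] = {}
    for uid, coord in external.items():
        current = stored.get(uid)
        if current is None:
            new[uid] = coord
        elif (abs(coord[0] - current[0]) > tolerance
              or abs(coord[1] - current[1]) > tolerance):
            changed[uid] = coord
        # Identical coordinates are skipped, so lastupdated stays untouched.
    return new, changed
```

Only the records in `new` and `changed` would then go into the reduced GML file, or into the direct update of the organisationunit table.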

(Another, more theoretical, option would be that somebody keeps track of all
such changes in those external spatial databases and GIS systems, but since
spatial data development processes typically incorporate a multitude of
public and private organisations, it's again not practical to keep track of
all the updates and additions they generate).

For now, I guess the most efficient method will be to drop using GML file
imports and instead update our PostgreSQL database instances directly with
ACTUAL updates. It's not a very attractive option because it reduces
security and increases the possibility of somebody making crippling updates
to the database. Alternatively we use the database queries to identify new
and actually updated coordinates, and then go through the process of
generating a GML file with only those records (the latter method is more
cumbersome, but it will work with Production instances where we have
disabled direct database access for security reasons).

I'll make sure we raise this issue again during our meetings in Oslo in
August. Halvdan might find himself voted down on this one, we'll see ;-)

Regards
Calle

On 30 June 2016 at 17:05, Halvdan Hoem Grelland <halvdan@xxxxxxxxx> wrote:

> I've investigated and retract that this is a bug. I guess I
> misunderstood at first, but reading your report again makes it clear
> that you are actually supplying a GML file with coordinates for all
> orgunits, including those which already have coordinates.
>
> As you are, in fact, importing coordinates for all of these orgunits,
> it's technically not wrong that the lastUpdated is also set to reflect
> this. We don't really discern between old vs. new value of metadata when
> updating on import.
>
> If you wish to avoid this you can give the importer a GML file which
> only contains the orgunits you actually wish to update.
>
> Setting this one to won't fix.
>
> ** Changed in: dhis2
>        Status: Confirmed => Invalid
>
> --
> You received this bug notification because you are a member of DHIS 2
> developers, which is subscribed to DHIS.
> https://bugs.launchpad.net/bugs/1597724
>
> Title:
>   Import of GML file update lastupdated date of all objects with co-
>   ordinates
>
> Status in DHIS:
>   Invalid
>
> Bug description:
>   Importing of a GML file updates the "lastupdated" date of all
>   organisationunit with GIS co-ordinates to the date of the import.
>
>   Not sure if this is by design but it creates a problem for facility
>   registries in that it appears that all organisationunits with co-
>   ordinates had some field updated every time a GML file is imported
>   when in fact only some of the objects had updates.
>
>   Ideally the import should only update new/real updates and for those
>   update the GIS co-ordinates instead of overwriting all co-ordinates
>   and seeing them as updates.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/dhis2/+bug/1597724/+subscriptions
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-devs
> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-devs
> More help   : https://help.launchpad.net/ListHelp
>



-- 

*******************************************

Calle Hedberg

46D Alma Road, 7700 Rosebank, SOUTH AFRICA

Tel/fax (home): +27-21-685-6472

Cell: +27-82-853-5352

Iridium SatPhone: +8816-315-19119

Email: calle.hedberg@xxxxxxxxx

Skype: calle_hedberg

*******************************************
