dhis2-devs team mailing list archive

Re: DHIS2 - Import data from CSV - bug

 

Hi Jason,

Thanks for confirming on the duplicates. I've got the mistake now.

I'm using DHIS2's own CSV importer, where you can set the "Data element ID scheme" and "Org unit ID scheme" under "more options". I used "code" for both. However, for the catoptcombos I used the PK (the internal database ID) instead of the ID (UID), which is probably why DHIS2 didn't recognize them as different entries (invalid catoptcombo or something). Now it works fine.
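A quick pre-import check can catch this kind of mix-up. The sketch below flags rows whose categoryoptioncombo value does not look like a UID. It assumes UIDs are 11 alphanumeric characters starting with a letter, as in the sample identifiers quoted later in this thread; the function name, the sample rows, and the choice to check only that one column are illustrative and not part of DHIS2.

```python
import csv
import io
import re

# Assumption: a DHIS2 UID is 11 alphanumeric characters starting with a letter.
UID_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9]{10}$")


def non_uid_rows(csv_text, column="categoryoptioncombo"):
    """Return (line_number, value) pairs whose `column` value is not UID-shaped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad = []
    for row in reader:
        value = (row.get(column) or "").strip()
        # Empty cells are skipped; only non-empty, non-UID values are reported.
        if value and not UID_PATTERN.match(value):
            bad.append((reader.line_num, value))
    return bad


# Hypothetical sample: the second data row uses an internal PK ("16") by mistake.
SAMPLE = '''"dataelement","period","orgunit","categoryoptioncombo","attributeoptioncombo","value"
"DUSpd8Jq3M7","201202","gP6hn503KUX","Prlt0C1RF0s",,"7"
"DUSpd8Jq3M7","201202","gP6hn503KUX","16",,"10"
'''

if __name__ == "__main__":
    for line_num, value in non_uid_rows(SAMPLE):
        print(f"line {line_num}: {value!r} does not look like a UID")
```

The data element and org unit columns are deliberately not checked here, since with the "code" ID scheme those columns hold codes rather than UIDs.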

Thanks guys!

Robin

From: Jason Pickering [mailto:jason.p.pickering@xxxxxxxxx]
Sent: 16 September 2014 17:28
To: Robin Martens
Cc: Lars Helge Øverland; dhis2-devs@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Dhis2-devs] DHIS2 - Import data from CSV - bug

Hi Robin,

I agree, it seems you have no duplicates here, but I think the spec for the CSV file is different. Are you not using UIDs?

It should look something like this.


"dataelement","period","orgunit","categoryoptioncombo","attributeoptioncombo","value","storedby","timestamp","comment","followup"

"DUSpd8Jq3M7","201202","gP6hn503KUX","Prlt0C1RF0s",,"7","bombali","2010-04-17",,"false"

"DUSpd8Jq3M7","201202","gP6hn503KUX","V6L425pT3A0",,"10","bombali","2010-04-17",,"false"

"DUSpd8Jq3M7","201202","OjTS752GbZE","V6L425pT3A0",,"9","bombali","2010-04-06",,"false"
Direct injection with SQL is of course also possible by transforming this file (which appears to use internal IDs) into insert statements.

Regards,
Jason


On Tue, Sep 16, 2014 at 4:47 PM, Robin Martens <martens@xxxxxxx> wrote:
Hi all,

I understand your comments, but I'm pretty sure there are no duplicates. I've attached the CSV file; please have a look for yourself. Note that I used the "code" setting for both data elements and orgunits.

About Excel: when you select only the columns in which you want to find duplicates (in this case columns A to D), it compares only those columns and ignores the values (in column F). No duplicates were found. Could it be that DHIS2 considers them duplicates based only on the dataelement, period, and orgunit fields (without the catoptcomboid field)?

SQL importing would be an alternative but is much less user-friendly.

Regards,

Robin

From: Jason Pickering [mailto:jason.p.pickering@xxxxxxxxx]
Sent: 16 September 2014 16:31
To: Lars Helge Øverland
Cc: Robin Martens; dhis2-devs@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Dhis2-devs] DHIS2 - Import data from CSV - bug

Hi Robin,

I have seen this happen in Excel before. AFAIK, there is no way to check for discordant duplicates, i.e. rows which have the same orgunit/period/data element/category combo but different values. Excel will not detect these as duplicates if you include the value field as part of the criteria; it will only flag them if the values are also the same. I suspect that in your case you have duplicates whose values are not the same, and thus the import will fail as a single batch but possibly work as multiple batches, depending on whether or not the duplicates happen to fall in different batches.
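Outside Excel, discordant duplicates are easy to find by grouping rows on the key columns only and ignoring the value. The following is a minimal sketch, assuming the CSV has a header row with the column names used in the example format quoted in this thread; the function name and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Columns that identify a data value; the "value" column is deliberately excluded.
KEY_COLUMNS = ("dataelement", "period", "orgunit",
               "categoryoptioncombo", "attributeoptioncombo")


def find_duplicate_keys(csv_text, key_columns=KEY_COLUMNS):
    """Map each key that occurs more than once to the list of values it carries."""
    values_by_key = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = tuple((row.get(col) or "").strip() for col in key_columns)
        values_by_key[key].append(row.get("value"))
    return {key: values for key, values in values_by_key.items() if len(values) > 1}


# Hypothetical sample: the first two rows share a key but carry different values.
SAMPLE = """dataelement,period,orgunit,categoryoptioncombo,attributeoptioncombo,value
DUSpd8Jq3M7,201202,gP6hn503KUX,Prlt0C1RF0s,,7
DUSpd8Jq3M7,201202,gP6hn503KUX,Prlt0C1RF0s,,10
DUSpd8Jq3M7,201202,gP6hn503KUX,V6L425pT3A0,,9
"""

if __name__ == "__main__":
    for key, values in find_duplicate_keys(SAMPLE).items():
        print(key, "appears with values", values)
```

Because the value column is not part of the key, rows that differ only in their value are reported, which is exactly the case a duplicate check that includes the value column misses.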

Regards,
Jason


On Tue, Sep 16, 2014 at 4:08 PM, Lars Helge Øverland <larshelge@xxxxxxxxx> wrote:
Hi Robin,

usually the problem is internal duplicates in the import file like Olav says.

Importing piece by piece is not really proof in this case - if the duplicates are spread across two pieces then the second piece will simply update the record imported in the first instead of crashing.
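The effect described above can be reproduced with a toy model. This is only a sketch using an in-memory SQLite table with a composite primary key, not the actual DHIS2 import code: the table layout, column names, and the `import_batch` helper are all assumptions made for illustration.

```python
import sqlite3


def new_db():
    """Create an in-memory table with a composite primary key (toy schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE datavalue (
        de TEXT, pe TEXT, ou TEXT, coc TEXT, value TEXT,
        PRIMARY KEY (de, pe, ou, coc))""")
    return conn


def import_batch(conn, rows):
    """Toy importer: rows already in the table are updated, new rows are bulk-inserted."""
    inserts = []
    for de, pe, ou, coc, value in rows:
        exists = conn.execute(
            "SELECT 1 FROM datavalue WHERE de=? AND pe=? AND ou=? AND coc=?",
            (de, pe, ou, coc)).fetchone()
        if exists:
            conn.execute(
                "UPDATE datavalue SET value=? WHERE de=? AND pe=? AND ou=? AND coc=?",
                (value, de, pe, ou, coc))
        else:
            inserts.append((de, pe, ou, coc, value))
    # The existence check only sees rows that were flushed earlier, so two new
    # rows with the same key are both queued here and the bulk insert fails.
    conn.executemany("INSERT INTO datavalue VALUES (?, ?, ?, ?, ?)", inserts)


# Two rows with the same key but different values (a discordant duplicate).
ROWS = [("de1", "201202", "ou1", "coc1", "7"),
        ("de1", "201202", "ou1", "coc1", "10")]


def single_batch_fails():
    """Import both rows as one batch; report whether the key constraint was violated."""
    conn = new_db()
    try:
        import_batch(conn, ROWS)
    except sqlite3.IntegrityError:
        return True
    return False


def split_batches_value():
    """Import the rows as two batches; the second silently updates the first."""
    conn = new_db()
    import_batch(conn, ROWS[:1])
    import_batch(conn, ROWS[1:])
    return conn.execute("SELECT value FROM datavalue").fetchone()[0]


if __name__ == "__main__":
    print("single batch fails:", single_batch_fails())
    print("value after split import:", split_batches_value())
```

In this model the split import succeeds, but only because the second piece overwrites the value written by the first, so a successful piece-by-piece import says nothing about whether the file contains duplicates.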

We do imports in bulk SQL batches to improve performance, and hence it's a bit tricky to check for duplicates.

regards,

Lars







On Tue, Sep 16, 2014 at 3:59 PM, Robin Martens <martens@xxxxxxx> wrote:
Hi Olav,

Thanks for the answer but I don't think that's the issue.

First of all, when checking my CSV with Excel's "Duplicate data" functionality, no duplicates are found.
Secondly, when importing the same CSV file piece by piece (per data element-catoptcombo set), I don't have this issue.

Regards,

Robin

From: Olav Poppe [mailto:olav.poppe@xxxxxx]
Sent: 16 September 2014 15:26
To: Robin Martens
Cc: dhis2-devs@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Dhis2-devs] DHIS2 - Import data from CSV - bug

Hi,
I see that your log file actually confirms this:
Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "datavalue_pkey"
  Detail: Key (dataelementid, periodid, sourceid, categoryoptioncomboid, attributeoptioncomboid)=(2443, 2079, 59, 16, 16) already exists.
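To see exactly which record collided, the key columns and values can be pulled out of the `Detail:` line. This is a minimal sketch that assumes the message has the shape shown in the log above; the pattern and the function name are illustrative. Judging by the column names, the extracted values are internal database IDs rather than the identifiers used in the CSV file.

```python
import re

# Matches the "Key (col1, col2, ...)=(val1, val2, ...)" part of the detail message.
DETAIL_PATTERN = re.compile(r"Key \((?P<columns>[^)]*)\)=\((?P<values>[^)]*)\)")


def parse_duplicate_key(detail_line):
    """Return a {column: value} dict for the violated key, or None if no match."""
    match = DETAIL_PATTERN.search(detail_line)
    if match is None:
        return None
    columns = [c.strip() for c in match.group("columns").split(",")]
    values = [v.strip() for v in match.group("values").split(",")]
    return dict(zip(columns, values))


# The detail line from the log quoted in this thread.
DETAIL = ("Detail: Key (dataelementid, periodid, sourceid, categoryoptioncomboid, "
          "attributeoptioncomboid)=(2443, 2079, 59, 16, 16) already exists.")

if __name__ == "__main__":
    for column, value in parse_duplicate_key(DETAIL).items():
        print(f"{column} = {value}")
```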

Olav



On 16 Sep 2014, at 15:23, Olav Poppe <olav.poppe@xxxxxx> wrote:

Hi,
my experience is that when this happens (dry run works, but actual import fails with an error similar to this), it is usually caused by duplicate values in the source file i.e. multiple values with the same orgunit, data element, category and period.

Olav



On 16 Sep 2014, at 14:22, Robin Martens <martens@xxxxxxx> wrote:

Hi devs,

I'm having what seems like a bug with the import module:

When importing data from CSV, I get the following error message and the batch doesn't run. However, this only happens when importing a data element over different categoryoptioncombos and only when it is new data, not updates. So for now I'm getting my import done piece by piece, i.e. per data element/categoryoptioncombo combination.

Also note that the import dry run doesn't detect any issues.

Any idea how to solve this?

Kind regards,

Robin



* INFO  2014-09-16 13:47:11,377 [Level: INFO, category: DATAVALUE_IMPORT, time: Tue Sep 16 13:47:11 CAT 2014, message: Process started] (InMemoryNotifier.java [taskScheduler-6])
* INFO  2014-09-16 13:47:11,469 [Level: INFO, category: DATAVALUE_IMPORT, time: Tue Sep 16 13:47:11 CAT 2014, message: Importing data values] (InMemoryNotifier.java [taskScheduler-6])
* INFO  2014-09-16 13:47:11,470 importing data values (DefaultDataValueSetService.java [taskScheduler-6])
* ERROR 2014-09-16 13:47:12,396 java.lang.RuntimeException: Failed to flush BatchHandler
                at org.amplecode.quick.batchhandler.AbstractBatchHandler.flush(AbstractBatchHandler.java:311)
                at org.hisp.dhis.dxf2.datavalueset.DefaultDataValueSetService.saveDataValueSet(DefaultDataValueSetService.java:613)
                at org.hisp.dhis.dxf2.datavalueset.DefaultDataValueSetService.saveDataValueSetCsv(DefaultDataValueSetService.java:388)
                at org.hisp.dhis.importexport.action.util.ImportDataValueTask.run(ImportDataValueTask.java:78)
                at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:53)
                at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
                at java.util.concurrent.FutureTask.run(FutureTask.java:262)
                at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
                at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                at java.lang.Thread.run(Thread.java:744)
Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "datavalue_pkey"
  Detail: Key (dataelementid, periodid, sourceid, categoryoptioncomboid, attributeoptioncomboid)=(2443, 2079, 59, 16, 16) already exists.
                at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)
                at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1890)
                at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
                at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:560)
                at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:403)
                at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:331)
                at org.amplecode.quick.batchhandler.AbstractBatchHandler.flush(AbstractBatchHandler.java:295)
                ... 11 more
(DefaultDataValueSetService.java [taskScheduler-6])
* INFO  2014-09-16 13:47:12,396 [Level: ERROR, category: DATAVALUE_IMPORT, time: Tue Sep 16 13:47:12 CAT 2014, message: Process failed: Failed to flush BatchHandler] (InMemoryNotifier.java [taskScheduler-6])
* ERROR 2014-09-16 13:47:15,261 Left side ($summary.conflicts.size()) of '>' operation has null value at /dhis-web-importexport/importSummary.vm[line 35, column 33] (Log4JLogChute.java [http-bio-8080-exec-2])
_______________________________________________
Mailing list: https://launchpad.net/~dhis2-devs
Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~dhis2-devs
More help   : https://help.launchpad.net/ListHelp




--
Jason P. Pickering
email: jason.p.pickering@xxxxxxxxx
tel:+46764147049



