dhis2-devs team mailing list archive

Re: DHIS2 - Import data from CSV - bug

 

Hi Robin,

I agree, it seems you have no duplicates here, but I think the spec for the
CSV file is different. Are you not using UIDs?

It should look something like this:

"dataelement","period","orgunit","categoryoptioncombo","attributeoptioncombo","value","storedby","timestamp","comment","followup"
"DUSpd8Jq3M7","201202","gP6hn503KUX","Prlt0C1RF0s",,"7","bombali","2010-04-17",,"false"
"DUSpd8Jq3M7","201202","gP6hn503KUX","V6L425pT3A0",,"10","bombali","2010-04-17",,"false"
"DUSpd8Jq3M7","201202","OjTS752GbZE","V6L425pT3A0",,"9","bombali","2010-04-06",,"false"
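As an aside, a quick way to check a file like this for the kind of duplicates
discussed in this thread (rows sharing the same key columns, regardless of
value) is a short script along these lines. This is just a sketch, not part of
DHIS2; it assumes the header row above and a "value" column:

```python
import csv
from collections import defaultdict

# The columns that make up the unique key on data values
KEY_COLUMNS = ("dataelement", "period", "orgunit",
               "categoryoptioncombo", "attributeoptioncombo")

def find_duplicate_keys(path):
    """Group rows by the composite key and report keys seen more than once.

    This also catches 'discordant' duplicates: same key, different value,
    which spreadsheet duplicate checks tend to miss when the value column
    is included in the criteria.
    """
    rows_by_key = defaultdict(list)
    with open(path, newline="") as f:
        # start=2: line 1 is the header row
        for line_no, row in enumerate(csv.DictReader(f), start=2):
            key = tuple(row[c] for c in KEY_COLUMNS)
            rows_by_key[key].append((line_no, row["value"]))
    return {key: hits for key, hits in rows_by_key.items() if len(hits) > 1}
```

Running this before an import would flag exactly the rows that trip the
database's unique constraint, with their line numbers and conflicting values.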

Direct injection with SQL is of course also possible by transforming this
file (which appears to use internal IDs) into insert statements.
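To illustrate that transformation: something along these lines would do it.
The table and column names below merely mirror the datavalue_pkey constraint
quoted in the error log further down; they are an assumption, so verify them
against the actual schema, and in practice prefer parameterized statements
over string formatting:

```python
import csv

# Assumed table/column names, taken from the constraint in the error log;
# check against the real schema before use.
INSERT_TEMPLATE = (
    "INSERT INTO datavalue (dataelementid, periodid, sourceid, "
    "categoryoptioncomboid, attributeoptioncomboid, value) "
    "VALUES ({0}, {1}, {2}, {3}, {4}, '{5}');"
)

def csv_to_inserts(path):
    """Turn each row of an internal-ID CSV into an INSERT statement."""
    with open(path, newline="") as f:
        for row in csv.reader(f):
            yield INSERT_TEMPLATE.format(*row[:6])
```

Note this only works once codes/UIDs have been resolved to internal IDs, and
it will hit the same unique constraint unless duplicates are removed first.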

Regards,
Jason


On Tue, Sep 16, 2014 at 4:47 PM, Robin Martens <martens@xxxxxxx> wrote:

>  Hi all,
>
>
>
> I understand your comments; however, I'm pretty sure there are no
> duplicates. I have attached the CSV file, please have a look for yourself.
> Note that I used the "code" setting for both data elements and orgunits.
>
>
>
> About Excel: when you select only the columns in which you want to find
> duplicates (in this case columns A to D), it compares only those values,
> ignoring the value field (column F). No duplicates found. Could it be that
> DHIS2 considers them duplicates based only on the dataelement, period, and
> orgunit fields (without the catoptcomboid field)?
>
>
>
> SQL importing would be an alternative but is much less user-friendly.
>
>
>
> Regards,
>
>
>
> Robin
>
>
>
> *From:* Jason Pickering [mailto:jason.p.pickering@xxxxxxxxx]
> *Sent:* 16 September 2014 16:31
> *To:* Lars Helge Øverland
> *Cc:* Robin Martens; dhis2-devs@xxxxxxxxxxxxxxxxxxx
>
> *Subject:* Re: [Dhis2-devs] DHIS2 - Import data from CSV - bug
>
>
>
> Hi Robin,
>
>
>
> I have seen this happen in Excel before. AFAIK, there is no way to check
> for discordant duplicates, which have the same orgunit/period/data
> element/category combo but different values. Excel will not detect these as
> duplicates if you include the value field as part of the criteria; its
> check only catches rows where the values are identical too. I suspect in
> your case you have values which are not the same, and thus the import will
> fail as a single batch but possibly work as multiple batches, depending on
> whether or not the duplicates happen to be part of different batches.
>
>
>
> Regards,
>
> Jason
>
>
>
>
>
> On Tue, Sep 16, 2014 at 4:08 PM, Lars Helge Øverland <larshelge@xxxxxxxxx>
> wrote:
>
> Hi Robin,
>
>
>
> usually the problem is internal duplicates in the import file like Olav
> says.
>
>
>
> Importing piece by piece is not really proof in this case - if the
> duplicates are spread across two pieces then the second piece will simply
> update the record imported in the first instead of crashing.
>
>
>
> We do imports in SQL batches to improve performance, and hence it's a bit
> tricky to check for duplicates.
>
>
>
> regards,
>
>
>
> Lars
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Tue, Sep 16, 2014 at 3:59 PM, Robin Martens <martens@xxxxxxx> wrote:
>
> Hi Olav,
>
>
>
> Thanks for the answer but I don't think that's the issue.
>
>
>
> First of all, when checking my CSV with Excel's "Duplicate data"
> functionality, no duplicates are found.
>
> Secondly, when importing the same CSV file but piece by piece (per data
> element-catoptcombo set) I don't have this issue.
>
>
>
> Regards,
>
>
>
> Robin
>
>
>
> *From:* Olav Poppe [mailto:olav.poppe@xxxxxx]
> *Sent:* 16 September 2014 15:26
> *To:* Robin Martens
> *Cc:* dhis2-devs@xxxxxxxxxxxxxxxxxxx
> *Subject:* Re: [Dhis2-devs] DHIS2 - Import data from CSV - bug
>
>
>
> Hi,
>
> I see that your log file actually confirms this:
>
> Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value
> violates unique constraint "datavalue_pkey"
>
>   Detail: Key (dataelementid, periodid, sourceid, categoryoptioncomboid,
> attributeoptioncomboid)=(2443, 2079, 59, 16, 16) already exists.
>
>
>
> Olav
>
>
>
>
>
>
>
> On 16 Sep 2014, at 15:23, Olav Poppe <olav.poppe@xxxxxx> wrote:
>
>
>
> Hi,
>
> my experience is that when this happens (dry run works, but the actual
> import fails with an error similar to this), it is *usually* caused by
> duplicate values in the source file, i.e. multiple values with the same
> orgunit, data element, category, and period.
>
>
>
> Olav
>
>
>
>
>
>
>
> On 16 Sep 2014, at 14:22, Robin Martens <martens@xxxxxxx> wrote:
>
>
>
> Hi devs,
>
>
>
> I'm having what seems like a bug with the import module:
>
>
>
> When importing data from CSV, I get the error message below and the batch
> doesn't run. However, this only happens when importing a data element
> across different categoryoptioncombos, and only when it is new data, not
> updates. In other words, I can get my import done piece by piece, i.e. per
> data element/categoryoptioncombo combination.
>
>
>
> Also note that the import dry run doesn't detect any issues.
>
>
>
> Any idea how to solve this?
>
>
>
> Kind regards,
>
>
>
> Robin
>
>
>
>
>
>
>
> * INFO  2014-09-16 13:47:11,377 [Level: INFO, category: DATAVALUE_IMPORT,
> time: Tue Sep 16 13:47:11 CAT 2014, message: Process started]
> (InMemoryNotifier.java [taskScheduler-6])
>
> * INFO  2014-09-16 13:47:11,469 [Level: INFO, category: DATAVALUE_IMPORT,
> time: Tue Sep 16 13:47:11 CAT 2014, message: Importing data values]
> (InMemoryNotifier.java [taskScheduler-6])
>
> * INFO  2014-09-16 13:47:11,470 importing data values
> (DefaultDataValueSetService.java [taskScheduler-6])
>
> * ERROR 2014-09-16 13:47:12,396 java.lang.RuntimeException: Failed to
> flush BatchHandler
>
>                 at
> org.amplecode.quick.batchhandler.AbstractBatchHandler.flush(AbstractBatchHandler.java:311)
>
>                 at
> org.hisp.dhis.dxf2.datavalueset.DefaultDataValueSetService.saveDataValueSet(DefaultDataValueSetService.java:613)
>
>                 at
> org.hisp.dhis.dxf2.datavalueset.DefaultDataValueSetService.saveDataValueSetCsv(DefaultDataValueSetService.java:388)
>
>                 at
> org.hisp.dhis.importexport.action.util.ImportDataValueTask.run(ImportDataValueTask.java:78)
>
>                 at
> org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:53)
>
>                 at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>
>                 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>
>                 at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>
>                 at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>
>                 at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
>                 at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
>                 at java.lang.Thread.run(Thread.java:744)
>
> Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value
> violates unique constraint "datavalue_pkey"
>
>   Detail: Key (dataelementid, periodid, sourceid, categoryoptioncomboid,
> attributeoptioncomboid)=(2443, 2079, 59, 16, 16) already exists.
>
>                 at
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)
>
>                 at
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1890)
>
>                 at
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
>
>                 at
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:560)
>
>                 at
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:403)
>
>                 at
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:331)
>
>                 at
> org.amplecode.quick.batchhandler.AbstractBatchHandler.flush(AbstractBatchHandler.java:295)
>
>                 ... 11 more
>
> (DefaultDataValueSetService.java [taskScheduler-6])
>
> * INFO  2014-09-16 13:47:12,396 [Level: ERROR, category: DATAVALUE_IMPORT,
> time: Tue Sep 16 13:47:12 CAT 2014, message: Process failed: Failed to
> flush BatchHandler] (InMemoryNotifier.java [taskScheduler-6])
>
> * ERROR 2014-09-16 13:47:15,261 Left side ($summary.conflicts.size()) of
> '>' operation has null value at
> /dhis-web-importexport/importSummary.vm[line 35, column 33]
> (Log4JLogChute.java [http-bio-8080-exec-2])
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-devs
> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-devs
> More help   : https://help.launchpad.net/ListHelp
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>



-- 
Jason P. Pickering
email: jason.p.pickering@xxxxxxxxx
tel:+46764147049
