dhis2-devs team mailing list archive

Re: DHIS2 - Import data from CSV - bug

 

Hi Robin,

I have seen this happen in Excel before. AFAIK, there is no way to check
for discordant duplicates, i.e. rows which have the same orgunit/period/data
element/category combo but different values. Excel will not detect these as
duplicates if you include the value field as part of the criteria.
That only works if the values are the same. I suspect in your case, you
have duplicates whose values are not the same, and thus the import will fail
as a single batch but possibly work as multiple batches, depending on whether
or not the duplicates happen to be part of different batches.
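
One way to look for such rows outside Excel is to group the file on the key
columns only and ignore the value. The following is a minimal sketch, not
anything shipped with DHIS2. It assumes the CSV has a header row with columns
named dataelement, period, orgunit, categoryoptioncombo and value; those
names, and the function names, are assumptions for illustration, so adjust
them to your file (and add the attribute option combo column to the key if
your file has one).

```python
import csv
import io
from collections import defaultdict

# Assumed column names; change these to match the header row of your file.
KEY_COLUMNS = ("dataelement", "period", "orgunit", "categoryoptioncombo")


def find_duplicate_keys(csv_file, key_columns=KEY_COLUMNS, value_column="value"):
    """Return {key: [(line_number, value), ...]} for keys that occur more than once."""
    rows_by_key = defaultdict(list)
    reader = csv.DictReader(csv_file)
    for row in reader:
        # Build the key from the identifying columns only, never the value.
        key = tuple(row[col].strip() for col in key_columns)
        rows_by_key[key].append((reader.line_num, row[value_column].strip()))
    return {key: rows for key, rows in rows_by_key.items() if len(rows) > 1}


def is_discordant(rows):
    """True when a duplicated key carries more than one distinct value."""
    return len({value for _, value in rows}) > 1


if __name__ == "__main__":
    # Tiny made-up sample: the first and third data rows share a key.
    sample = io.StringIO(
        "dataelement,period,orgunit,categoryoptioncombo,value\n"
        "de1,201408,ou1,coc1,10\n"
        "de1,201408,ou1,coc2,12\n"
        "de1,201408,ou1,coc1,11\n"
    )
    for key, rows in find_duplicate_keys(sample).items():
        kind = "discordant" if is_discordant(rows) else "identical"
        print(key, kind, rows)
```

To run it on a real file, pass an open file object instead of the sample,
e.g. open("data.csv", newline=""). Any key it reports appears more than once
in the file, whether or not the values agree.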

Regards,
Jason


On Tue, Sep 16, 2014 at 4:08 PM, Lars Helge Øverland <larshelge@xxxxxxxxx>
wrote:

> Hi Robin,
>
> usually the problem is internal duplicates in the import file like Olav
> says.
>
> Importing piece by piece is not really proof in this case - if the
> duplicates are spread across two pieces then the second piece will simply
> update the record imported in the first instead of crashing.
>
> We do imports in SQL bulks to improve performance and hence it's a bit
> tricky to check for duplicates.
>
> regards,
>
> Lars
>
>
>
>
>
>
>
> On Tue, Sep 16, 2014 at 3:59 PM, Robin Martens <martens@xxxxxxx> wrote:
>
>>  Hi Olav,
>>
>>
>>
>> Thanks for the answer but I don't think that's the issue.
>>
>>
>>
>> First of all, when checking my CSV with Excel's "Duplicate data"
>> functionality, no duplicates are found.
>>
>> Secondly, when importing the same CSV file but piece by piece (per data
>> element-catoptcombo set) I don't have this issue.
>>
>>
>>
>> Regards,
>>
>>
>>
>> Robin
>>
>>
>>
>> *From:* Olav Poppe [mailto:olav.poppe@xxxxxx]
>> *Sent:* 16 September 2014 15:26
>> *To:* Robin Martens
>> *Cc:* dhis2-devs@xxxxxxxxxxxxxxxxxxx
>> *Subject:* Re: [Dhis2-devs] DHIS2 - Import data from CSV - bug
>>
>>
>>
>> Hi,
>>
>> I see that your log file actually confirms this:
>>
>> Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value
>> violates unique constraint "datavalue_pkey"
>>
>>   Detail: Key (dataelementid, periodid, sourceid, categoryoptioncomboid,
>> attributeoptioncomboid)=(2443, 2079, 59, 16, 16) already exists.
>>
>>
>>
>> Olav
>>
>>
>>
>>
>>
>>
>>
>>  On 16 Sep 2014, at 15:23, Olav Poppe <olav.poppe@xxxxxx> wrote:
>>
>>
>>
>> Hi,
>>
>> my experience is that when this happens (dry run works, but actual import
>> fails with an error similar to this), it is *usually* caused by
>> duplicate values in the source file i.e. multiple values with the same
>> orgunit, data element, category and period.
>>
>>
>>
>> Olav
>>
>>
>>
>>
>>
>>
>>
>>  On 16 Sep 2014, at 14:22, Robin Martens <martens@xxxxxxx> wrote:
>>
>>
>>
>> Hi devs,
>>
>>
>>
>> I'm having what seems like a bug with the import module:
>>
>>
>>
>> When importing data from CSV, I get the following error message and the
>> batch doesn't run. However, this only happens when importing a data element
>> over different categoryoptioncombos and only when it is new data, not
>> updates. In other words, I'm getting my import done piece by piece, i.e.
>> per data element/categoryoptioncombo combination.
>>
>>
>>
>> Also note that the import dry run doesn't detect any issues.
>>
>>
>>
>> Any idea how to solve this?
>>
>>
>>
>> Kind regards,
>>
>>
>>
>> Robin
>>
>>
>>
>>
>>
>>
>>
>> * INFO  2014-09-16 13:47:11,377 [Level: INFO, category: DATAVALUE_IMPORT,
>> time: Tue Sep 16 13:47:11 CAT 2014, message: Process started]
>> (InMemoryNotifier.java [taskScheduler-6])
>>
>> * INFO  2014-09-16 13:47:11,469 [Level: INFO, category: DATAVALUE_IMPORT,
>> time: Tue Sep 16 13:47:11 CAT 2014, message: Importing data values]
>> (InMemoryNotifier.java [taskScheduler-6])
>>
>> * INFO  2014-09-16 13:47:11,470 importing data values
>> (DefaultDataValueSetService.java [taskScheduler-6])
>>
>> * ERROR 2014-09-16 13:47:12,396 java.lang.RuntimeException: Failed to
>> flush BatchHandler
>>
>>                 at
>> org.amplecode.quick.batchhandler.AbstractBatchHandler.flush(AbstractBatchHandler.java:311)
>>
>>                 at
>> org.hisp.dhis.dxf2.datavalueset.DefaultDataValueSetService.saveDataValueSet(DefaultDataValueSetService.java:613)
>>
>>                 at
>> org.hisp.dhis.dxf2.datavalueset.DefaultDataValueSetService.saveDataValueSetCsv(DefaultDataValueSetService.java:388)
>>
>>                 at
>> org.hisp.dhis.importexport.action.util.ImportDataValueTask.run(ImportDataValueTask.java:78)
>>
>>                 at
>> org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:53)
>>
>>                 at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>
>>                 at
>> java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>
>>                 at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>>
>>                 at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>>
>>                 at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>
>>                 at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>
>>                 at java.lang.Thread.run(Thread.java:744)
>>
>> Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value
>> violates unique constraint "datavalue_pkey"
>>
>>   Detail: Key (dataelementid, periodid, sourceid, categoryoptioncomboid,
>> attributeoptioncomboid)=(2443, 2079, 59, 16, 16) already exists.
>>
>>                 at
>> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)
>>
>>                 at
>> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1890)
>>
>>                 at
>> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
>>
>>                 at
>> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:560)
>>
>>                 at
>> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:403)
>>
>>                 at
>> org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:331)
>>
>>                 at
>> org.amplecode.quick.batchhandler.AbstractBatchHandler.flush(AbstractBatchHandler.java:295)
>>
>>                 ... 11 more
>>
>> (DefaultDataValueSetService.java [taskScheduler-6])
>>
>> * INFO  2014-09-16 13:47:12,396 [Level: ERROR, category:
>> DATAVALUE_IMPORT, time: Tue Sep 16 13:47:12 CAT 2014, message: Process
>> failed: Failed to flush BatchHandler] (InMemoryNotifier.java
>> [taskScheduler-6])
>>
>> * ERROR 2014-09-16 13:47:15,261 Left side ($summary.conflicts.size()) of
>> '>' operation has null value at
>> /dhis-web-importexport/importSummary.vm[line 35, column 33]
>> (Log4JLogChute.java [http-bio-8080-exec-2])
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~dhis2-devs
>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~dhis2-devs
>> More help   : https://help.launchpad.net/ListHelp
>>
>
>


-- 
Jason P. Pickering
email: jason.p.pickering@xxxxxxxxx
tel:+46764147049
