← Back to team overview

dhis2-users team mailing list archive

Re: Serious problem with duplicate records in Individual records/tracker

 

Seems like what we also experienced a lot for tLiberia, I think.

Knut

On Fri, Oct 9, 2015 at 11:13 AM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:

> If you experience reasonably high latency (slow response times) it seems
> quite likely you might well get duplicates. Probably related to
>
> https://bugs.launchpad.net/dhis2/+bug/1498505
>
> On 9 October 2015 at 15:40, Wilson, Randy <rwilson@xxxxxxx> wrote:
>
>> I will check but when i remove the duplicates analytics runs. Also
>> usually we get many more duplicates 100 or more each time I check.
>>
>> Randy
>> On Oct 9, 2015 3:48 PM, "Ant Snyman" <ant@xxxxxxxx> wrote:
>>
>>> Hi Randy,
>>> There is actually a constraint on the trackedentitydatavalue table and
>>> you cannot create two entries with the same programstageinstance and
>>> dataelementid. I notice that you are grouping on dataelement name, is it
>>> possible that you could have two dataelements with the same name in your
>>> dataelement table? I.e different dataelementids but the same name?
>>> Regards
>>> Ant
>>>
>>> On 9 October 2015 at 10:22, Wilson, Randy <rwilson@xxxxxxx> wrote:
>>>
>>>> Hi all,
>>>>
>>>> We have an big issue with several of Rwanda's event reporting systems
>>>> that have been set up using Tracker/Individual records.  For some reason,
>>>> users are able to create duplicate records in the system through normal
>>>> data entry.  It seems to happen when they enter a value for an attribute
>>>> and then a couple of seconds or milliseconds later they change that value.
>>>> The system then stores both records with last updated several milliseconds
>>>> apart.
>>>>
>>>> This is a big issue because analytics fails to run when it finds these
>>>> duplicates.
>>>>
>>>> Here is the query that finds the duplicates:
>>>>
>>>> select programstageinstanceid, dataelementname, count(*) as numrecs
>>>> from
>>>> (
>>>> SELECT
>>>>   program.name AS programname,
>>>>   program.description,
>>>>   programinstance.dateofincident,
>>>>   programinstance.trackedentityinstanceid,
>>>>   programinstance.status,
>>>>   programstageinstance.programstageinstanceid,
>>>>   programstageinstance.programinstanceid,
>>>>   programstageinstance.programstageid,
>>>>   trackedentitydatavalue.value,
>>>>   dataelement.name AS dataelementname
>>>> FROM
>>>>   public.programstageinstance,
>>>>   public.programinstance,
>>>>   public.program,
>>>>   public.trackedentitydatavalue,
>>>>   public.dataelement
>>>> WHERE
>>>>   programstageinstance.programinstanceid =
>>>> programinstance.programinstanceid AND
>>>>   program.programid = programinstance.programid AND
>>>>   trackedentitydatavalue.programstageinstanceid =
>>>> programstageinstance.programstageinstanceid AND
>>>>   dataelement.dataelementid = trackedentitydatavalue.dataelementid
>>>>   ) as foo
>>>>   group by programstageinstanceid, dataelementname
>>>>   having count(*) >1
>>>>   order by numrecs desc;
>>>>
>>>> Here is the typical output of the query when there are duplicate
>>>> records:
>>>>
>>>> Programstageinstanceid, dataelementname,numrecs
>>>> 777921;"TB Sample type_morning Sputum";2
>>>> 776886;"TB Date of collection_sample";2
>>>> 773132;"TB History of contact with TPB+ case";2
>>>>
>>>> Here is an example of the duplicates we see when we select the
>>>> duplicated records from the trackedentitydatavalue table.
>>>>
>>>> prograstageinstanceid,
>>>> dataelementid,value,lastupdated,providedelsewhere,user
>>>> 773132;35105;"true";"2015-10-02 09:32:48.334";FALSE;"abibarimana"
>>>> 773132;35105;"true";"2015-10-02 09:32:48.508";FALSE;"abibarimana"
>>>> 776886;31879;"2015-07-17";"2015-10-07 16:25:26.341";FALSE;"mugeni"
>>>> 776886;31879;"2015-07-14";"2015-10-07 16:25:26.546";FALSE;"mugeni"
>>>> 777921;35106;"true";"2015-10-08 09:55:36.133";FALSE;"emufasha"
>>>> 777921;35106;"true";"2015-10-08 09:55:36.278";FALSE;"emufasha"
>>>>
>>>> In order to clean up the trackedentitydatavalue table, I need to write
>>>> these records to a temporary table, delete them
>>>> from trackedentitydatavalue, then decide which of the duplicate pairs to
>>>> delete before copying the correct ones back into trackedentitydatavalues.
>>>>
>>>> This error seems to happen almost daily when the systems are in full
>>>> use.
>>>>
>>>> Is there a way we can put a constraint on the trackedentitydatavalue
>>>> table so that it will not accept duplicate values for
>>>> programstageinstanceid and dataelementid?  I also think there must be an
>>>> error in the java code that should be updating the first saved record with
>>>> the one entered by the user later rather than inserting a new record.
>>>>
>>>> This is a major bug because it causes analytics to fail and when it
>>>> fails we cannot run any event reports to get data out of the system.
>>>>
>>>> --
>>>> *Randy Wilson*
>>>> *Team Leader: **Knowledge Management, Data Use and Research*
>>>> Rwanda Health System Strengthening Activity
>>>> Management Sciences for Health
>>>> Rwanda-Kigali
>>>> Direct: +250 788308835
>>>> E-mail: rwilson@xxxxxxx
>>>> Skype: wilsonrandy_us
>>>> <http://www.msh.org/>
>>>> Stronger health systems. Greater health impact.
>>>> <https://www.facebook.com/ManagementSciencesForHealth>
>>>> <https://twitter.com/MSHHealthImpact>
>>>> <https://www.youtube.com/user/MSHHealthImpact>
>>>> www.msh.org
>>>>
>>>> *This message and its attachments are confidential and solely for the
>>>> intended recipients. If received in error, please delete them and notify
>>>> the sender via reply e-mail immediately.*
>>>> _______________________________________________
>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>> More help   : https://help.launchpad.net/ListHelp
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> *Ant Snyman*
>>>
>>> *Cell: 0824910449*
>>>
>>> *Landline: 028 2713242*
>>>
>>>
>>> Health Information Systems Program - SA
>>>
>>> *This message and any attachments are subject to a disclaimer published
>>> at http://www.hisp.org/policies.html#comms_disclaimer
>>> <http://www.hisp.org/policies.html#comms_disclaimer>.  Please read the
>>> disclaimer before opening any attachment or taking any other action in
>>> terms of this electronic transmission.  If you cannot access the
>>> disclaimer, kindly send an email to disclaimer@xxxxxxxx
>>> <disclaimer@xxxxxxxx> and a copy will be provided to you. By replying to
>>> this e-mail or opening any attachment you agree to be bound by the
>>> provisions of the disclaimer.*
>>>
>>
>> *This message and its attachments are confidential and solely for the
>> intended recipients. If received in error, please delete them and notify
>> the sender via reply e-mail immediately.*
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~dhis2-users
>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~dhis2-users
>> More help   : https://help.launchpad.net/ListHelp
>>
>>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-users
> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-users
> More help   : https://help.launchpad.net/ListHelp
>
>


-- 
Knut Staring
Dept. of Informatics, University of Oslo
Norway: +4791880522
Skype: knutstar
http://dhis2.org

PNG image


Follow ups

References