dhis2-devs team mailing list archive

Re: Need a group for UNIQUE attributes for deduplication

 

Sorry, I said "(which can also be defined)", but I meant "which can also be
gradually REFINED in future releases"

On Mon, Mar 16, 2015 at 2:13 PM, Knut Staring <knutst@xxxxxxxxx> wrote:

> With Tracker, there is a high probability of getting duplicates (could be
> exact duplicates, or misspellings of name for example).
>
> To deal with this, it would be good to be able to designate SOME of the
> attributes of each person (or rather trackedentityinstance) as the ones
> really identifying a person or thing, e.g. Firstname, Lastname, Age,
> Address. So we need a way to designate a subset of all the attributes as
> input for a deduplication process, which could start by just finding exact
> matches, and subsequently be refined by introducing different kinds of
> fuzzy logic etc.
>
> And then later, we could build a GUI for human review and merger of clear
> duplicates (which can also be defined). But I suppose we initially need an
> addition to the model. So this is like the UNIQUENESS property, but not for
> just ONE attribute, but rather for a group/collection of attributes.
>
> So, it will be similar to a compound key in SQL:
> http://en.wikipedia.org/wiki/Compound_key
>
> Knut
> --
> Knut Staring
> Dept. of Informatics, University of Oslo
> Norway: +4791880522
> Skype: knutstar
> http://dhis2.org
>
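
To make the first step concrete, here is a rough sketch of exact-match
deduplication over a designated subset of attributes. It is only an
illustration, not how DHIS2 does or would implement it: the attribute names,
the dict-based record layout, and the functions dedup_key and
find_exact_duplicates are all hypothetical. It also assumes that ignoring
case and surrounding whitespace still counts as an "exact" match.

```python
from collections import defaultdict

# Hypothetical: the subset of attributes designated as identifying a person.
DEDUP_ATTRIBUTES = ("firstname", "lastname", "age", "address")


def dedup_key(instance, attributes=DEDUP_ATTRIBUTES):
    """Build a compound key from the designated attributes only.

    Values are stringified, trimmed and lower-cased (an assumption), and a
    missing attribute is treated as an empty string.
    """
    return tuple(str(instance.get(name, "")).strip().lower() for name in attributes)


def find_exact_duplicates(instances, attributes=DEDUP_ATTRIBUTES):
    """Return lists of ids whose designated attributes all match."""
    groups = defaultdict(list)
    for instance in instances:
        groups[dedup_key(instance, attributes)].append(instance["id"])
    # Only groups with more than one member are duplicate candidates.
    return [ids for ids in groups.values() if len(ids) > 1]
```

Attributes outside the designated group (a phone number, say) do not affect
the key, which is the compound-key behaviour described above. A fuzzy
comparison could later replace the equality test implied by the grouping.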



-- 
Knut Staring
Dept. of Informatics, University of Oslo
Norway: +4791880522
Skype: knutstar
http://dhis2.org
