dhis2-devs team mailing list archive

Re: Persisting data validation results

 

Stian,

We will write a JIRA issue for this - but let me just point out immediately:

1. Normal users cannot be expected to understand/use the API, so without a
UI to do some kind of validation reporting, this feature is much less
useful than it could be.

2. The idea of NOT storing repeats of the same violations might be a
problem, mainly because users are often particularly interested in repeat
violations (read: which validation violations have NOT been fixed during
the last 1, 2 or 3 monthly validation runs? Bring out the tar and
feathers.....)

I for one would like to see a dashboard item listing
- Number of violations detected during previous validation run
- Number of violations gone (issue fixed since last run)
- Number of last run violations remaining (not fixed)
- Number of violations detected >5 times
- Number of violations detected 3-5 times
- Number of violations detected 1-2 times
etc
Another typical report would be a ranking list for data capturers - either
highest number of violations (worst performance) or lowest number (best
performance), preferably only using data value records NOT marked for
follow-up.
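
As a rough illustration of the dashboard counts listed above, the following
Python sketch derives them from a history of validation runs. It assumes
each run is available as a set of violation keys, that "gone" and
"remaining" compare the latest run against the one before it, and that the
repeat buckets are counted over violations still present in the latest run.
The function name and the data shape are hypothetical, not part of DHIS2.

```python
from collections import Counter


def dashboard_summary(runs: list) -> dict:
    """Summarise a chronological list of runs (each a set of violation keys)."""
    if len(runs) < 2:
        raise ValueError("need at least two runs to compare")
    previous, latest = runs[-2], runs[-1]
    # Number of runs in which each violation was detected.
    times_detected = Counter(v for run in runs for v in run)
    # Detection counts for violations still present in the latest run.
    open_counts = [times_detected[v] for v in latest]
    return {
        "detected_previous_run": len(previous),
        "gone_since_previous_run": len(previous - latest),
        "remaining_from_previous_run": len(previous & latest),
        "detected_more_than_5_times": sum(1 for n in open_counts if n > 5),
        "detected_3_to_5_times": sum(1 for n in open_counts if 3 <= n <= 5),
        "detected_1_to_2_times": sum(1 for n in open_counts if n <= 2),
    }


# Three monthly runs; the letters stand in for violation keys.
history = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "d", "e"}]
summary = dashboard_summary(history)
```

Note that counts like these need a record of every detection per run, which
is exactly what is lost if repeats of the same violation are not stored.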

Regards
Calle




On 2 February 2018 at 20:41, Stian Sandvold <stian@xxxxxxxxx> wrote:

> Hi Calle,
>
> I see your points. Although "persist" is a techie term, I think "store" and
> "save" might also cause some confusion in this context, so maybe we should
> add a short description to clarify. It's hard to explain it completely,
> but an example would be:
> "Store any violations found"
> "Storing violations found during the process will allow you to generate
> analytical data based on the violations" (They are also available through
> the API, and storing keeps track of notifications sent if the validation
> rule has notifications. By not storing the violations, notifications will
> be sent again if checked.)
>
> If the documentation is missing or lacking in information, that would be
> my fault. I'll look into it and add/update the documentation to better
> explain the feature.
>
>
> 1. To "persist any non-persisted results" seems to indicate that certain
>> parts of the data validation result ARE persisted even if this is not
>> selected. What are those?
>
>
> When persisting (storing) violations, only violations not already in the
> database (stored from a previous job) will be stored.
>
> That means if you run a job for the same rule, period and orgUnits twice,
> and the first time there are 3 violations which you persist, then the
> second time (after some data entry) you get an additional 2 violations, you
> will actually see that there are 5 violations, but only 3 of them are
> persisted already. The non-persisted results then refer to the 2 new
> violations.
>
> What is not clear, since all results are already persisted when running a
>> scheduled task - does selecting "Persist new result" for custom validation
>> runs simply do the same as what is automatically done for scheduled
>> validation tasks?
>>
>
> Some instances do not run the scheduled job, but would still like to
> persist the violations. This is due to the instance having huge amounts of
> violations, so it would be too big a job to actually run the scheduled
> job, which has some hardcoded parameters. In their case, running only for a
> small subset of their data is the only option.
>
>  2. How can users access the persisted results in the UI?
>
>
> Persisted results are named "ValidationResults", which can be accessed only
> through the API (/api/validationResults). Additionally, the analytics job
> will generate an analytics table based on these results, and this data can
> be accessed through the analytics API. Currently there is no UI to see this
> data.
>
> 3. If stored validation results are only retrievable via the
>> ValidationResults API end-point, you need to pass an id (uid?) to retrieve
>> specific results. Is there any other way to determine specific ids than
>> listing all of them and choose?
>>
> In this case, you need an id (of type integer, not a traditional uid).
> There currently isn't any way to get more specific results, mainly
> because this object was initially only supposed to be used internally.
> However, since there seems to be more demand to look at this information, I
> could improve the endpoint to make it more usable, including adding the
> normal uid as well. If this sounds interesting, it would be greatly
> appreciated if you created a JIRA issue pointing out which changes you
> would like to see, and we could see what we can do with that.
>
> Hope this answers most of your questions.
>
>
> On Fri, Feb 2, 2018 at 12:06 PM, Calle Hedberg <calle.hedberg@xxxxxxxxx>
> wrote:
>
>> Hi
>>
>> Version 2.28 had a new optional feature for Data Validation called
>> "Persist new result". Many users do not understand/know how this is
>> supposed to be used - partially because they are not familiar with IT
>> techie terms like "Persist" ("Store" or "Save" would be more
>> user-friendly), but mainly because the 2.28 release note and the
>> documentation do not really SAY anything:
>>
>> "(Optional) Select *Persist new results* to persist any non-persisted
>> results found during the analysis"
>>
>> Questions:
>>
>> 1. To "persist any non-persisted results" seems to indicate that certain
>> parts of the data validation result ARE persisted even if this is not
>> selected. What are those?
>>
>> Note also that the Dev Manual, chapter 1.24, states that
>>
>> "When running the scheduled validation task, any violations found will be
>> persisted as validation results. These results can be accessed trough the
>> validation result api."
>>
>> What is not clear, since all results are already persisted when running a
>> scheduled task - does selecting "Persist new result" for custom validation
>> runs simply do the same as what is automatically done for scheduled
>> validation tasks?
>>
>>
>> 2. How can users access the persisted results in the UI?
>>
>>
>> 3. If stored validation results are only retrievable via the
>> ValidationResults API end-point, you need to pass an id (uid?) to retrieve
>> specific results. Is there any other way to determine specific ids than
>> listing all of them and choose?
>>
>>
>> Regards
>>
>> Calle
>>
>>
>>
>>
>>
>> *******************************************
>>
>> Calle Hedberg
>>
>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>>
>> Tel/fax (home): +27-21-685-6472
>>
>> Cell: +27-82-853-5352
>>
>> Iridium SatPhone: +8816-315-19119
>>
>> Email: calle.hedberg@xxxxxxxxx
>>
>> Skype: calle_hedberg
>>
>> *******************************************
>>
>>
>>
>>
>
>
> --
> Stian Sandvold
> Software developer, DHIS2
> University of Oslo
> http://www.dhis2.org
>



