← Back to team overview

dhis2-users team mailing list archive

Re: Retrieving Data of type value:- Date , text and Yes or No in DHIS2

 

Hi Murod,
Just trying to keep it focused on the original topic, but it seems you want
to persist.  The initial part of this post was about getting out text data
(read the title).  In this case, there is no need for any casting or
aggregation. Period.

That regex  you post again unfortunately will also not work in many cases ,
especially if you have integers stored are decimals or negative integers
which might be perfectly valid. Please refer to the source code for a more
appropriate one. The data which I mention is not "dummy data".  We have
lots of problems with this type of data here in Zambia as well as other
countries, because most of our data is submitted through mobiles. The trunk
mobile clients do not have any validation, so getting data like "09" and
"9.0" for an integer is quite common.  We have added some regex validation
to the J2ME client to try and prevent most of this junk for getting in, but
with the new Light and Smartphone clients, it does not always work.
Normally, we run other scripts to flag these values for "follow-up"
automatically, to try and resolve what the number is, but it can take time.

 In your case, you might think that this regex suits your needs, and would
want to reject all decimals and negative numbers. That is up to you. But in
our case, we attempt to salvage as much data as possible. Whether that is
the right approach or not, well, maybe we can start a new thread for this.
 :)

Best regards,
Jason






On Fri, Nov 29, 2013 at 5:13 PM, Murod Latifov <mlatifov@xxxxxxxxx> wrote:

> Hi Jason,
>
> I thought my points were of some help to solve the issue and discuss those
> appearing from initial issue. What you thought about first call is not
> right and what I proposed was a solution. After you added dummy data being
> stored, where new topic was introduced (off initial topic). And now you are
> saying this kind of bureaucratic on topic/off topic things.
> With regards to regexp you are not bound with what I demonstrated, you can
> with the same ease, please use " and dv.value~E'^[1-9][0-9]?$'" instead if
> the case is so worth, no probs.
>
> best,
> murod
>
>
> On Fri, Nov 29, 2013 at 7:58 PM, Jason Pickering <
> jason.p.pickering@xxxxxxxxx> wrote:
>
>> Hi Murod,
>>
>> I do not think anyone is happy about it, and thus my blueprint to provide
>> server-side regex validation of data, which has yet to be implemented.
>> However, the fact of the matter is, it can happen now, and having some data
>> is usually better than having no data at all.  That regex which you mention
>> there will also not work in all cases, because things like "00" are quite
>> common and cannot be cast, thus the need for the "heavier" regex which is
>> used by the analytics process. Such values would pass the regex because it
>> is all digits, but will fail the cast.  You may not be happy about it, but
>> it is usually necessary to be sure the values which should be numeric, can
>> actually be cast to an integer.
>>
>> Again, seemingly off-topic from the original post?
>>
>> Regards,
>> Jason
>>
>>
>>
>>
>>
>> On Fri, Nov 29, 2013 at 4:51 PM, Murod Latifov <mlatifov@xxxxxxxxx>wrote:
>>
>>> Hi Jason,
>>>
>>> By telling these you mean data stored in DHIS2 at some implementations
>>> is not trusted, that is not cleaned enough. How can one use these data for
>>> analysis? Does it bring TRUE result for decision making? I don't think such
>>> data should exist. At least some Data management action could clean data
>>> against data element declarations and value stored. After, there shouldn't
>>> be such worries.
>>> You can use regexp within SQL command to:
>>> select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast
>>> (value as int)) as val, dv.sourceid as orgunit, aggregationtype from
>>> datavalue dv
>>>  left join dataelement de
>>> on dv.dataelementid = de.dataelementid
>>> WHERE aggregationtype = 'sum' and domaintype='aggregate' and
>>> dv.value~E'^\\d+$'
>>> group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid,
>>> aggregationtype
>>>
>>> Please note this addition  and dv.value~E'^\\d+$', which makes SQL run
>>> failure safe.
>>> But in overall am not happy with such data being stored without
>>> cleansing after import.
>>>
>>> best,
>>> murod
>>>
>>>
>>> On Fri, Nov 29, 2013 at 7:10 PM, Jason Pickering <
>>> jason.p.pickering@xxxxxxxxx> wrote:
>>>
>>>> Hi Murod,
>>>> You would think this would work, and it might for you. The problem is
>>>> there is no way to "trust" that the raw data values can be cast. We have
>>>> had a lot of problems with this, but mostly because of 1) Legacy data
>>>> before the different value types were implemented and 2) data which is
>>>> received from mobile clients, which do not have the same restrictions as
>>>> from the front-end. So, by relying on the aggregationtype, you must really
>>>> know your data is clean, otherwise, it only takes one bad value for the
>>>> aggregation with SQL not to work. You can see the way the analytics gets
>>>> around this, by applying a regex. Whether it is "heavy" or not, I guess
>>>> really depends on your reference frame.
>>>>
>>>> However, getting back to the top of the thread, I thought Felix's
>>>> original issue was not being able to data which was stored as a date or
>>>> text. My original point was that the casting and summing of this type of
>>>> data is simply not possible with these simple SQL statements. But maybe I
>>>> misunderstood Felx's original issue.
>>>>
>>>> Regards,
>>>> Jason
>>>>
>>>>
>>>>
>>>> On Fri, Nov 29, 2013 at 3:15 PM, Murod Latifov <mlatifov@xxxxxxxxx>wrote:
>>>>
>>>>> Hi Jason,
>>>>>
>>>>> Yes, it holds true in case if raw data being called. But query was
>>>>> with aggregation elements. Technique I proposed will exactly remove none
>>>>> convertible values from the query, thus there won't be cast exception. No
>>>>> need to deploy heavy regex or stored procedure here, this is doable. Sample
>>>>> code here:
>>>>>
>>>>> select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast
>>>>> (value as int)) as val, dv.sourceid as orgunit, aggregationtype from
>>>>> datavalue dv
>>>>> left join dataelement de
>>>>> on dv.dataelementid = de.dataelementid
>>>>> WHERE aggregationtype = 'sum' and domaintype='aggregate'
>>>>> group by dv.dataelementid, dv.sourceid, categoryoptioncomboid,
>>>>> periodid, aggregationtype
>>>>>
>>>>> This should give an idea on how to manage avoiding these exceptions
>>>>> just with plain SQL.
>>>>>
>>>>> regards,
>>>>> murod
>>>>>
>>>>>
>>>>> On Fri, Nov 29, 2013 at 5:46 PM, Jason Pickering <
>>>>> jason.p.pickering@xxxxxxxxx> wrote:
>>>>>
>>>>>> Well, it is very hard to see what is going on without the entire
>>>>>> query. My point here is there is no way to aggregate text, without some
>>>>>> sort of procedure, so you cannot cast anything, nor can you sum it, because
>>>>>> in the datavalue table, everything is stored as text. Just like with the
>>>>>> analytics, you must first filter out all values which you think can be cast
>>>>>> to a double with a regex, and then perform the sum/average/other
>>>>>> aggregation operator.  If you just want to get the raw data values back,
>>>>>> then there should be no aggregation anyway, thus, no need for the sum
>>>>>> operator.
>>>>>>
>>>>>> Regards,
>>>>>> Jason
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 29, 2013 at 2:39 PM, Murod Latifov <mlatifov@xxxxxxxxx>wrote:
>>>>>>
>>>>>>> I am afraid Jason, it won't work that way. In Postgres aggregating
>>>>>>> (SUM, AVG) for varchar will give exception. There is no onbuilt type
>>>>>>> casting at this point.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Nov 29, 2013 at 5:09 PM, Jason Pickering <
>>>>>>> jason.p.pickering@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> You cannot cast text to a double so just remove that cast operator
>>>>>>>> and you should be fine.
>>>>>>>>
>>>>>>>> --Sent from my mobile
>>>>>>>> On Nov 29, 2013 1:16 PM, "Felix Obareh" <fobareh@xxxxxxxxxxxxxx>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thank Jason,
>>>>>>>>> I have decided to go with the SQL Query. I am using the SQL Query
>>>>>>>>> example i the user manual together with the demo database for Sierra Leone.
>>>>>>>>>
>>>>>>>>> It is working fine for elements of integer type and i can run the
>>>>>>>>> query from my pgAdmin and i get them fine. I have a problem with displaying
>>>>>>>>> data elements of text and varchar type.
>>>>>>>>>
>>>>>>>>> Here is what am running as querry:-
>>>>>>>>>
>>>>>>>>> ------Trancated--------
>>>>>>>>>
>>>>>>>>> left outer join organisationunit Sublocation
>>>>>>>>>   on (ous.idlevel6=sublocation.organisationunitid)
>>>>>>>>> left outer join (
>>>>>>>>>   *select sourceid, sum(cast(value as double precision)) as value*
>>>>>>>>>   from datavalue
>>>>>>>>>   where dataelementid=8305
>>>>>>>>>
>>>>>>>>>  group by sourceid) as *triggerstate* on
>>>>>>>>> triggerstate.sourceid=ou.organisationunitid
>>>>>>>>> where ous.level=7
>>>>>>>>>
>>>>>>>>> ****If i run the same query for a data element of date of text
>>>>>>>>> type or yes/no
>>>>>>>>> ***NB the triggerstate store a true or false value in the the
>>>>>>>>> datavalue table
>>>>>>>>>
>>>>>>>>> *HERE is the SQL error i get*
>>>>>>>>>
>>>>>>>>> ERROR:  invalid input syntax for type double precision: "false"
>>>>>>>>>
>>>>>>>>> ********** Error **********
>>>>>>>>>
>>>>>>>>> ERROR: invalid input syntax for type double precision: "false"
>>>>>>>>> SQL state: 22P02
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ----------------------------------
>>>>>>>>> Please assist
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Nov 27, 2013 at 5:44 PM, Jason Pickering <
>>>>>>>>> jason.p.pickering@xxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Obare,
>>>>>>>>>> The reason for this is because you will only be able to see the
>>>>>>>>>> aggregated data in DHIS2. In your case, it does not seem you need the
>>>>>>>>>> aggregated data, but instead the raw data. One way to do this is with a
>>>>>>>>>> custom report, which would use a JDBC data source, and then an SQL query to
>>>>>>>>>> retreive the data. You might need some paramaters like the period and the
>>>>>>>>>> particular orgunit. You can read about custom reports in the DHIS2
>>>>>>>>>> documentation. Another possible solution would be an SQL Query, which would
>>>>>>>>>> allow you to pull out the raw data into a CSV file for subsequent analysis.
>>>>>>>>>>
>>>>>>>>>> Hope this helps.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Jason
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Nov 27, 2013 at 2:34 PM, Felix Obareh <
>>>>>>>>>> fobareh@xxxxxxxxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I a form similar to the Facility Assessment Form in the
>>>>>>>>>>> demo.dhis2.org . I have done some data entry but i don't know
>>>>>>>>>>> how to get the report for that form
>>>>>>>>>>>
>>>>>>>>>>> Among my Elements are:-
>>>>>>>>>>>
>>>>>>>>>>> 1) Date Facility was commissioned
>>>>>>>>>>> 2) Facility state on Commissioning with options set of ( Ready,
>>>>>>>>>>> Lacking Some Departments, Not ready)
>>>>>>>>>>> 3) Date of Certification
>>>>>>>>>>> 4) Date of Verification
>>>>>>>>>>> 5) Source of Funds with option sets (Government, Donor)
>>>>>>>>>>> 6) Is facility well maintained (Yes/No)
>>>>>>>>>>> 7) Key issues identified & suggested solutions (Text)
>>>>>>>>>>> 8) Number of Practitioners in the Facility
>>>>>>>>>>>
>>>>>>>>>>> I can the above data on the entry form by i cant create a single
>>>>>>>>>>> report about the facility. The pivot tables on show the value of element 8.
>>>>>>>>>>> The value of the number of practitioners.
>>>>>>>>>>>
>>>>>>>>>>> Any assistance will be highly appreciated. I have three more
>>>>>>>>>>> Facility based forms which i want to get reports for across the country
>>>>>>>>>>>
>>>>>>>>>>> Cheers
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>>>>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Follow ups

References