
dhis2-users team mailing list archive

Re: Realtime Analytics

 

Sure, that makes sense I think, and it sounds like a good feature (aggregate
only this leaf node and downwards). Blueprint?

The problem is, if you run analytics for one branch of the hierarchy, what
about the nodes above it?

Also, there are dimensions other than the organisation unit to
consider, like "dataset". Maybe certain data sets should be aggregated more
often than others. It would seem to make little sense to aggregate a yearly
dataset once an hour, when it is only entered once a year. It might, however,
make sense to aggregate a daily dataset a few times a day.
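
To make the per-dataset idea concrete, here is a minimal sketch in Python. It
is not how DHIS 2 schedules anything today; the period type names, the
intervals and the function is_due are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from a data set's period type to how often its
# analytics would be refreshed. The intervals are illustrative guesses.
REFRESH_INTERVAL = {
    "Daily": timedelta(hours=6),
    "Weekly": timedelta(days=1),
    "Monthly": timedelta(days=1),
    "Yearly": timedelta(days=30),
}

# Fallback for period types that are not listed above (also an assumption).
DEFAULT_INTERVAL = timedelta(days=1)


def is_due(period_type, last_run, now):
    """Return True if a data set of this period type should be re-aggregated."""
    interval = REFRESH_INTERVAL.get(period_type, DEFAULT_INTERVAL)
    return now - last_run >= interval
```

With these numbers, a daily data set last aggregated seven hours ago would be
due again, while a yearly data set would wait about a month between runs.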

This is really about "dirty" analytics: only processing what is
needed. Right now, the process is not that efficient, as a lot of things
get aggregated which have not changed. But determining what is dirty is
not that simple really.
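
One part of the difficulty is the hierarchy: a change in a leaf also
invalidates the aggregates of everything above it. A minimal sketch of that
bookkeeping, assuming a simple child-to-parent map (the org unit names, the
PARENT_OF map and mark_dirty are hypothetical, not DHIS 2 code):

```python
# Hypothetical org unit hierarchy, stored as child -> parent.
PARENT_OF = {
    "facility_a": "district_1",
    "facility_b": "district_1",
    "district_1": "country",
    "district_2": "country",
    "country": None,
}


def mark_dirty(org_unit, parent_of, dirty):
    """Mark org_unit and all of its ancestors as needing re-aggregation.

    Stops early at a node that is already dirty: if every change is
    recorded through this function, that node's ancestors are dirty too.
    """
    node = org_unit
    while node is not None and node not in dirty:
        dirty.add(node)
        node = parent_of.get(node)
    return dirty
```

Entering data at facility_a would mark facility_a, district_1 and country as
dirty, while district_2 and facility_b stay untouched. The other dimensions
(data set, period) would need the same kind of tracking, which is where it
stops being simple.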

At any rate, one solution is simply the brute-force approach. Get a very
fast database server, separate from your application server, with lots of
RAM, and you can probably run analytics as often as you like, within reason.
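
For the scheduling side, a rough sketch of triggering a run from a script
instead of curl. It assumes the endpoint on the documentation page linked
further down the thread is /api/resourceTables/analytics and that it accepts
a lastYears parameter; please check both against your DHIS 2 version. The
host name and credentials are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def build_analytics_request(base_url, username, password, last_years=None):
    """Build a POST request that asks the server to start an analytics run.

    The path and the lastYears parameter are assumptions taken from the
    DHIS 2 Web API documentation; verify them for your version.
    """
    url = base_url.rstrip("/") + "/api/resourceTables/analytics"
    if last_years is not None:
        url += "?" + urllib.parse.urlencode({"lastYears": last_years})
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Authorization": "Basic " + token},
    )


def trigger_analytics(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status


# Example (placeholder host and credentials), e.g. run from cron every X minutes:
# request = build_analytics_request("https://dhis.example.org", "admin", "secret", last_years=1)
# trigger_analytics(request)
```

Restricting the run to the most recent year keeps each run short, and a full
run can still be scheduled at night.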



On Fri, Apr 22, 2016 at 1:05 PM, Morten Olav Hansen <morten@xxxxxxxxx>
wrote:

> Jason
>
> From my perspective.. what they want, is to see "aggregated" data from
> their leaf only.. so there is not really anything to do, any leaf based
> analytics could (should) be done real-time right?
>
> Basically what most people are asking for is:
> 1) Enter data
> 2) give me report
>
> They don't say.. give me report for full country... they don't care about
> that, it's all about their leaf node..
>
> --
> Morten Olav Hansen
> Senior Engineer, DHIS 2
> University of Oslo
> http://www.dhis2.org
>
> On Fri, Apr 22, 2016 at 6:03 PM, Jason Pickering <
> jason.p.pickering@xxxxxxxxx> wrote:
>
>> Hi,
>> There may be a brief window of time when certain resources are not
>> available. This happens when the analytics tables are switched. Lars can
>> explain more about this. But it should be very brief.
>>
>> This API endpoint simply triggers the analytics, and allows more
>> control over what is actually done. So, you might aggregate data once an
>> hour throughout the day (near real time), then run a full analytics run at
>> night per usual.
>>
>> I think what you want, as many people have asked for, is to enter your
>> data and immediately see it in the pivots. It does not work
>> like that, as the data has to be aggregated and indexed to make it highly
>> available. This is the reason that the pivots are usually very fast:
>> the data has been pre-processed from the raw data, and heavily
>> indexed to speed up queries against the analytics tables. It's a tradeoff
>> between a highly available server for everyone and choking the server with
>> lots of expensive real-time aggregation requests.
>>
>> Usually, this is not the "normal" workflow. People enter monthly data for
>> instance once a month, and whether it gets aggregated one second or one
>> hour after, should not make a difference. So whether they see the data the
>> instant it was entered is not important. What is important is being able to
>> serve lots of data to many people, so that is why the analytics solution
>> exists. Data is pre-processed to make it quickly readable. The downside of
>> this is that there is a lag between when the data is entered and when it is
>> actually available through the analytics resources.
>>
>> Having said all of that, the developers are looking into ways to speed
>> this process up, for instance by only aggregating what is
>> actually needed (dirty data) along with offloading the analytics onto a
>> separate server. But that is in the pipeline. For now, the best thing which
>> you can do is to be sure you get a very fast database server with lots of
>> RAM, and schedule the analytics run (simply by making a curl call to that
>> API endpoint) once every X minutes. X minutes will depend on a number of
>> factors, like how much data you have, how powerful your server is, etc. So,
>> you will need to experiment a bit.
>>
>> Regards,
>> Jason
>>
>>
>> On Fri, Apr 22, 2016 at 12:51 PM, Ibrahim Bayoh <
>> ibrahim.bayoh@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>>> Thanks Jason for your prompt response. It seems to be the API for running
>>> the manual export of tables and scheduling the analytics tables externally;
>>> please correct me if I am wrong.
>>>
>>>
>>> On Fri, Apr 22, 2016 at 10:40 AM, Morten Olav Hansen <morten@xxxxxxxxx>
>>> wrote:
>>>
>>>> Hi Jason
>>>>
>>>> While this is happening, are reports still available or not?
>>>>
>>>> --
>>>> Morten Olav Hansen
>>>> Senior Engineer, DHIS 2
>>>> University of Oslo
>>>> http://www.dhis2.org
>>>>
>>>> On Fri, Apr 22, 2016 at 5:31 PM, Jason Pickering <
>>>> jason.p.pickering@xxxxxxxxx> wrote:
>>>>
>>>>> Hi Bayoh,
>>>>> Have a look here
>>>>>
>>>>> http://dhis2.github.io/dhis2-docs/2.22/en/developer/html/ch01s33.html
>>>>>
>>>>> You can achieve near real-time analytics by ensuring you have enough
>>>>> horsepower in your servers and only aggregating smaller pieces of the data
>>>>> (i.e. last year only).
>>>>>
>>>>> Regards,
>>>>> Jason
>>>>>
>>>>>
>>>>> On Fri, Apr 22, 2016 at 12:19 PM, Ibrahim Bayoh <
>>>>> ibrahim.bayoh@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>>> Hi All,
>>>>>> Initially I thought DHIS2 had full, out-of-the-box real-time
>>>>>> analytics and dashboards. But after further inspection I realized that
>>>>>> newly entered data will not be available for analysis or dashboards until the
>>>>>> next day, or until export tables is run manually. This is a real bottleneck for
>>>>>> the implementation I am working on. I have looked at Scheduling, but
>>>>>> the options for analytics tables are not ideal in my case. *Is there a
>>>>>> way this delay can be reduced or removed to get real-time dashboards and
>>>>>> analytics?*
>>>>>>
>>>>>> Thanks,
>>>>>> Bayoh.
>>>>>>
>>>>>> --
>>>>>> Ibrahim Rashid Bayoh
>>>>>> Information Systems Coordinator,
>>>>>> eHealth Africa(Sierra Leone)
>>>>>> *117 Wilkinson Rd, Freetown, Sierra Leone*
>>>>>> Mobile: +232 88-765-638
>>>>>> ibrahim.bayoh@xxxxxxxxxxxxxxxxxxxx
>>>>>> http://ehealthafrica.org/
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jason P. Pickering
>>>>> email: jason.p.pickering@xxxxxxxxx
>>>>> tel:+46764147049
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Ibrahim Rashid Bayoh
>>> Information Systems Coordinator,
>>> eHealth Africa(Sierra Leone)
>>> *117 Wilkinson Rd, Freetown, Sierra Leone*
>>> Mobile: +232 88-765-638
>>> ibrahim.bayoh@xxxxxxxxxxxxxxxxxxxx
>>> http://ehealthafrica.org/
>>>
>>>
>>
>>
>> --
>> Jason P. Pickering
>> email: jason.p.pickering@xxxxxxxxx
>> tel:+46764147049
>>
>
>


-- 
Jason P. Pickering
email: jason.p.pickering@xxxxxxxxx
tel:+46764147049
