dhis2-users team mailing list archive

Re: Realtime Analytics

 

Just to clarify, I was -only- talking about -aggregate- data... tracker
data is much more intensive. Of course everyone wants everything to be
real-time, and you will not see any pushback on that from anyone. But if
we ever go "real time", we should definitely start with aggregate.

-- 
Morten Olav Hansen
Senior Engineer, DHIS 2
University of Oslo
http://www.dhis2.org

On Fri, Apr 22, 2016 at 8:08 PM, Jason Pickering <
jason.p.pickering@xxxxxxxxx> wrote:

> Hi Alex,
>
> What I am saying here is that the underlying data itself is not real time.
> If we were dealing with millisecond stock trades or telemetry data from a
> rocket stored as events, the need to aggregate data in real time would be
> critical. However, even in the best case, we are typically dealing with
> events which may get reported several times a day. By then, the event
> itself has already "aged" and is no longer real-time. As an example: the
> patient comes, samples are taken, sent to the lab, confirmed, and reviewed
> by a clinician. All of that takes time. The event then, in the best case,
> gets reported shortly thereafter. But what if it doesn't? What if the
> internet is down? What if there is no one to report it?
>
> Speeding up DHIS2 analytics is really easy. Buy a big powerful server, and
> call a shell script every half an hour. But that is not necessarily going
> to improve the "real time" nature of the data. You have to look downstream
> for that. Ergo, having a system to aggregate non-real-time data in real
> time seems both pointless and environmentally unfriendly to me.
>
> What does make sense is to write a simple shell script to better meet your
> use case. It's easy, but I do not think it will really make the data more
> real time. It will make it appear to be more real-time, and that is quite
> easily achievable.
>
> Regards,
> Jason
>
>
> On Fri, Apr 22, 2016 at 2:58 PM, Alex Tumwesigye <atumwesigye@xxxxxxxxx>
> wrote:
>
>> Dear Jason,
>>
>> I agree with not complicating things, but as we look at the IDSR module as
>> a generic module within DHIS2, we may need to think twice about the whole
>> process, unless we want to keep it as a separate piece managed outside the
>> main DHIS2.
>> Scripts can be written, and I have no problem with that, but how many of
>> those using DHIS2 can write or manage simple scripts? If we make this
>> complicated, we run the risk of people not using some of the modules.
>>
>> Alex
>>
>> On Fri, Apr 22, 2016 at 3:49 PM, Jason Pickering <
>> jason.p.pickering@xxxxxxxxx> wrote:
>>
>>> Thing is, how long does it take for the event to actually get reported?
>>>
>>> If it shows up on a person's dashboard in ten minutes, but they are out
>>> having lunch, or it has taken 3 days for the event to be reported, what's
>>> the use? It's no longer real time. Not even close to it.
>>>
>>> I agree there are use cases where things need to be sped up, and it can
>>> be done very simply with a very small curl script. You can even check
>>> whether analytics is running first, just to be sure you do not trigger it
>>> again.
>>>
>>> But let's not over-complicate things, and let's think about how real-time
>>> the data being aggregated actually is.
>>>
>>> Regards,
>>> Jason
>>>
>>>
>>> On Fri, Apr 22, 2016 at 2:43 PM, Alex Tumwesigye <atumwesigye@xxxxxxxxx>
>>> wrote:
>>>
>>>> Hi Knut,
>>>>
>>>> We are not talking about hours here. Dashboards need to be updated in
>>>> almost real time and outbreaks need to be detected as data is entered
>>>> (once thresholds are met). So 10 minutes as an option in the dropdown
>>>> for analytics is OK for me. Adding it would make the system almost real
>>>> time, and this would cater for the IDSR requirement. A lag of 10 minutes
>>>> for an update of the dashboard on screen is OK.
>>>> I wish it could be added to the system rather than done through the API,
>>>> given the reasons above.
>>>>
>>>> Alex
>>>>
>>>> On Fri, Apr 22, 2016 at 3:27 PM, Knut Staring <knutst@xxxxxxxxx> wrote:
>>>>
>>>>> Hi Alex - I agree that IDSR requires "immediate" response - but we
>>>>> are usually talking hours, not minutes, right?
>>>>>
>>>>> Knut
>>>>>
>>>>> On Fri, Apr 22, 2016 at 2:23 PM, Alex Tumwesigye <
>>>>> atumwesigye@xxxxxxxxx> wrote:
>>>>>
>>>>>> Jason/Morten,
>>>>>>
>>>>>> I agree that real-time analytics would not be easily implemented.
>>>>>> Maybe it is time to separate Aggregate and Tracker analytics as a
>>>>>> start, but it also depends on what the tracker is being used for, e.g.
>>>>>> surveys may require no immediate analytics, but IDSR/outbreaks may
>>>>>> require instant updates of analytics. We are looking at IDSR features
>>>>>> and, as I see it, real-time analytics will be a requirement. IDSR
>>>>>> requires real-time analytics since people need to respond to outbreaks
>>>>>> and manage outbreak responses in real time, so that they can intervene
>>>>>> and stop the outbreak.
>>>>>>
>>>>>> Here is what I propose (just a thought from my discussion with Calle):
>>>>>> We set a configurable (checked) variable/attribute that indicates
>>>>>> that if this variable is changed, the analytics process is started. For
>>>>>> example, changing population data has a very big impact on indicators
>>>>>> that depend on population; entering lab results or requests or patient
>>>>>> updates for IDSR requires immediate analytics, etc. If we had this
>>>>>> attribute/variable, then we would use it to identify the corresponding
>>>>>> metadata that may have changed and update (through temp tables) only
>>>>>> the affected analytics tables. This way we can control the load
>>>>>> required to run analytics in real time, since the system will only be
>>>>>> updating what is affected by the changes.
>>>>>> Using the API to trigger analytics every X minutes may be feasible but
>>>>>> is not sustainable, as we do not control how many threads may be
>>>>>> running: the API call through curl does not easily get feedback on
>>>>>> whether the previous analytics process has completed before the new one
>>>>>> starts. Otherwise we can end up in a forever loop if the server
>>>>>> resources are not enough.
>>>>>>
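One way to keep the "forever loop" concern above in check is to wait for the previous job with a hard upper bound instead of firing blindly. The sketch below is only an illustration: wait_until_idle and analytics_is_idle are made-up names, and the idle check is a placeholder (a marker file), NOT a DHIS2 API. How to actually ask the server whether analytics is still running depends on the DHIS2 version and is not settled in this thread.

```shell
#!/bin/sh
# wait_until_idle CHECK_CMD MAX_TRIES DELAY_SECONDS
# Polls CHECK_CMD (exit status 0 means "no analytics job running")
# at most MAX_TRIES times, sleeping DELAY_SECONDS between attempts.
# Returns 0 as soon as the check succeeds, 1 if it never does.
wait_until_idle() {
  check_cmd=$1
  max_tries=$2
  delay=$3
  try=0
  while [ "$try" -lt "$max_tries" ]; do
    if "$check_cmd"; then
      return 0
    fi
    try=$((try + 1))
    # Only sleep when another attempt will follow.
    if [ "$try" -lt "$max_tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# PLACEHOLDER (assumption, not a DHIS2 call): "idle" means a marker
# file is absent. Replace the body with a real status check.
analytics_is_idle() {
  [ ! -e "${BUSY_MARKER:-/tmp/analytics.busy}" ]
}
```

A cron-driven script could call "wait_until_idle analytics_is_idle 6 10" and simply skip the run when it returns 1, so a slow server causes skipped runs rather than a pile-up of jobs.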
>>>>>> Alex
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 22, 2016 at 2:50 PM, Morten Olav Hansen <morten@xxxxxxxxx
>>>>>> > wrote:
>>>>>>
>>>>>>> 1 ns is fine... but it depends on what the ramifications are. What
>>>>>>> if you start a new job every 1 ms? Is that OK? What happens during the
>>>>>>> table swap? Is analytics blocked?
>>>>>>>
>>>>>>> Maybe this is already documented.. just curious :)
>>>>>>>
>>>>>>> --
>>>>>>> Morten Olav Hansen
>>>>>>> Senior Engineer, DHIS 2
>>>>>>> University of Oslo
>>>>>>> http://www.dhis2.org
>>>>>>>
>>>>>>> On Fri, Apr 22, 2016 at 6:48 PM, Jason Pickering <
>>>>>>> jason.p.pickering@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> Why not every 1 nanosecond? It's always going to take some amount of
>>>>>>>> time; the question is what is reasonable.
>>>>>>>>
>>>>>>>> One could argue that real-time analytics in an aggregate data
>>>>>>>> system is not needed. This is not real-time data. It's not even close
>>>>>>>> to it. Look at Google Analytics. They provide reports once a day, and
>>>>>>>> you do not see a whole lot of people complaining. Yes, you can get
>>>>>>>> some limited real-time information from this as well, but it's
>>>>>>>> limited. The data must be processed first, and that takes
>>>>>>>> computational time. Same with DHIS2.
>>>>>>>>
>>>>>>>> From my experience, people think they need "real time analytics"
>>>>>>>> when they really are just in a rush. Data takes time to review and
>>>>>>>> analyze, and whether it's available now, 1 nanosecond from now, or 10
>>>>>>>> minutes from now makes no difference in the end, as the amount of time
>>>>>>>> which is required to digest that information is on a totally different
>>>>>>>> time scale (hours, days or weeks). Once an hour is probably easily
>>>>>>>> achievable, depending on the scale of the system, however.
>>>>>>>>
>>>>>>>> As for the call to the API, just create a Bash script and call it
>>>>>>>> as frequently as you like with a cron task.
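As a rough sketch of the cron task mentioned above: the helper below prints a crontab entry that runs a script every N minutes. The function name cron_entry and the path /usr/local/bin/run-analytics.sh are made up for illustration; nothing in this thread prescribes them.

```shell
#!/bin/sh
# cron_entry MINUTES SCRIPT
# Prints a crontab line that runs SCRIPT every MINUTES minutes.
# Only step values that divide 60 evenly give a regular interval,
# so anything else is rejected.
cron_entry() {
  minutes=$1
  script=$2
  case "$minutes" in
    1|2|3|4|5|6|10|12|15|20|30) ;;
    *) echo "unsupported interval: $minutes" >&2; return 1 ;;
  esac
  printf '*/%s * * * * %s\n' "$minutes" "$script"
}

# Hypothetical script path; replace with wherever the script lives.
cron_entry 10 /usr/local/bin/run-analytics.sh
```

The printed line can be pasted into the crontab of the user that should run the job (for example via "crontab -e"). Whether 10 minutes is sustainable still depends on the server and the amount of data.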
>>>>>>>>
>>>>>>>> This is a very simple one, but you should really check for things
>>>>>>>> like "Is analytics already running and should I trigger another run?"
>>>>>>>>
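>>>>>>>> #!/bin/sh
>>>>>>>>
>>>>>>>> # Trigger an analytics run for the last two years, skipping the
>>>>>>>> # resource tables. Output is discarded so cron stays quiet.
>>>>>>>> /usr/bin/curl -X POST -u admin:district \
>>>>>>>>   "localhost:8080/api/resourceTables/analytics?skipResourceTables=true&lastYears=2" \
>>>>>>>>   >/dev/null 2>&1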
>>>>>>>>
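On the "is analytics already running?" point: a minimal sketch of one guard is a lock directory, so that two cron invocations of the same script cannot overlap. The names run_exclusive and trigger_analytics and the lock path are illustrative assumptions. Note the limit of this approach: as discussed in this thread, the curl call gives no feedback about when the server-side job finishes, so the lock only covers the wrapped command itself. To cover the whole job, the wrapped command would also have to wait for completion.

```shell
#!/bin/sh
# run_exclusive LOCK_DIR COMMAND [ARGS...]
# Runs COMMAND only if LOCK_DIR can be created. mkdir is atomic, so
# two overlapping invocations cannot both get past it.
# Returns COMMAND's exit status, or 75 when the lock is already held.
run_exclusive() {
  lock_dir=$1
  shift
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "previous run still in progress, skipping" >&2
    return 75
  fi
  status=0
  "$@" || status=$?
  rmdir "$lock_dir"
  return "$status"
}

# The same request as in the script above, wrapped in a function.
trigger_analytics() {
  /usr/bin/curl -X POST -u admin:district \
    "localhost:8080/api/resourceTables/analytics?skipResourceTables=true&lastYears=2" \
    >/dev/null 2>&1
}

# Usage (the lock path is a made-up example):
#   run_exclusive /tmp/dhis2-analytics.lock trigger_analytics
```

If the script is killed between mkdir and rmdir, the lock directory stays behind and has to be removed by hand; a trap on EXIT would be one way to handle that.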
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Apr 22, 2016 at 1:34 PM, Morten Olav Hansen <
>>>>>>>> morten@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>>> Well, it sounds like a bad solution to me ;) If every 10 min
>>>>>>>>> works fine, why not every 5 min? Why not every 1 min?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Morten Olav Hansen
>>>>>>>>> Senior Engineer, DHIS 2
>>>>>>>>> University of Oslo
>>>>>>>>> http://www.dhis2.org
>>>>>>>>>
>>>>>>>>> On Fri, Apr 22, 2016 at 6:31 PM, Ibrahim Bayoh <
>>>>>>>>> ibrahim.bayoh@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>>> @Jason, Knut and Morten: having the analytics tables run every 10
>>>>>>>>>> mins sounds like a good place to start, but I am not sure how to
>>>>>>>>>> implement this with the API calls, and I am sure this is not possible
>>>>>>>>>> through the user interface. If you guys can point me in the right
>>>>>>>>>> direction with an example of some sort, that would be greatly helpful
>>>>>>>>>> and highly appreciated.
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 22, 2016 at 11:23 AM, Jason Pickering <
>>>>>>>>>> jason.p.pickering@xxxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>>> No, they are swapped now: first they are built as temp tables,
>>>>>>>>>>> then swapped. So the swap is the brief point in time in which things
>>>>>>>>>>> may not be available.
>>>>>>>>>>>
>>>>>>>>>>> Every 10 minutes might be OK, depending on your server, load,
>>>>>>>>>>> and amount of data. It just requires some experimentation.
>>>>>>>>>>>
>>>>>>>>>>> Point is, "real time" analytics is not possible. Near-real time
>>>>>>>>>>> may be.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 22, 2016 at 1:03 PM, Morten Olav Hansen <
>>>>>>>>>>> morten@xxxxxxxxx> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> But I thought all analytics tables were cleared out during
>>>>>>>>>>>> re-generation? Is this not true?
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Morten Olav Hansen
>>>>>>>>>>>> Senior Engineer, DHIS 2
>>>>>>>>>>>> University of Oslo
>>>>>>>>>>>> http://www.dhis2.org
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Apr 22, 2016 at 6:00 PM, Knut Staring <knutst@xxxxxxxxx
>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Depends on the data. I trigger analytics every 10 min.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>> Initially I thought DHIS2 had full, out-of-the-box real-time
>>>>>>>>>>>>> analytics and dashboards. But after further inspection I realized
>>>>>>>>>>>>> that current data entry will not be available for analysis or
>>>>>>>>>>>>> dashboards until the next day, or until the tables are exported
>>>>>>>>>>>>> manually. This is a real bottleneck for the intended implementation
>>>>>>>>>>>>> I am working on. I have looked at Scheduling, but the options for
>>>>>>>>>>>>> the analytics tables are not ideal in my case. *Is there a way this
>>>>>>>>>>>>> can be reduced or removed to gain real-time dashboards and
>>>>>>>>>>>>> analytics?*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Bayoh.
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Ibrahim Rashid Bayoh
>>>>>>>>>>>>> Information Systems Coordinator,
>>>>>>>>>>>>> eHealth Africa(Sierra Leone)
>>>>>>>>>>>>> *117 Wilkinson Rd, Freetown, Sierra Leone*
>>>>>>>>>>>>> Mobile: +232 88-765-638
>>>>>>>>>>>>> ibrahim.bayoh@xxxxxxxxxxxxxxxxxxxx
>>>>>>>>>>>>> http://ehealthafrica.org/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>>>>>>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>>>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Jason P. Pickering
>>>>>>>>>>> email: jason.p.pickering@xxxxxxxxx
>>>>>>>>>>> tel:+46764147049
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Alex Tumwesigye
>>>>>>
>>>>>> Technical Advisor - DHIS2 (Consultant),
>>>>>> Ministry of Health/AFENET
>>>>>> Kampala
>>>>>> Uganda
>>>>>> +256 774149 775, + 256 759 800161
>>>>>> Skype ID: talexie
>>>>>>
>>>>>> IT Consultant (Servers, Networks and Security, Health Information
>>>>>> Systems - DHIS2, Disease Outbreak & Surveillance Systems) & Solar Consultant
>>>>>>
>>>>>>
>>>>>> "I don't want to be anything other than what I have been - one tree
>>>>>> hill "
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Knut Staring
>>>>> Dept. of Informatics, University of Oslo
>>>>> Norway: +4791880522
>>>>> Skype: knutstar
>>>>> http://dhis2.org
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>
>
