dhis2-users team mailing list archive
Message #10013
Re: Realtime Analytics
Hi Alex,
What I am saying here is that the underlying data itself is not real time. If
we were dealing with millisecond stock trades or telemetry data from a
rocket stored as events, the need to aggregate data in real time would be
critical. However, what we are typically dealing with, even in the best case,
are events which may get reported several times a day. By then, the event
itself has already "aged" and is no longer real time. As an example: the
patient comes, samples are taken, sent to the lab, confirmed, and reviewed
by a clinician. All of that takes time. The event then, in the best case,
gets reported shortly thereafter. But what if it doesn't? What if the
internet is down? What if there is no one to report it?
Speeding up DHIS2 analytics is really easy. Buy a big, powerful server and
call a shell script every half an hour. But that is not necessarily going
to improve the "real time" nature of the data. You have to look downstream
for that. Ergo, having a system to aggregate non-real-time data in real
time seems both pointless and not environmentally friendly to me.
What does make sense is to write a simple shell script to better meet your
use case. It's easy, and quite easily achievable, but I do not think it will
really make the data more real time. It will only make it appear to be more
real time.
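
A minimal sketch of such a script is given here, under stated assumptions. The
function names (analytics_is_running, trigger_analytics, maybe_trigger) and
the two environment variables are made up for illustration. The
analytics_is_running check is only a placeholder: this thread does not say
how to ask the server whether a run is still active, so that part has to be
filled in for your own DHIS2 version. The request itself is the same curl
call quoted further down in this thread.

```shell
#!/bin/sh
# Sketch only. DHIS2_URL and DHIS2_AUTH default to the demo values quoted in
# this thread; replace them for a real server.
DHIS2_URL="${DHIS2_URL:-http://localhost:8080}"
DHIS2_AUTH="${DHIS2_AUTH:-admin:district}"

# Placeholder (assumption): always reports "not running". Replace the body
# with a real status check for your DHIS2 version.
analytics_is_running() {
    return 1
}

# Same request as the curl call quoted in this thread; -f makes curl return
# a non-zero exit code on HTTP errors.
trigger_analytics() {
    curl -s -f -o /dev/null -X POST -u "$DHIS2_AUTH" \
        "$DHIS2_URL/api/resourceTables/analytics?skipResourceTables=true&lastYears=2"
}

# Trigger a run unless one is reported as active.
maybe_trigger() {
    if analytics_is_running; then
        echo "skipped: analytics already running"
        return 0
    fi
    if trigger_analytics; then
        echo "triggered"
    else
        echo "trigger failed" >&2
        return 1
    fi
}

# Only act when called with the argument "run".
if [ "${1:-}" = "run" ]; then
    maybe_trigger
fi
```

Saved as, say, /usr/local/bin/dhis2-analytics.sh (a hypothetical path), it
could be called every half hour with a crontab entry such as
"*/30 * * * * /usr/local/bin/dhis2-analytics.sh run". With the placeholder
check left as it is, the script never skips a run.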
Regards,
Jason
On Fri, Apr 22, 2016 at 2:58 PM, Alex Tumwesigye <atumwesigye@xxxxxxxxx>
wrote:
> Dear Jason,
>
> I agree with not complicating things, but as we look at the IDSR module as
> a generic module within DHIS2, we may need to think twice about the whole
> process, unless we want to keep it as a separate piece managed outside the
> main DHIS2.
> Scripts can be written, and I have no problem with that, by the way, but how
> many (of those using DHIS2) can write or manage simple scripts? If we make
> this complicated, we run the risk of people not using some of the modules.
>
> Alex
>
> On Fri, Apr 22, 2016 at 3:49 PM, Jason Pickering <
> jason.p.pickering@xxxxxxxxx> wrote:
>
>> Thing is, how long does it take for the event to actually get reported?
>>
>> If it shows up on a person's dashboard in ten minutes, but they are out
>> having lunch, or it has taken 3 days for the event to be reported, what's
>> the use? It's no longer real time. Not even close to it.
>>
>> I agree there are use cases where things need to be sped up, and it can
>> be done very simply with a very small curl script. You can even check and
>> see if analytics is running first, just to be sure you do not trigger it
>> again.
>>
>> But let's not over-complicate things, and let's think about how real-time
>> the data which is actually being aggregated really is.
>>
>> Regards,
>> Jason
>>
>>
>> On Fri, Apr 22, 2016 at 2:43 PM, Alex Tumwesigye <atumwesigye@xxxxxxxxx>
>> wrote:
>>
>>> Hi Knut,
>>>
>>> We are not talking of hours here; dashboards need to be updated in
>>> almost real time and outbreaks need to be detected as data is entered
>>> (once thresholds are met). So 10 minutes as an option in the dropdown for
>>> analytics is OK for me. Adding it would make the system almost real time,
>>> and this would cater for the IDSR requirement. A lag of 10 minutes for an
>>> update of the dashboard on screen is OK.
>>> I wish it could be added to the system rather than doing it using the
>>> API, given the reasons above.
>>>
>>> Alex
>>>
>>> On Fri, Apr 22, 2016 at 3:27 PM, Knut Staring <knutst@xxxxxxxxx> wrote:
>>>
>>>> Hi Alex - I agree that IDSR requires "immediate" response - but we
>>>> are usually talking hours, not minutes, right?
>>>>
>>>> Knut
>>>>
>>>> On Fri, Apr 22, 2016 at 2:23 PM, Alex Tumwesigye <atumwesigye@xxxxxxxxx
>>>> > wrote:
>>>>
>>>>> Jason/Morten,
>>>>>
>>>>> I agree that real-time analytics would not be easily implemented.
>>>>> Maybe it is time to separate Aggregate and Tracker analytics as a start,
>>>>> but it also depends on what the tracker is being used for, e.g. surveys
>>>>> may require no immediate analytics, but IDSR/outbreaks may require an
>>>>> instant update of analytics. We are looking at IDSR features and, as I
>>>>> see it, real-time analytics will be a requirement. IDSR requires
>>>>> real-time analytics since people need to respond to outbreaks and manage
>>>>> the response in real time, so that they can intervene and stop the
>>>>> outbreak.
>>>>>
>>>>> Here is what I propose (just a thought from my discussion with Calle):
>>>>> We set a configurable (checked) variable/attribute that indicates that,
>>>>> if this variable is changed, the analytics process is started. For
>>>>> example, changing population data has a very big impact on indicators
>>>>> that depend on population; entering lab results or requests or patient
>>>>> updates for IDSR requires immediate analytics, etc. If we had this
>>>>> attribute/variable, we would use it to identify the corresponding
>>>>> metadata that might need to be updated or may have changed, and update
>>>>> (through temp tables) only the affected analytics tables. This way we
>>>>> can control the load required to run analytics in real time, since the
>>>>> system will only be processing the affected changes.
>>>>> Using the API to trigger analytics every X minutes may be feasible but
>>>>> is not sustainable, as we do not control how many threads may be
>>>>> running: the API call through curl does not easily get feedback on
>>>>> whether the previous analytics process has completed before the new one
>>>>> starts. Otherwise we can end up in a forever loop if the server
>>>>> resources are not enough.
>>>>>
>>>>> Alex
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Apr 22, 2016 at 2:50 PM, Morten Olav Hansen <morten@xxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>>> 1 ns is fine... but it depends on what the ramifications are.. what if
>>>>>> you start a new job every 1 ms? is that ok? what happens during table swap?
>>>>>> is analytics blocked?
>>>>>>
>>>>>> Maybe this is already documented.. just curious :)
>>>>>>
>>>>>> --
>>>>>> Morten Olav Hansen
>>>>>> Senior Engineer, DHIS 2
>>>>>> University of Oslo
>>>>>> http://www.dhis2.org
>>>>>>
>>>>>> On Fri, Apr 22, 2016 at 6:48 PM, Jason Pickering <
>>>>>> jason.p.pickering@xxxxxxxxx> wrote:
>>>>>>
>>>>>>> Why not every 1 nanosecond? It's always going to take some amount of
>>>>>>> time; the question is what is reasonable.
>>>>>>>
>>>>>>> One could argue that real-time analytics in an aggregate data system
>>>>>>> is not needed. This is not real-time data. It's not even close to it.
>>>>>>> Look at Google Analytics. They provide reports once a day, and you
>>>>>>> do not see a whole lot of people complaining. Yes, you can get some
>>>>>>> real-time information from it as well, but it's limited. The data must be
>>>>>>> processed first, and that takes computational time. Same with DHIS2.
>>>>>>>
>>>>>>> From my experience, people think they need "real time analytics"
>>>>>>> when they really are just in a rush. Data takes time to review and
>>>>>>> analyze, and whether it's available now, 1 nanosecond from now, or 10
>>>>>>> minutes from now makes no difference in the end, as the amount of time
>>>>>>> which is required to digest that information is on a totally different time
>>>>>>> scale (hours, days or weeks). Once an hour is probably easily achievable,
>>>>>>> depending on the scale of the system, however.
>>>>>>>
>>>>>>> As for the call to the API, just create a Bash script and call it as
>>>>>>> frequently as you like with a cron task.
>>>>>>>
>>>>>>> This is a very simple one, but you should really check for things
>>>>>>> like "Is analytics already running and should I trigger another run?"
>>>>>>>
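>>>>>>> #!/bin/sh
>>>>>>>
>>>>>>> # POST to the analytics endpoint; credentials are the demo defaults.
>>>>>>> # Backslashes keep the wrapped command on one logical line.
>>>>>>> /usr/bin/curl -X POST -u admin:district \
>>>>>>>   "localhost:8080/api/resourceTables/analytics?skipResourceTables=true&lastYears=2" \
>>>>>>>   >/dev/null 2>&1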
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Apr 22, 2016 at 1:34 PM, Morten Olav Hansen <
>>>>>>> morten@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> Well, it sounds like a bad solution to me ;) if every 10 min works
>>>>>>>> fine.. why not every 5 min.. why not every 1 min..
>>>>>>>>
>>>>>>>> On Fri, Apr 22, 2016 at 6:31 PM, Ibrahim Bayoh <
>>>>>>>> ibrahim.bayoh@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>>> @Jason, Knut and Morten: having the analytics tables run every 10 mins
>>>>>>>>> sounds like a good place to start, but I am not sure how to implement
>>>>>>>>> this with the API calls, and I am sure this is not possible through the user
>>>>>>>>> interface. If you guys can point me in the right direction with an example
>>>>>>>>> of some sort, that will be greatly helpful and highly appreciated.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> On Fri, Apr 22, 2016 at 11:23 AM, Jason Pickering <
>>>>>>>>> jason.p.pickering@xxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>>> No they are swapped now, so first they are built as temp tables,
>>>>>>>>>> then swapped. So, this is the brief point in time in which things may not
>>>>>>>>>> be available.
>>>>>>>>>>
>>>>>>>>>> Every 10 minutes might be OK, depending on your server, loading,
>>>>>>>>>> and amount of data. It just requires some experimentation.
>>>>>>>>>>
>>>>>>>>>> Point is, "real time" analytics is not possible. Near-real time
>>>>>>>>>> may be.
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 22, 2016 at 1:03 PM, Morten Olav Hansen <
>>>>>>>>>> morten@xxxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>>> But I thought all analytics tables were cleared out during
>>>>>>>>>>> re-generation? is this not true?
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 22, 2016 at 6:00 PM, Knut Staring <knutst@xxxxxxxxx>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Depends on the data. I trigger analytics every 10 min.
>>>>>>>>>>>>
>>>>>>>>>>>> Hi All,
>>>>>>>>>>>> Initially I thought DHIS2 had full, out-of-the-box real-time
>>>>>>>>>>>> analytics and dashboards. But after further inspection I realized that
>>>>>>>>>>>> newly entered data will not be available for analysis or dashboards until
>>>>>>>>>>>> the next day, or until the tables are exported manually. This is a real
>>>>>>>>>>>> bottleneck for the implementation I am working on. I have looked at
>>>>>>>>>>>> Scheduling, but the options for the analytics tables are not ideal in my
>>>>>>>>>>>> case. *Is there a way this delay can be reduced or removed to get
>>>>>>>>>>>> real-time dashboards and analytics?*
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Bayoh.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Ibrahim Rashid Bayoh
>>>>>>>>>>>> Information Systems Coordinator,
>>>>>>>>>>>> eHealth Africa(Sierra Leone)
>>>>>>>>>>>> *117 Wilkinson Rd, Freetown, Sierra Leone*
>>>>>>>>>>>> Mobile: +232 88-765-638
>>>>>>>>>>>> ibrahim.bayoh@xxxxxxxxxxxxxxxxxxxx
>>>>>>>>>>>> http://ehealthafrica.org/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>>>>>>>>>> Post to : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>>>>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>>>>>>>>>> More help : https://help.launchpad.net/ListHelp
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Jason P. Pickering
>>>>>>>>>> email: jason.p.pickering@xxxxxxxxx
>>>>>>>>>> tel:+46764147049
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Alex Tumwesigye
>>>>>
>>>>> Technical Advisor - DHIS2 (Consultant),
>>>>> Ministry of Health/AFENET
>>>>> Kampala
>>>>> Uganda
>>>>> +256 774149 775, + 256 759 800161
>>>>> Skype ID: talexie
>>>>>
>>>>> IT Consultant (Servers, Networks and Security, Health Information
>>>>> Systems - DHIS2, Disease Outbreak & Surveillance Systems) & Solar Consultant
>>>>>
>>>>>
>>>>> "I don't want to be anything other than what I have been - one tree
>>>>> hill "
>>>>>
>>>>
>>>>
>>>> --
>>>> Knut Staring
>>>> Dept. of Informatics, University of Oslo
>>>> Norway: +4791880522
>>>> Skype: knutstar
>>>> http://dhis2.org
>>>>
>>>
>>>
>>>
>>
>>
>
>
>
--
Jason P. Pickering
email: jason.p.pickering@xxxxxxxxx
tel:+46764147049