dhis2-devs team mailing list archive

Re: [Dhis2-users] DHIS2 with R

 

On Thu, May 27, 2010 at 6:39 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
> On 27 May 2010 17:27, Knut Staring <knutst@xxxxxxxxx> wrote:
>> DHIS already has the concept of datamart and report tables, which do
>> provide some separation of the transactional from the analytical,
>> though we also have plans for improving this.
>>
>> R-node with jQuery/GeoExt looks interesting:
>> http://www.r-statistics.com/2010/04/r-node-a-web-front-end-to-r-with-protovis/
>
> Slight aside: Protovis looks like a pretty powerful visualization library.

Yes! And I like the very nice overview of BI tools and visualizations
in this presentation:
http://squirelove.net/r-node-extra/r-node-plug-10-04-14.html

(from the R-node homepage: http://www.squirelove.net/r-node/doku.php)

k

>>
>> On Thu, May 27, 2010 at 5:10 PM, Jason Pickering
>> <jason.p.pickering@xxxxxxxxx> wrote:
>>> Hi Roger,
>>>
>>> Valid concerns, but I would assume that in the typical use case an R
>>> user would either 1) look only at a view that has been prepared for
>>> them or 2) be given read-only access to selected tables. I would not
>>> expect locks to be a problem in this case.
>>>
>>> However, your point is well taken. That is why I have been
>>> tinkering with LucidDB, a column-oriented database that may be
>>> more appropriate for analysis. Which database should be used is
>>> probably a separate discussion, but the point is that separating the
>>> "transactional" database used for data entry from the "analysis"
>>> database is a good idea. I would regard this as probably
>>> outside the scope of what DHIS is really intended to do. If people
>>> need to use tools like R, they will likely also be capable of
>>> coming up with their own solution.
>>>
>>> However, this does not exclude that certain simple examples could be
>>> built into DHIS. Obviously performance is a concern, but of course, it
>>> depends on what you are trying to do. R is incredibly powerful when it
>>> comes to producing graphics as I am sure that you are aware, and
>>> lightyears ahead of the other components we are using (jPlot I think).
>>> So, I would think that the typical use case would be to leverage R,
>>> possibly as an extension to DHIS2 for those that need it, for the
>>> generation of analysis tables and graphics, that would be beyond the
>>> scope of the "basic" package, which is really limited to aggregation.
>>>
>>> Anyway, just a few more thoughts.
>>>
>>> Regards,
>>> Jason
>>>
>>>
>>> On Thu, May 27, 2010 at 3:39 PM, Friedman, Roger (CDC/OID/NCHHSTP)
>>> (CTR) <rdf4@xxxxxxx> wrote:
>>>> My concern would be DB performance: there's no telling what kind of locks R or any other product using ODBC/JDBC is going to use.  I'm already worried about simultaneous transactional and reporting use.  Have there been any large-volume performance tests?  Has any thought been given to splitting reporting and data entry between different DB servers?  I know everyone has been focused on getting the distributed DB aspects right, but assuming universal availability of internet, how would DHIS2 perform on a single (possibly clustered) national DB server?
>>>>
>>>> -----Original Message-----
>>>> From: dhis2-users-bounces+rdf4=cdc.gov@xxxxxxxxxxxxxxxxxxx [mailto:dhis2-users-bounces+rdf4=cdc.gov@xxxxxxxxxxxxxxxxxxx] On Behalf Of Jason Pickering
>>>> Sent: Thursday, May 27, 2010 7:03 AM
>>>> To: Bob Jolliffe
>>>> Cc: dhis2-users@xxxxxxxxxxxxxxxxxxx; dhis2-devs
>>>> Subject: Re: [Dhis2-users] [Dhis2-devs] DHIS2 with R
>>>>
>>>> Yeah, this I guess comes back time and time again to my somewhat
>>>> uncomfortable relationship with Hibernate and Java. Clearly, we need
>>>> to think about how to reconcile making certain procedures
>>>> cross-platform (in the sense of working across Postgres/MySQL and
>>>> other DBs) with the need to offer advanced analysis capabilities at
>>>> acceptable performance.
>>>>
>>>> There could be multiple ways of doing it, but in the absence of having
>>>> R integrated into DHIS2, I think the most likely short-term use case
>>>> would be just some documentation on how to use the R client with the
>>>> DHIS2 database. Perhaps those users who use R with DHIS2 over time
>>>> could contribute their procedures, which could then be generalized
>>>> with PL/R.
>>>>
>>>> Of course the difference with using Postgres is that R procedures can
>>>> be embedded as a new language inside the DB. I am not really sure this
>>>> is possible with MySQL. This of course reduces the overhead of getting
>>>> the data out of Postgres, through Java, and into the R interpreter,
>>>> but I am not really sure what the impact of this might be without
>>>> testing it.
>>>>
>>>>
>>>>
>>>> On Thu, May 27, 2010 at 12:26 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>>>>> On 27 May 2010 11:15, Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote:
>>>>>> Hi Bob,
>>>>>>
>>>>>> Yes, I suspect that most R users would probably want to do things
>>>>>> their own way. It has a rather steep learning curve. :)
>>>>>>
>>>>>> As for canned R scripts, the best way would probably be with PL/R, a
>>>>>> procedural language for PostgreSQL which utilizes R.
>>>>>>
>>>>>> http://www.joeconway.com/plr/doc/index.html
>>>>>>
>>>>>> I have done some very basic testing and it seems to work just fine on
>>>>>> the server side.
>>>>>
>>>>> Swings and roundabouts to a certain extent.  The main thing is that
>>>>> the R scripts are evaluated using the R C library.  If they were
>>>>> invoked from within Java/DHIS then I guess data access would be slower
>>>>> than from PL/R (we'd need a way to get the data to the R
>>>>> interpreter), but number crunching would be similar and it would also
>>>>> work with MySQL and friends.  Not sure which of these is the bigger
>>>>> problem in typical/possible scenarios.
>>>>>
>>>>>>
>>>>>> I think they are two separate problems really, but I totally agree, C
>>>>>> is likely going to be faster than Java for big operations. However, I
>>>>>> do think (as all of you know) that certain functions (like
>>>>>> aggregation and heavy cross-tab operations) would be much better
>>>>>> executed on the database server as native stored procedures (with
>>>>>> the wrapper facade type of approach).
>>>>>>
>>>>>> Regards,
>>>>>> Jason
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, May 27, 2010 at 11:45 AM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>>>>>>> We've talked before about integrating a scripting engine (such as R)
>>>>>>> into DHIS: http://www.rforge.net/rscript/
>>>>>>>
>>>>>>> But my guess is that most R users are going to be of a level of
>>>>>>> sophistication that they would be most comfortable doing the kind of
>>>>>>> thing you describe - connecting directly to the DB with an R client
>>>>>>> and doing their stuff.
>>>>>>>
>>>>>>> OTOH if there were sufficiently useful "canned" dhis R scripts which
>>>>>>> could take some number crunching load off the jvm and produce canned
>>>>>>> useful analysis then that would be different.
>>>>>>>
>>>>>>> Sadly I don't know enough about R to know.  But I sense it ...
>>>>>>>
>>>>>>> Regards
>>>>>>> Bob
>>>>>>>
>>>>>>> On 27 May 2010 10:08, Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote:
>>>>>>>> Hi everyone. I have had a recent question from a user about how DHIS2
>>>>>>>> can be used with R. I am including a trivial example here of how to
>>>>>>>> use R as a client to access data in DHIS2 and produce a graph.
>>>>>>>>
>>>>>>>> Just get a copy of R and install the DBI and RPostgreSQL packages with
>>>>>>>>
>>>>>>>>>install.packages(c("DBI", "RPostgreSQL"))
>>>>>>>>
>>>>>>>>
>>>>>>>> After that, just connect to the DB, retrieve your data (in this case
>>>>>>>> from a report table) and produce a graph.
>>>>>>>>
>>>>>>>>>library(DBI)
>>>>>>>>
>>>>>>>>>library(RPostgreSQL)
>>>>>>>>
>>>>>>>>>drv <- dbDriver("PostgreSQL")
>>>>>>>>
>>>>>>>>>con <- dbConnect(drv, dbname="dhis2_zm_prod2", user="postgres", password="postgres")
>>>>>>>>
>>>>>>>>>rs <- dbSendQuery(con, "SELECT * FROM _report_malaria_indicators_district where
>>>>>>>> organisationunitid = 3904")
>>>>>>>>
>>>>>>>>>data <- fetch(rs,n=-1)
>>>>>>>>
>>>>>>>>>barplot(data$malaria_confirm_incidence, names.arg=as.character(data$periodname), main=as.character(data$organisationunitname[1]),las=2)
>>>>>>>>
>>>>>>>>>dev.print(png, file="/home/jason/test.png")
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Jason
>>>>>>>>
>>>>>>>> ---
>>>>>>>> Jason P. Pickering
>>>>>>>> email: jason.p.pickering@xxxxxxxxx
>>>>>>>> tel:+260968395190
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>>>>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>
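The interactive session quoted above can also be collected into a small script. What follows is only a sketch under the same assumptions as the original message: the database name, credentials, report table, organisation unit id and output path are copied from it. The function names fetch_incidence, plot_incidence and run_example are hypothetical, chosen for illustration. It uses dbGetQuery in place of dbSendQuery plus fetch, closes the connection when done, and writes straight to a png device instead of copying a screen device, so it should also run non-interactively.

```r
# Sketch only: assumes the DBI and RPostgreSQL packages are installed and
# that the report table from the original message exists in the database.

# Fetch all report-table rows for one organisation unit.
# (fetch_incidence is a hypothetical helper name.)
fetch_incidence <- function(con, orgunit_id) {
  sql <- sprintf(
    "SELECT * FROM _report_malaria_indicators_district WHERE organisationunitid = %d",
    as.integer(orgunit_id))
  DBI::dbGetQuery(con, sql)
}

# Draw one bar per period and save the chart as a PNG file.
# `data` must have the columns used in the original session:
# malaria_confirm_incidence, periodname, organisationunitname.
# (plot_incidence is a hypothetical helper name.)
plot_incidence <- function(data, file) {
  png(file)
  on.exit(dev.off())  # close the png device even if barplot() fails
  barplot(data$malaria_confirm_incidence,
          names.arg = as.character(data$periodname),
          main = as.character(data$organisationunitname[1]),
          las = 2)
  invisible(file)
}

# Connect, query, plot, disconnect. Connection details and the output
# path are the ones from the original message and will differ per site.
run_example <- function() {
  con <- DBI::dbConnect(RPostgreSQL::PostgreSQL(),
                        dbname = "dhis2_zm_prod2",
                        user = "postgres",
                        password = "postgres")
  on.exit(DBI::dbDisconnect(con))  # always release the connection
  data <- fetch_incidence(con, 3904)
  plot_incidence(data, "/home/jason/test.png")
}
```

Calling run_example() against a database that has the report table should leave the chart at the given path. plot_incidence can be tried on its own with any data frame that has the three columns named in its comment.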
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Mailing list: https://launchpad.net/~dhis2-users
>>>> Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>>>> Unsubscribe : https://launchpad.net/~dhis2-users
>>>> More help   : https://help.launchpad.net/ListHelp
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> Cheers,
>> Knut Staring
>>
>



-- 
Cheers,
Knut Staring


