dhis2-devs team mailing list archive
Message #06149
Re: [Dhis2-users] DHIS2 with R
DHIS already has the concept of datamart and report tables, which do
provide some separation of the transactional from the analytical,
though we also have plans for improving this.
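
As a rough illustration of running analysis against a report table rather than the transactional tables, here is a minimal R sketch based on the example quoted further down. The table and column names come from that example. The function names fetch_malaria_report and plot_incidence, the report_reader role, and the output file name are made up for illustration, and the sketch assumes the DBI and RPostgreSQL packages are installed.

```r
# Fetch one organisation unit's rows from the report table.
# `con` is an open DBI connection (see the commented usage below).
fetch_malaria_report <- function(con, org_unit_id) {
  id <- as.integer(org_unit_id)
  stopifnot(length(id) == 1, !is.na(id))
  # Only a validated integer is pasted into the SQL string.
  sql <- sprintf(
    "SELECT * FROM _report_malaria_indicators_district WHERE organisationunitid = %d",
    id
  )
  DBI::dbGetQuery(con, sql)
}

# Draw the incidence bar chart straight into a PNG file.
# `data` needs the columns used in the quoted example.
plot_incidence <- function(data, file) {
  png(file, width = 800, height = 600)
  on.exit(dev.off())  # always close the device, even if barplot() fails
  barplot(data$malaria_confirm_incidence,
          names.arg = as.character(data$periodname),
          main = as.character(data$organisationunitname[1]),
          las = 2)
  invisible(file)
}

# Usage (not run here; needs a reachable database).
# "report_reader" is a hypothetical role with SELECT-only rights.
# con <- DBI::dbConnect(RPostgreSQL::PostgreSQL(),
#                       dbname = "dhis2_zm_prod2",
#                       user = "report_reader", password = "...")
# data <- fetch_malaria_report(con, 3904)
# plot_incidence(data, "malaria_incidence.png")
# DBI::dbDisconnect(con)
```

Splitting the fetch from the plot means the plotting step can be tried on any data frame with the same columns, without touching the production database.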
R-node with Jquery /GeoExt looks interesting:
http://www.r-statistics.com/2010/04/r-node-a-web-front-end-to-r-with-protovis/
On Thu, May 27, 2010 at 5:10 PM, Jason Pickering
<jason.p.pickering@xxxxxxxxx> wrote:
> Hi Roger,
>
> Valid concerns, but I would assume that the typical use case for R
> would be that the user would typically only be looking at 1) a view
> that has been prepared for them or 2) be provided with read only
> access to selected tables. I would not expect that locks would be a
> problem in this case.
>
> However, your point is well taken. That is why I have been
> tinkering with LucidDB, a column-oriented database that may be
> more appropriate for analysis. Which database should be used is
> probably a separate discussion, but the point is that separating the
> "transactional" database used for data entry from the
> "analysis" database is a good idea. I would regard this as probably
> outside the scope of what DHIS is really intended to do. If people
> need to use tools like R, they will likely also be capable of
> coming up with their own solution.
>
> However, this does not exclude that certain simple examples could be
> built into DHIS. Obviously performance is a concern, but of course, it
> depends on what you are trying to do. R is incredibly powerful when it
> comes to producing graphics, as I am sure you are aware, and
> light years ahead of the other components we are using (jPlot I think).
> So, I would think that the typical use case would be to leverage R,
> possibly as an extension to DHIS2 for those that need it, for the
> generation of analysis tables and graphics that would be beyond the
> scope of the "basic" package, which is really limited to aggregation.
>
> Anyway, just a few more thoughts.
>
> Regards,
> Jason
>
>
> On Thu, May 27, 2010 at 3:39 PM, Friedman, Roger (CDC/OID/NCHHSTP)
> (CTR) <rdf4@xxxxxxx> wrote:
>> My concern would be DB performance: there's no telling what kind of locks R, or any other product using ODBC/JDBC, is going to take. I'm already worried about simultaneous transactional and reporting use. Have there been any large-volume performance tests? Has any thought been given to splitting reporting and data entry between different DB servers? I know everyone has been focused on getting the distributed DB aspects right, but assuming universal availability of internet, how would DHIS2 perform on a single (possibly clustered) national DB server?
>>
>> -----Original Message-----
>> From: dhis2-users-bounces+rdf4=cdc.gov@xxxxxxxxxxxxxxxxxxx [mailto:dhis2-users-bounces+rdf4=cdc.gov@xxxxxxxxxxxxxxxxxxx] On Behalf Of Jason Pickering
>> Sent: Thursday, May 27, 2010 7:03 AM
>> To: Bob Jolliffe
>> Cc: dhis2-users@xxxxxxxxxxxxxxxxxxx; dhis2-devs
>> Subject: Re: [Dhis2-users] [Dhis2-devs] DHIS2 with R
>>
>> Yeah, this I guess comes back time and time again, with my somewhat
>> uncomfortable relationship with Hibernate and Java. Clearly, we need
>> to think about how to balance making certain procedures cross-platform
>> compatible (cross-platform in the sense of working between
>> Postgres/MySQL and other DBs) with the need to offer advanced analysis
>> capabilities with acceptable performance.
>>
>> There could be multiple ways of doing it, but in the absence of having
>> R integrated into DHIS2, I think the most likely short-term use case
>> would be just some documentation on how to use the R client with the
>> DHIS2 database. Perhaps those users that use R over time with DHIS2
>> could contribute their procedures, which should be able to be
>> generalized with PL/R.
>>
>> Of course the difference with using Postgres, is that R procedures can
>> be embedded as a new language inside the DB. I am not really sure this
>> is possible with MySQL. This of course reduces the internal overhead
>> of getting the data out of Postgres, through Java, and into the R
>> interpreter, but I am not sure really what the impact of this might be
>> without testing it.
>>
>>
>>
>> On Thu, May 27, 2010 at 12:26 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>>> On 27 May 2010 11:15, Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote:
>>>> Hi Bob,
>>>>
>>>> Yes, I suspect that most R users would probably want to do things
>>>> their own way. It has a rather steep learning curve. :)
>>>>
>>>> As for canned R scripts, the best way would probably be with PL/R, a
>>>> procedural PostgreSQL language which utilizes R.
>>>>
>>>> http://www.joeconway.com/plr/doc/index.html
>>>>
>>>> I have done some very basic testing and it seems to work just fine on
>>>> the server side.
>>>
>>> Swings and roundabouts to a certain extent. The main thing is that
>>> the R scripts are evaluated using the R C library. If they were
>>> invoked from within Java/DHIS then I guess data access would be slower
>>> than from PL/R (we'd need to have a way to get the data to the R
>>> interpreter), but number crunching would be similar and would also
>>> work with MySQL and friends. Not sure which of these are bigger
>>> problems in typical/possible scenarios.
>>>
>>>>
>>>> I think they are two separate problems really, but I totally agree, C
>>>> is likely going to be faster than Java for big operations. However, I
>>>> do think (as all of you know) that the use of stored procedures (with
>>>> the wrapper facade type of approach) for certain functions (like
>>>> aggregation and heavy cross tab operations) would be much better to be
>>>> executed on the database server as a native stored procedure.
>>>>
>>>> Regards,
>>>> Jason
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, May 27, 2010 at 11:45 AM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>>>>> We've talked before about integrating a scripting engine (such as R)
>>>>> into DHIS: http://www.rforge.net/rscript/
>>>>>
>>>>> But my guess is that most R users are going to be of a level of
>>>>> sophistication that they would be most comfortable doing the kind of
>>>>> thing you describe: connecting directly to the DB with an R client and
>>>>> doing their stuff.
>>>>>
>>>>> OTOH if there were sufficiently useful "canned" dhis R scripts which
>>>>> could take some number crunching load off the jvm and produce canned
>>>>> useful analysis then that would be different.
>>>>>
>>>>> Sadly I don't know enough about R to know. But I sense it ...
>>>>>
>>>>> Regards
>>>>> Bob
>>>>>
>>>>> On 27 May 2010 10:08, Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote:
>>>>>> Hi everyone. I have had a recent question from a user about how DHIS2
>>>>>> can be used with R. I am including a trivial example here about how to
>>>>>> use R as a client to access data in DHIS2 and produce a graph.
>>>>>>
>>>>>> Just get a copy of R and install the DBI and RPostgreSQL packages with
>>>>>>
>>>>>>>install.packages(c("DBI", "RPostgreSQL"))
>>>>>>
>>>>>>
>>>>>> After that, just connect to the DB, retrieve your data (in this case
>>>>>> from a report table) and produce a graph.
>>>>>>
>>>>>>>library(DBI)
>>>>>>
>>>>>>>library(RPostgreSQL)
>>>>>>
>>>>>>>drv <- dbDriver("PostgreSQL")
>>>>>>
>>>>>>>con <- dbConnect(drv, dbname="dhis2_zm_prod2", user="postgres", password="postgres")
>>>>>>
>>>>>>>rs <- dbSendQuery(con, "SELECT * FROM _report_malaria_indicators_district where
>>>>>> organisationunitid = 3904")
>>>>>>
>>>>>>>data <- fetch(rs,n=-1)
>>>>>>
>>>>>>>barplot(data$malaria_confirm_incidence, names.arg=as.character(data$periodname), main=as.character(data$organisationunitname[1]),las=2)
>>>>>>
>>>>>>>dev.print(png, file="/home/jason/test.png")
>>>>>>
>>>>>> Regards,
>>>>>> Jason
>>>>>>
>>>>>> ---
>>>>>> Jason P. Pickering
>>>>>> email: jason.p.pickering@xxxxxxxxx
>>>>>> tel:+260968395190
>>>>>>
>>>>>> _______________________________________________
>>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>>> Post to : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>>> More help : https://help.launchpad.net/ListHelp
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~dhis2-users
>> Post to : dhis2-users@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~dhis2-users
>> More help : https://help.launchpad.net/ListHelp
>>
>
>
>
>
>
--
Cheers,
Knut Staring