← Back to team overview

dhis2-devs team mailing list archive

Re: Fwd: data pilot

 

Hi,

Sorry for the late feedback on this.

I've done some testing of datapilot using the new openoffice 3.2 RC1 release
on Windows XP with java 1.6.16:

*Large datasets are problematic
*The main issue I have encountered is the lack of support for large
datasets. In DHIS we typically have data sets of more than 100 000 rows,
sometimes more than a million rows. With openoffice I ended up having lots
of java heap space errors with the larger datasets when loading the data
into the datapilot (using jdbc connection to postgres) and did not manage to
create them at all. Datasets with up to 20 000 rows seem to work ok,
although a bit slower than what I am used to with Excel, and the 40 000 rows
dataset I managed to set up is extremely slow when doing any manipulation. I
can see that the wiki page on performance improvement on larger datasets
that Bob linked to refers to a test dataset of 5000 rows, which is a very
small dataset in the context of DHIS.

For comparison I created Excel pivot tables for the 40 000 row IndicatorsOU3
table and a 300 000 row pivot for RoutineDataOU3 and these work just fine
and much faster than the datapilots. This is especially noticeable when
editing the pivot table layouts (start in data pilot and pivot table wizard
in excel), but also when doing simpler manipulation of the tables such as
moving a field from row to column or changing a filter.

*Indicator values as calculated fields are not supported?
*Another issue I have come across is the lack of support to set up what in
Excel is called calculated fields,formulas based on other pivot fields. This
is needed when dealing with indicators as the indicator value field is a
calulated field set up as the formula numxfactor/denominatorvalue. Such a
formula is needed when dealing with aggregation of percentages as numerators
and denominators need to be summed up separately and not by simply averaging
the percentage (the value) of all the orgunit children. It is possible that
this is supported, but I haven't found it yet.

*Captions for pivot fields are not supported?
*I cannot find a place to change the caption of the pivot fields. We often
use generic column names in the database such as orgunit1, orgunit2, but in
a pivot table it looks much better if these names are replaced with more
localised names e.g. country, district, chiefdom in the case of Sierra
Leone.

*General comments:
*- Openoffice does not seem to store the data inside the spreadsheet file
like Excel does and therefore will always need a functioning database
connection. Excel only needs the database connection on refresh.

- You need to create a new openoffice database using a jdbc connection to
the existing postgres database.

- To connect to the postgres database using jdbc you need to add the
postgres jdbc driver to the java classpath in openoffice:
Tools->Options->Java->Classpath

- This openoffice database needs to be registered in openoffice in order to
be available to the datapilot. This is an option during the setup of a new
database, or you can do it later by opening Calc and press F4, then right
click inside the Bibliography navigator window and select Registered
Databases. Then click New and look up your openoffice database.

- When creating the datapilot the pivot view queries in the dhis database
should be treated as sheets (default) and not queries
(all views in postgres are automatically displayed as tables in the
openoffice database)

Here are the files I have used:
http://folk.uio.no/olati/filer/dhis2_sl_data_pilots.ods
http://folk.uio.no/olati/filer/dhis2_sl_pivots.zip
http://folk.uio.no/olati/filer/dhis2_sl.backup
http://208.76.222.114/confluence/download/attachments/8096/PivotSourceViewsOU2-5.sql

- the excel file can be used right away, but a refresh will needs an odbc
connection 'dhis2_sl' to a database 'dhis2_sl' with the pivotsource views
- the openoffice file will need a database dhis2_sl registered in openoffice
with the pivotsource views

Ola
----------

2009/12/11 Knut Staring <knutst@xxxxxxxxx>

> Interetesting .... with a release candidate coming in just one week:
> 2009-12-17
>
>
> On Fri, Dec 11, 2009 at 5:57 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx>wrote:
>
>> Looks like 3.2 is the "performance" targetted version.  [
>> http://wiki.services.openoffice.org/wiki/Features]  Including DataPilot.
>>
>>
>> 2009/12/11 Bob Jolliffe <bobjolliffe@xxxxxxxxx>
>>
>>> Sure.  This is just a background issue.  The OOo pivot table performance
>>> has been a mess for some time now.  Just want to take advantage of the
>>> opportunity to feed them some real requirements.
>>>
>>> Mind you I see IBM's Chinese dev team have been putting some work into
>>> this already (
>>> http://wiki.services.openoffice.org/wiki/Calc/Proposal_DataPilot_byIBM).
>>> I wonder did this optimisation make itself into OOo yet?  I guess we'll find
>>> out soon.
>>>
>>> Bob.
>>>
>>> 2009/12/11 Ola Hodne Titlestad <olatitle@xxxxxxxxx>
>>>
>>> Hi Bob,
>>>>
>>>> Can give you a data pilot (for the latest version of OO) with lots of
>>>> data by Monday.
>>>> Would that be soon enough?
>>>>
>>>> Ola
>>>> _______
>>>>
>>>> 2009/12/11 Bob Jolliffe <bobjolliffe@xxxxxxxxx>
>>>>
>>>>> Hi all
>>>>>
>>>>> I am in touch with Sun engineers on OpenOffice Calc regarding datapilot
>>>>> performance.  I understand there have been performance issues with pivot
>>>>> tables - in fact it is a well known openoffice problem.  Can anyone send me
>>>>> a reasonably substantive excel worksheet with pivot tables which I can send
>>>>> to Sun to see if we can't motivate improvement of openoffice calc in this
>>>>> area.
>>>>>
>>>>> Regards
>>>>> Bob
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Bob Jolliffe <bobjolliffe@xxxxxxxxx>
>>>>> Date: 2009/12/11
>>>>> Subject: Re: data pilot
>>>>> To: ASLAM RAFFEE <Aslam.Raffee@xxxxxxx>
>>>>> Cc: Jørn Braa <jornbraa@xxxxxxxxx>, Stephan Schaefer <
>>>>> Stephan.Schaefer@xxxxxxx>, Michael Brauer - Sun Germany - ham02 -
>>>>> Hamburg <Michael.Brauer@xxxxxxx>
>>>>>
>>>>>
>>>>> Sure.  Thanks.  I will try and get hold a realistic sample from the
>>>>> field rather than a concoction.
>>>>>
>>>>> Regards
>>>>> Bob
>>>>>
>>>>> 2009/12/11 ASLAM RAFFEE <Aslam.Raffee@xxxxxxx>
>>>>>
>>>>>> Bob,
>>>>>>
>>>>>>
>>>>>> Could you provide Stephan the information needed so that he can
>>>>>> assist.
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Aslam Raffee,
>>>>>> Government Strategy and Global Communities, EMEA
>>>>>> Sun Microsystems, Inc.
>>>>>> Waterfall Edge, Waterfall Park, Bekker Street
>>>>>> Midrand 1685 South Africa
>>>>>> Phone +27 11 256-6360
>>>>>> Mobile +27 82 312 6782
>>>>>> Email aslam.raffee@xxxxxxx
>>>>>> http://www.sun.com
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 11 Dec 2009, at 2:42 PM, Michael Brauer - Sun Germany - ham02 -
>>>>>> Hamburg wrote:
>>>>>>
>>>>>> > Hi Aslam,
>>>>>> >
>>>>>> > I have set Stephan Schäfer on copy who leads our OOo Calc team.
>>>>>> >
>>>>>> > Of cause we may have a look at the problem. But we need to know the
>>>>>> version of OOo that is in use and one ore more sample documents that don't
>>>>>> work as expected.
>>>>>> >
>>>>>> > Best regards
>>>>>> >
>>>>>> > Michael
>>>>>> >
>>>>>> > Am 11.12.09 10:44, ASLAM RAFFEE schrieb:
>>>>>> >> Dear Michael,
>>>>>> >> I am in discussions with Bob Jolliffe of the Health Information
>>>>>> Systems Project (http://www.hisp.org/) on how Sun can add value to
>>>>>> this project. One of the issues that has come up is that the project would
>>>>>> like to use open office and ODF. Due to Data pilot performance issues (keeps
>>>>>> crashing) they are forced to use excel pivot tables. Are we able to look
>>>>>> into this problem and assist with the data pilot issues.
>>>>>> >> Regards,
>>>>>> >> Aslam Raffee,
>>>>>> >> Government Strategy and Global Communities, EMEA
>>>>>> >> Sun Microsystems, Inc.
>>>>>> >> Waterfall Edge, Waterfall Park, Bekker Street
>>>>>> >> Midrand 1685 South Africa
>>>>>> >> Phone +27 11 256-6360
>>>>>> >> Mobile +27 82 312 6782
>>>>>> >> Email aslam.raffee@xxxxxxx
>>>>>> >> http://www.sun.com
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Michael Brauer, Technical Architect Software Engineering
>>>>>> > StarOffice/OpenOffice.org
>>>>>> > Sun Microsystems GmbH             Nagelsweg 55
>>>>>> > D-20097 Hamburg, Germany          michael.brauer@xxxxxxx
>>>>>> > http://sun.com/staroffice         +49 40 23646 500
>>>>>> > http://blogs.sun.com/GullFOSS
>>>>>> >
>>>>>> > Sitz der Gesellschaft: Sun Microsystems GmbH, Sonnenallee 1,
>>>>>> >          D-85551 Kirchheim-Heimstetten
>>>>>> > Amtsgericht Muenchen: HRB 161028
>>>>>> > Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
>>>>>> > Vorsitzender des Aufsichtsrates: Martin Haering
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Mailing list: https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
>>>>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>
>>>>>
>>>>
>>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
>> More help   : https://help.launchpad.net/ListHelp
>>
>>
>
>
> --
> Cheers,
> Knut Staring
>

Follow ups

References