dhis2-devs team mailing list archive

Thread
Date

Re: Pivot tables

To: Ola Hodne Titlestad <olati@xxxxxxxxxx>
From: Bob Jolliffe <bobjolliffe@xxxxxxxxx>
Date: Wed, 27 Oct 2010 22:58:56 +0100
Cc: DHIS 2 developers <dhis2-devs@xxxxxxxxxxxxxxxxxxx>, Ola Hodne Titlestad <olati@xxxxxxxxxxxxxxxxxxxxx>
In-reply-to: <AANLkTi=Y9zH+wFBRuv1Nm0=X4iyyENpPd=DV8JCr2MLq@mail.gmail.com>

2010/10/27 Ola Hodne Titlestad <olati@xxxxxxxxxx>:
> Yes, I casted the Value field to number using the val() function,
> val(Value). I  modified the Query used by the pivot table in:
> Pivot Tables top menu->Change Data Source->Connection
> Properties->Definition->Command text

I'm a bit confused.  The 'command" text in my connections is just the
csv file name.


> Not sure where this query is stored as these changes did not show in the
> .odc after saving the Excel file.
> There is probably connection information also in Excel somewhere, since the
> query is on top of the .odc.
> If possible, it sounds like a better deal to modify this in the .odc file
> and just make sure Excel gets the data in the correct format.
> ----------------------------------
> Ola Hodne Titlestad (Mr)
> HISP
> Department of Informatics
> University of Oslo
>
> Mobile: +47 48069736
> Home address: Vetlandsvn. 95B, 0685 Oslo, Norway. Googlemaps link
>
>
> 2010/10/27 Bob Jolliffe <bobjolliffe@xxxxxxxxx>
>>
>> 2010/10/27 Lars Helge Øverland <larshelge@xxxxxxxxx>:
>> >
>> >
>> > 2010/10/27 Bob Jolliffe <bobjolliffe@xxxxxxxxx>
>> >>
>> >> Attached is a simple skeleton overview diagram of how it *can* be
>> >> done.  I'm not sure if I'm 100% comfortable with the approach yet but
>> >> I can verify that it works (caveat below).
>> >>
>> >> From the diagram, the point ultimately is to populate the
>> >> pivottablecache of the pivottable within the local excel spreadsheet
>> >> with the data from the view in the remote database.  This is what you
>> >> would normally do (in a connected setting) by setting an odbc type
>> >> connection to the postgres database.
>> >>
>> >> What I have done is instead to create a connection to a local csv
>> >> file.  This is done by creating a odc connection description file and
>> >> pointing the pivottable at that.  Selected nippet from example odc
>> >> connection file:
>> >> <xml id='msodc'>
>> >>  <odc:OfficeDataConnection
>> >>  xmlns:odc="urn:schemas-microsoft-com:office:odc"
>> >>  xmlns="http://www.w3.org/TR/REC-html40";>
>> >>  <odc:Connection odc:Type="OLEDB">
>> >>   <odc:ConnectionString>Provider=MSDASQL.1;Persist Security
>> >> Info=False;Data Source=csv;Extended
>> >>
>> >>
>> >> Properties=&quot;DSN=csv;DefaultDir=C:\Users\bobj\Documents\excel;DriverId=27;FIL=text;MaxBufferSize=2048;PageTimeout=5;&quot;;Initial
>> >> Catalog=C:\Users\bobj\Documents\excel</odc:ConnectionString>
>> >>   <odc:CommandType>Table</odc:CommandType>
>> >>
>> >>
>> >> <odc:CommandText>`C:\Users\bobj\Documents\excel\test.csv`</odc:CommandText>
>> >>  </odc:Connection>
>> >>  </odc:OfficeDataConnection>
>> >> </xml>
>> >>
>> >> I've taken a 30MB csv file dumped from the
>> >> _VIEW_PIVOTSOURCEROUTINEDATAOU3ALL view of Ola's sample data.
>> >> (Grabbed from h2 with: CALL CSVWRITE('test.csv', 'SELECT * FROM
>> >> _VIEW_PIVOTSOURCEROUTINEDATAOU3ALL');  )
>> >>
>> >> Using the odc above, refreshing the pivottable causes a copy from the
>> >> csv file into the pivot cache which seems pretty quick (4-5 seconds) -
>> >> certainly quicker than pulling it via tcp-odbc-jdbc or whatever.
>> >>
>> >> Note that worksheet can have more than one pivottable and more than
>> >> one pivotcache and that more than one pivottable can share the same
>> >> pivot cache.  I haven't gone into those details.
>> >>
>> >> So the upside is local pivottable refresh is quick and appending
>> >> monthly csv snippets is trivial.  Another positive spinoff in havng
>> >> the datset as a local csv is that you can also pivot with all kinds of
>> >> other tools other than excel (ranging from primitive awk to fancy 'r'
>> >> reshaping - both of which I think I prefer to excel :-) ).  But
>> >> managing this setup is fairly straightforward - 3 files:  the excel
>> >> file, the odc and the csv files which grow with monthly updates (I
>> >> know there must also be a facility to amend changed data but we'll
>> >> cross that bridge ..) .   And there are of course downsides.
>> >>
>> >> These csv files are really pretty big and hugely sub optimal.  The
>> >> normalization of the database has been flattened out for convenience
>> >> of pivoting so we are dealing with tens of thousands of repeated
>> >> strings (datelement names, ou names etc).  From a transport
>> >> perspective this is maybe not too bad - I'm guessing a reasonable
>> >> compression might factor out some of this.   And appending monthly
>> >> chunks wouldn't be too bad.  But they are pretty big files to manage.
>> >>  But I think they are maybe more robust than depending solely on the
>> >> pivotcache as a local primary store.
>> >>
>> >> That's it for now.  Thought to sleep on is whether we can't somehow
>> >> just maintain a local database which replicates the necessary tables
>> >> from the online server.  But its complicated.  Maintaining a couple
>> >> (hopefully not too many) csv files might be easier.
>> >
>> >
>> > Nice. Explicitly separating presentation from data with csv sounds more
>> > manageable. I think we can live with "yearly" pivot tables, in that
>> > these
>> > things won't keep growing forever.
>> >
>> >>
>> >> Bob
>> >>
>> >> PS.  The caveat.  My csv source for pivottables sort of works but all
>> >> my datavalues are being treated as zero :-(  I think this is probably
>> >> to do with setting the datatype of the field somehow.  I think its
>> >> fixable.
>> >
>> > Might be related to the fact that DataValue.value is a String. Maybe Ola
>> > has
>> > some experience?
>>
>> Yes it has to do with interpreting fields as numeric rather than
>> string.  csv driver defaults to string assumption.  Haven't figured
>> the best place to tell it otherwise yet.  But I think Ola has ..
>>
>> >
>
>

References

Pivot tables
From: Bob Jolliffe, 2010-10-22
Re: Pivot tables
From: Lars Helge Øverland, 2010-10-22
Re: Pivot tables
From: Ola Hodne Titlestad, 2010-10-22
Re: Pivot tables
From: Bob Jolliffe, 2010-10-22
Re: Pivot tables
From: Lars Helge Øverland, 2010-10-22
Re: Pivot tables
From: Bob Jolliffe, 2010-10-24
Re: Pivot tables
From: Lars Helge Øverland, 2010-10-25
Re: Pivot tables
From: Bob Jolliffe, 2010-10-25
Re: Pivot tables
From: Lars Helge Øverland, 2010-10-26
Re: Pivot tables
From: Bob Jolliffe, 2010-10-26
Re: Pivot tables
From: Lars Helge Øverland, 2010-10-27
Re: Pivot tables
From: Bob Jolliffe, 2010-10-27
Re: Pivot tables
From: Ola Hodne Titlestad, 2010-10-27