← Back to team overview

dhis2-devs team mailing list archive

Re: Pivot tables

 

2010/10/25 Bob Jolliffe <bobjolliffe@xxxxxxxxx>

> 2010/10/25 Lars Helge Øverland <larshelge@xxxxxxxxx>:
> >
> > Hi thanks for the nice update, interesting. The thinking sounds very
> > reasonable to me.
> > Ola and I will work a bit on the optimization of SQL view. One issue here
> is
> > that when writing "generic" sql queries for tables like
> > orgunitgroupsetstructure, one needs to select * (all) columns in order to
> be
> > portable because the table has each column per groupset, ie. its dynamic.
> > But I guess we can fix these things when generating the SQL from Java /
> > DHIS.
>
> Yes. The more we optimize these (in terms of number of fields) the
> less enormous the excel file will be.  And given that excel holds all
> in memory we must do the best we can.
>
> > Re size, a good thing is that a typical pivot table won't contain the
> whole
> > database, rather the data for a district and its facilities. We want to
> be
> > able to compare with other districts (maybe inside the same province
> only)
> > but for those we only need the aggregated data at the district level (not
> > facility data) - which means a lot less data.
>
> I'm trying to focus on the particular problems related to enormous
> spreadsheets.  If they are small we don't really have much of a
> problem either way.  But if we can deal with the really large ones
> then we can maybe deal with medium large ones quite snappily.
>
>
Yes definitely.


> > Its also interesting to see how excel uses a "minimal" xml (<r><x
> > v="0"/></r> etc). We could consider that for DXF 2 datavalues.
> > Size/performance vs readability.
>
> This is an OOXML "feature" which has been heavily criticized by almost
> all in the xml world.  Not being religious, I am in two minds about
> it.  I think it might be reasonable for the datavalues in dxf, but I'd
> stick my heels in if we started doing the same thing for metadata.  In
> fact I think there is just such a minimal syntax which I have
> suggested to Jo some time back as being maybe useful for a lightweight
> mobile version of sdmx.  I wouldn't have a problem with:
>
> <dv de='32' o='3' p='4' d=''2' />  <!-- where 'd' = disaggregation -->
>
> (I'd still like to see the other attributes, comment, createdby etc
> being aggregated up to a 'datavalueset' but that's another story)
>
> Regarding ooxml I think there is a better justification doing it for a
> data oriented document format like a spreadsheet, than there is for
> word docs and presentations.  Certainly M$ has achieved significant
> performance benefits to Excel in terms of load times.  File size is
> not that significant a benefit because the compressed files shouldn't
> be that different.  What also really helps is the use of
> sharedstrings.  You can see that they do a similar shared strings type
> optimization in the pivotcache.  All of which allows us to stretch the
> use of spreadsheets into domains where they shouldn't really go.
>
>
Yes, was only thinking about data values. We can leave this for later.

Follow ups

References