← Back to team overview

dhis2-users team mailing list archive

Re: Mydatamart issue?

 

Hi Andrew,

Do you know how many rows that is returned from your pivot source query?

The mydatmart tool is designed for what Jason describes in option 1) in his
email.

Also note that we require Excel 2007 or newer for pivot tables to work
properly with mydatamart (on metatdata updates and re-linking of mydatamart
files and excel).

The user selects its orgunit and then the analysis level (the lowest level
of the orgunit hierarchy needed for analysis). Typically you will not need
to look at facility level for general data analysis at the national level,
and in stead you can probably pick district level.

At the districts the users can still select facility as the analysis level
and download facility level data to their local mydatamart files and load
that into Excel.

To facilitate this difference in analysis level in mydatamart/Excel you
typically will need to use at least two different Excel files (pivot table
template files), one for national level users with pivot tables that pull
district level data (e.g. the pivotsource_routinedata_ou2_m query), and
another excel file for district users with facility level data (using
pivotsource_routinedata_ou4_m - where facility level = level 4).

Note that Excel will automatically aggregate the data for you in the pivot
tables, so you can easily look at  district totals also in the pivot table
with facility level data. The difference is how deep you can drill down,
what the lowest level of analysis is; the district-level pivot table will
never be able to show facility data.

When you have data for more than 1 year you can also consider breaking up
the pivot table by either calendar or financial year by inserting custom
sql queries in  your mydatamart file that filter data rows on the periods.

In my experience the pivot tables start to get really slow when you get
above 7-800 000 rows, but that of course also depends on your hardware.

I have promised to write up a section on setting up pivot tables with
mydatamart to the user documentation, so look out for commits to the
documentation branch in the next few weeks if you're interested in more
details on this topic.

And. to make your life a lot easier, I would consider getting rid of most
of those zero values Lars just referred to.... you probably only require
that for a few selected data elements.

Ola
-----










----------------------------------
Ola Hodne Titlestad (Mr)
HISP
Department of Informatics
University of Oslo

Mobile: +47 48069736
Home address: Vetlandsvn. 95B, 0685 Oslo, Norway. Googlemaps
link<http://maps.google.com/maps?f=q&source=s_q&hl=en&geocode=&q=Vetlandsvn.+95B,+0685+Oslo,+Norway>


On 7 May 2012 09:32, Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote:

> Hi Andrew,
> I do not know all of the circumstances of your data, but in general,
> Excel is limited by a certain number of rows and columns. I think
> Excel 2003 is limited to around 65,000 rows and Excel 2007 to over 1
> million rows. PivotTables are limited by the amount of available
> memory on your machine. If your dataset has more than this, you can do
> a few things.
>
> 1) Use a higher level of aggregation. Instead of pulling data at the
> facility level for the entire country, use the district level (or
> something higher) this will decrease the number of rows which must be
> pulled into Excel. Similarly, looking at quarterly data, versus
> monthly data will decrease the amount of data which needs to be
> retrieved by Excel.
>
> 2) Filter the rows which are pulled from MyDatamart with some type of
> filter in the pivot source query. This normally is done using a filter
> on either the _dataelementgroupsetstructure or
> _indicatorgroupsetstructure table of the MyDatamart database. This
> will allow you to for instance, pull only PMTCT indicators into an
> Excel sheet, without pulling unrelated indicators, and then having to
> filter them in the Excel table. You will need to modify and install
> some new SQL views in MyDatamart to use this approach.
>
> 3) Consider using other tools, such as R/SPSS/Stata for analysis which
> do not have such limitations.
>
> We regularly analyze rather largish datasets (for Excel anyway) with
> several million rows of data, using these approaches. The best way
> however is just to bring the data which you need for a specific
> PivotTable, without returning everything with may not be relevant for
> a particular analysis.
>
> Best regards,
> Jason
>
>
>
> On Mon, May 7, 2012 at 9:16 AM, Muhire Andrew <muhireandrew@xxxxxxxxx>
> wrote:
> > Hi to all,
> >
> > In Rwanda we have 4 months data in our DHIS2 database, And downloading
> data
> > for the whole 4 months, Excel gives an overloading error or can even
> > burst.......................Any advice on using mydatamart?   how are you
> > running it over there? ......This issue happens when we download
> aggregated
> > data from facility level................I would suggest to have a
> selection
> > box in mydatamart where a user can select only the data elements or
> > Indicators to be analysed insteady of having the whole staff.
> >
> > In Rwanda, users are used on row data from MYDATAMART,now 4 months its
> > impossible to download to excel at the central level!!!! what will
> happen in
> > upcoming months?
> >
> >
> >
> >
> >
> >
> > ________________________________
> >
> >
> > Muhire Andrew
> > HMIS/Ministry of Health
> > First say to yourself what you would be; and then do what you have to
> do..
> >
> >
> >
> > _______________________________________________
> > Mailing list: https://launchpad.net/~dhis2-users
> > Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
> > Unsubscribe : https://launchpad.net/~dhis2-users
> > More help   : https://help.launchpad.net/ListHelp
> >
>

Follow ups

References