← Back to team overview

dhis2-devs team mailing list archive

Re: Export failed with cannot instantiate Source (was: Re: Re: Pivot tables)

 

This SQL will do the trick:

delete from source where sourceid not in ( select organisationunitid from
organisationunit );

On Fri, Oct 22, 2010 at 3:00 PM, Ola Hodne Titlestad <olati@xxxxxxxxxx>wrote:

> Damiola,
>
> Please use a different subject when your email it is not related to the
> thread.
> That makes it easier for all of us to read emails and browse the list, and
> to respond to requests.
>
> Just compose a new email to dhis2-devs@xxxxxxxxxxxxxxxxxxx and it will go
> to the list.
> I would also recommend using the Bug report on launchpad when reporting
> bugs.
>
> Regarding your exception it seems the orgunit and source tables are out of
> synch which might happen if you manually delete or create orgunits directly
> in the database (which is not recommended). Please make sure that the all
> the IDs in Source and Organisationunit are exactly the same and try again.
> If it still doesn't work, it is something wrong and we need to create a bug
> report.
>
>
> ----------------------------------
> Ola Hodne Titlestad (Mr)
> HISP
> Department of Informatics
> University of Oslo
>
> Mobile: +47 48069736
> Home address: Vetlandsvn. 95B, 0685 Oslo, Norway. Googlemaps link<http://maps.google.com/maps?f=q&source=s_q&hl=en&geocode=&q=Vetlandsvn.+95B,+0685+Oslo,+Norway>
>
>
> 2010/10/22 <damilola@xxxxxxxxxxxxxxxxxxx>
>
>> Hi Bob,
>> Export (ALL) failed with the error below.  This began on upgrade from
>> previous to dhis2.0.5.
>>
>> org.hibernate.InstantiationException: Cannot instantiate abstract class or
>> interface: org.hisp.dhis.source.Source
>>     at
>> org.hibernate.tuple.PojoInstantiator.instantiate(PojoInstantiator.java:101)
>>     at
>> org.hibernate.tuple.PojoInstantiator.instantiate(PojoInstantiator.java:123)
>>     at
>> org.hibernate.tuple.entity.AbstractEntityTuplizer.instantiate(AbstractEntityTuplizer.java:606)
>>     at
>> org.hibernate.persister.entity.AbstractEntityPersister.instantiate(AbstractEntityPersister.java:3888)
>>     at org.hibernate.impl.SessionImpl.instantiate(SessionImpl.java:1412)
>>     at org.hibernate.impl.SessionImpl.instantiate(SessionImpl.java:1401)
>>     at org.hibernate.loader.Loader.instanceNotYetLoaded(Loader.java:1342)
>>     at org.hibernate.loader.Loader.getRow(Loader.java:1251)
>>     at org.hibernate.loader.Loader.getRowFromResultSet(Loader.java:619)
>>     at org.hibernate.loader.Loader.doQuery(Loader.java:745)
>>     at
>> org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:270)
>>     at org.hibernate.loader.Loader.doList(Loader.java:2294)
>>     at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2172)
>>     at org.hibernate.loader.Loader.list(Loader.java:2167)
>>     at
>> org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:119)
>>     at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1706)
>>     at org.hibernate.impl.CriteriaImpl.list(CriteriaImpl.java:347)
>>     at
>> org.hisp.dhis.source.hibernate.HibernateSourceStore.getAllSources(HibernateSourceStore.java:98)
>>     at
>> org.hisp.dhis.organisationunit.DefaultOrganisationUnitService.getAllOrganisationUnits(DefaultOrganisationUnitService.java:164)
>>     at
>> org.hisp.dhis.organisationunit.DefaultOrganisationUnitService.getOrganisationUnits(DefaultOrganisationUnitService.java:169)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>>     at java.lang.reflect.Method.invoke(Unknown Source)
>>     at
>> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:309)
>>     at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>>     at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>>     at
>> org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110)
>>     at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>>     at
>> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:89)
>>     at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>>     at
>> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
>>     at $Proxy12.getOrganisationUnits(Unknown Source)
>>     at
>> org.hisp.dhis.importexport.dxf.converter.OrganisationUnitConverter.write(OrganisationUnitConverter.java:115)
>>     at
>> org.hisp.dhis.importexport.ExportPipeThread.doRun(ExportPipeThread.java:164)
>>     at
>> org.hisp.dhis.system.process.OpenSessionThread.run(OpenSessionThread.java:58)
>> Exception in thread "Thread-21" java.lang.RuntimeException: Writing
>> failedorg.hibernate.InstantiationException: Cannot instantiate abstract
>> class or interface: org.hisp.dhis.source.Source
>>     at
>> org.hisp.dhis.importexport.ExportPipeThread.doRun(ExportPipeThread.java:188)
>>     at
>> org.hisp.dhis.system.process.OpenSessionThread.run(OpenSessionThread.java:58)
>>
>>
>>
>> On Fri 22/10/10 1:10 PM , Ola Hodne Titlestad olati@xxxxxxxxxx sent:
>>
>> 2010/10/22 Lars Helge Øverland <larshelge@xxxxxxxxx>
>>
>>>
>>> Hi Bob
>>>
>>> Sorry for not returning to this earlier, been battling with windows 64
>>> bit odbc drivers.
>>>
>>> 2010/10/22 Bob Jolliffe <bobjolliffe@xxxxxxxxx>
>>>
>>> Hi Lars, Ola and all
>>>>
>>>> Thinking about exporting pivot tables .. we need to pen up some
>>>> requirements which we can get into a blueprint.  While we are
>>>> considering that, I've given some thoughts below.
>>>>
>>>> My initial thought was that we would construct a pivot table with a
>>>> jdbc connection like we would normally do manually, and then simply
>>>> populate the pivot cache so that the spreadsheet would be
>>>> "free-standing".  On reflection that is probably neither the best nor
>>>> the simplest thing to do.  Playing around with excel (in wine :-)  I
>>>> figured it is actually much more straightforward to have two sheets;
>>>> one for the data and one for the pivot table.  The pivot table would
>>>> operate on the data in the other worksheet.  This is more explicit
>>>> than simply hiding the data in the pivot cache.  Though I need to test
>>>> this a bit more.  If the pivotcache is an obligatory data store then
>>>> it might not make sense to carry the data twice.  The pivot table
>>>> constructs are fairly complex but not outlandishly so - once I'm done
>>>> unpicking, I will write up a short description of the anatomy of a
>>>> pivot table so we can see what needs to be done.
>>>>
>>>>
>>> This sounds just fine to me.
>>>
>>
>> This might work, just have in mind that we will then need several pairs of
>> data+pivot worksheets as we want to have multiple pivot tables in an Excel
>> file.
>> Also have in mind that we can talk about up to around 1.500.000 rows
>> (that's when my laptop starts to complain at least) of data here, so these
>> data sheets will be large. Typically, number of rows of data will be
>> somewhere around 200-600K.
>>
>>
>>
>>
>>>
>>>
>>>> One possibility would be that the pivot table could be generated as
>>>> part of the export of a report table ie. optionally export to excel or
>>>> export to excel-with-pivot.
>>>>
>>>> Given that there is no advance indication of the columns in a report
>>>> table, a challenge will be how and when to define the pivot model -
>>>> ie. the pivot fields -  and how to persist that model (we don't want
>>>> users to have to recreate the pivot model each time).  This will
>>>> probably require an extra object in our data model (PivotModel) which
>>>> defines the row and column pivot fields and data field, where a
>>>> ReportTable can have a 0-* relationship with 0 or more PivotModels.  A
>>>> possible beneficial side effect of this could be that we also leave
>>>> open the ability to generate the pivot table rendition with things
>>>> other than excel.  I don't see this as an immediate requirement but
>>>> useful to have the pivotmodel abstraction anyway.  In reality the
>>>> design of the pivotmodel will be based on what excel requires.
>>>>
>>>> Generating the excel spreadsheet off the reporttable+pivotmodel will
>>>> produce an OOXML xlsx file - actually a bundle of xml streams which
>>>> need to be zipped.  Its a bit unfortunate that M$ defines a flat file
>>>> single xml for other OPC office documents, but sadly not excel.
>>>> Dealing with a single flat file is much easier than zip containers
>>>> with multiple streams but so be it.  At least they end up smaller.
>>>> Though given that these reporttables can be very large I don't think
>>>> an xslt approach is really the most efficient at least for producing
>>>> the data worksheet.  It might make sense if the source data was xml,
>>>> but even then probably not.  So simple iteration through the table
>>>> rows with a stax writer will work best.  The second sheet (the pivot
>>>> sheet) would just be a serialization of the pivot model.
>>>>
>>>> We will probably have to implement the excel optimization of
>>>> sharedstrings (where literal strings are not written into the
>>>> spreadsheet - just index values from a table).  This adds complexity
>>>> but I think we are talking pretty large files here.  If we don't do
>>>> the sharedstrings optimization, Excel will do it anyway the first time
>>>> it opens and saves the file, but I suspect we will have to do it up
>>>> front.
>>>>
>>>> Regarding UI and PivotModel I want to take a look at our existing work
>>>> on web pivot tables to see whether we can't leverage some of this.  If
>>>> we follow roughly the outline above the user would have the option to
>>>> define 0 or more pivottable definitions at the time of creating, or
>>>> editing, a reporttable.  What do you think?  Is the above close to
>>>> what you guys have in mind.  Lets hammer out some requirements and
>>>> create the blueprint.
>>>>
>>>>
>>> We won't base this on report tables as (at least I) can't see any huge
>>> benefits. The excel pivot tables themselves are more capable of having
>>> dimensions on columns, rows, filters than our report tables. In fact we
>>> usually use a SQL view as datasource for the pivot tables - which has a
>>> fixed number of columns. The view is based on the aggregateddatavalue table
>>> and joins in the dataelement, orgunit, period and periodtype tables. In
>>> addition we join in a series of resource tables to get information about the
>>> orgunit structure, dataelement groupset structure, orgunit groupset
>>> structure (additional dimensional information!) to make the pivot table more
>>> valuable for analysis. (This also goes for indicators.)
>>>
>>> Since this should be more of a offline analysis tool I think exporting
>>> all dataelements/indicators will be most appropriate. The user interface
>>> could simply request a start and endate (and/or relative periods),
>>> indicators vs dataelements, parent organisation unit and organisation unit
>>> level. Ola will maybe have some views here...
>>>
>>> Agree with Lars, I don't think report tables should be used as source for
>> pivot tables, at least not for the standard "ALL data" pivot tables. For
>> more ad-hoc use the pivots generated off a report table might be useful as
>> an export option, but let's start with the most common use cases, the big
>> pivot tables with all the data for browsing and data analysis.
>>
>> The pivot source views (sql) that we have used e.g. in the newly
>> distributed pivots for the demo database can be found in an email to the
>> list on Sep 21, together with the pivots and the database dumps:
>> http://www.mail-archive.com/dhis2-devs@xxxxxxxxxxxxxxxxxxx/msg07341.html (from
>> Sep 21)
>>
>> These views have been stable for a long time and describe the fields of
>> the basic pivot tables that are needed for DHIS data analysis.
>> These set of fields go back as far as DHIS 1.3, so the basic pivot tables
>> are stable and we know their structure, basically the core data dimensions
>> of DHIS2 (see chapter 2 in user manual).
>>
>> What always will be variables are the orgunit level of aggregation for the
>> data or indicator values and the period type. You will see these two are
>> specified at the end of every where clause in the pivot views. These are
>> used as a filters when fetching data from the data mart.
>>
>> If you open the pivot table for download in that email you can follow my
>> reasoning below:
>> Typically we have two tables for monthly (raw data+indicator data) for the
>> lowest level plus at least two more for a higher level. E.g. in the demo
>> based on SL we use the health facility level (level 4) and the Chiefdom
>> level (3). The reason for using Chiefdom level as well is that these tables
>> are much faster to manipulate since there are a lot less data when shaving
>> away the lowest level. How many levels and which levels to put in the pivot
>> table will vary from place to place. In addition to these at least four
>> tables,  there is often at least one table for population data, which has a
>> yearly period type and therefore needs a separate view and table. So for the
>> demo database we have 5 basic tables which covers all the data for data
>> elements and indicators. This set of pivot tables is what I would define as
>> a minimum pivot table setup for a DHIS database. Over time (depending on the
>> amount of data collected) it might be necessary to split up these basic
>> tables by year since they can get too big for excel to handle. Such a split
>> can be facilitated by a simple from and to filter on the start date in the
>> period field in the sql.
>>
>> Then there can be many more customised pivot tables which make use of
>> additional dimensions to the data like the data element categories, and data
>> element and orgunit group sets. The simplest approach there is to (still
>> talking sql) do a select * from the various resource tables. Then you will
>> have a data source for all other customised tables (using the same period
>> type and aggregation level). Working with data element categories in pivot
>> tables it doesn't make sense to use too many at the same time, and rather
>> work on data elements that share the same categories (within or across
>> datasets). You can see in the demo pivot table file that we have set up
>> multiple tables making use of data element categories (EPI, RCH, HIV etc.).
>> These are all using the same source data (chiefdom level monthly raw data),
>> but are using different pivot fields (columns in the result set). The total
>> number of available fields are the same for all these custom chiefdom
>> tables, but we are only using a small subset of these fields in each table.
>> This means that the same data source could also be used for the basic tables
>> (the two chiefdom tables at least), and just select even fewer fields in the
>> table. The users can also easily add more fields to a table in Excel, using
>> drag and drop functionality.
>>
>> It seems to me that there is a need to separate between the source data
>> and the pivot structure, to reuse the source data in multiple tables. I
>> think that is how Excel is thinking too, but it doesn't always seem to work
>> that way, e.g. when creating a copy of an existing worksheet with a pivot
>>  table.
>> I am not sure how much of this customisation we need to support inside
>> DHIS2 and how much we can leave to Excel (which has nice tools for copying
>> tables, drag and drop pivot fields etc.). First we need to come up with
>> something that works for the basic tables and then look at these more custom
>> use cases.
>>
>> When it comes to pivot tables using period types other than monthly or
>> yearly (both covered in the basic tables listed above), like six-monthly,
>> quarterly or weekly these are also important to support. The pivot fields
>> are the same as in the basic tables, it is just a different periodtype
>> filter on the data source, so these might be easier to accommodate.
>>
>> Looking at the next steps, to me the key new features that I would like to
>> see with regards to pivot tables are:
>>
>> 1) a pivot table update service, where users that are on the online
>> server (no local install) can update their local pivot tables against the
>> online DHIS server
>> Since a pivot table can contain months and years of data we do not want to
>> download the full tables every month, but simply append the latest month of
>> data to the existing local pivot tables
>>
>>  2) a pivot generator, where the users can download a new pivot table
>> file directly off a DHIS2 server without having to do any local
>> configuration etc. , and then later use the update service to update their
>> tables
>>
>> with 1) being the most critical and urgent, but I realise that we might
>> need 2) first, in order to deal with a standard set of tables and pivot data
>> sources in 1).
>>
>> Ola
>> ---------
>>
>>
>>> When it comes to SQL views we have recently implemented a function for
>>> this where SQL views can be persisted as an application object and
>>> re-generated at any time. This means we can compile in a few default SQL
>>> views which can be used for this in DHIS. As you say an iteration over the
>>> rows in this view with eg. a stax writer would work.
>>>
>>> Ola/I have made some samples.
>>>
>>> Demo database with SQL views:
>>> http://folk.uio.no/larshelg/files/dhis2sl.backup
>>> Small pivot table based on the aggregateddatavalue orgunit level 2 view:
>>> http://folk.uio.no/larshelg/files/dataou2.xlsx
>>> Bigger pivot table based on the aggregateddatavalue orgunit level 3
>>> view: http://folk.uio.no/larshelg/files/dataou3.xlsx
>>>
>>> If you want to create more pivot tables install postgres on windows
>>> here: http://www.postgresql.org/download/
>>> Then install psql odbc windows here:
>>> http://www.postgresql.org/ftp/odbc/versions/msi/
>>> You can create a data source directly from Excel by going to Data - From
>>> other sources - From Microsoft Query and choosing the postgres odbc driver
>>> under 2. (Stay clear of x64! :)
>>>
>>> Your thoughts on shared string optimizations and zipped stream bundles
>>> sound fine to me, in any case you know better here:)
>>>
>>> regards, Lars
>>>
>>>
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
>>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
>>> More help   : https://help.launchpad.net/ListHelp
>>>
>>>
>>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-devs
> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-devs
> More help   : https://help.launchpad.net/ListHelp
>
>

References