dhis2-users team mailing list archive

Re: Creation of CategoryOptionCombinations

 

Hi Uwe

Did you try out the new importer? It is available at /api/23/metadata in 2.23.

On Tuesday, 7 June 2016, Uwe Wahser <uwe@xxxxxxxxx> wrote:

> Dear devs,
>
> I am experiencing problems when handling category combinations. Our
> prototype with 5 dimensions went through the process of generating
> categoryOptionCombinations (~20,000 records) quite well. 7 dimensions
> (~400,000) worked as well, although it took a very long time.
>
> Now we have defined the next data model with 10 dimensions (expecting
> ~5 million categoryOptionCombinations) and the process dies without
> further notice. The last words in catalina.out:
> * INFO  2016-06-07 13:29:33,783 Building object-bridge maps (preheatCache: true, 3 classes). (DefaultObjectBridge.java [http-bio-8180-exec-15])
> * INFO  2016-06-07 13:29:36,779 Building object-bridge maps took 2.99 seconds. (DefaultObjectBridge.java [http-bio-8180-exec-15])
> * INFO  2016-06-07 13:29:36,896 'admin' update org.hisp.dhis.dataelement.DataElementCategoryCombo, name: Membership, uid: SCgLXYHqVzz (AuditLogUtil.java [http-bio-8180-exec-15])
>
> Ten dimensions with not extraordinarily big option sets is actually not
> unusual, and rather slim, for multi-dimensional data models in data
> warehouses, so I'd expect DHIS2 to be able to handle this easily.
>
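As a rough sketch of why the numbers above grow so quickly: the number of categoryOptionCombinations is the product of the option counts of the categories in the combination. The option counts below are hypothetical, chosen only to reproduce the orders of magnitude quoted above; they are not the actual data model, and the function names are made up for illustration.

```python
import math
from itertools import islice, product


def combo_count(option_counts):
    """Number of combinations = product of the per-category option counts."""
    return math.prod(option_counts)


def iter_combos(categories):
    """Lazily yield one tuple of options per combination (never a full list)."""
    return product(*categories)


# Hypothetical option counts per category (assumption, not the real model).
five = [4, 5, 10, 10, 10]
seven = five + [4, 5]
ten = seven + [2, 2, 3]

print(combo_count(five))   # 20000
print(combo_count(seven))  # 400000
print(combo_count(ten))    # 4800000

# Peek at the first few combinations of a tiny example without building them all.
print(list(islice(iter_combos([["a", "b"], ["x", "y", "z"]]), 3)))
```

Even small additional categories multiply the total, so going from 7 to 10 dimensions can mean an order of magnitude more records, which is why generating them lazily or inside the database matters at this scale.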
> This could of course be a memory problem (I tried up to 14 GB for Tomcat
> on a 4-core Ubuntu 14.04 server, DHIS 2.23). Before I start experimenting
> with other parameters, I am hoping to get some hints from you on known
> limitations or workarounds (not allowed: reducing the number of options
> or categories, SQL hacks :-) ). Is there any info on whether
> optimizations of this process are being planned in the kernel?
>
> Some observations on the process:
>
> * during generation (either when saving the categoryCombination or in the
> data maintenance menu):
> - long names: cOCs are generated with names that become extremely long,
> as they are mere concatenations of the involved categoryOptions. Could
> there be an option to use just the codes as the basis, or to leave out
> the names completely? This could be one reason for the memory and
> performance issues.
> - long log entries: every single entry is logged in catalina.out with
> several lines of text, causing catalina.out to become extremely big.
> - during execution a lot of Java memory is used and no DB memory, which
> looks to me as if all the logic is happening in the Java virtual machine.
> It might be more useful to move more of the logic into SQL in the DB
> (e.g. use DB cross joins for combining options), as the DB will be more
> efficient.
> - because of the log entries I assume that every single combination is
> persisted into the DB with its own SQL statement, causing millions of
> single SQL requests. Batch SQL would be preferable to single-record
> processing.
>
> * during import/export of categoryOptionCombinations:
> - prefer batch SQL instead of single-record processing
> - huge log entries in catalina.out due to several lines of text per
> combination
>
> I'd be very happy about comments.
>
> Thanks in advance,
>
> Uwe
>


-- 
Morten Olav Hansen
Senior Engineer, DHIS 2
University of Oslo
http://www.dhis2.org
