
dhis2-users team mailing list archive

Re: Creation of CategoryOptionCombinations

 

Hi Uwe

The improvements are mainly in speed and validation. Yes, starting with 2.24
we are introducing a versioned web API, so that endpoint will remain
available until 2.26 (we will support 3 versions). In 2.24, the same
endpoint is available at /api/24/metadata.

If you are using cURL or another utility, the import itself works the same
way, but the UI in 2.23 cannot be used, as it is hardcoded to the legacy
importer.
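
As a rough illustration of the "cURL or another utility" route, here is a minimal Python sketch that builds a POST request against the versioned metadata endpoint. Only the /api/23/metadata and /api/24/metadata paths come from this thread; the server URL, the credentials, the use of basic authentication, and the JSON payload shape are assumptions made for the example.

```python
import base64
import json
import urllib.request


def metadata_endpoint(base_url: str, api_version: int) -> str:
    """Return the versioned metadata import URL, e.g. .../api/23/metadata."""
    return f"{base_url.rstrip('/')}/api/{api_version}/metadata"


def build_import_request(base_url, api_version, payload, user, password):
    """Build (but do not send) a POST request carrying a JSON payload."""
    # Assumption: the server accepts HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        metadata_endpoint(base_url, api_version),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Hypothetical server, credentials and payload, for illustration only.
request = build_import_request(
    "https://dhis2.example.org", 23, {"categoryOptions": []}, "admin", "secret"
)
# urllib.request.urlopen(request) would send it; it is left out here so the
# sketch has no side effects.
```

Switching to the 2.24 endpoint would then only mean passing 24 as the version.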

-- 
Morten Olav Hansen
Senior Engineer, DHIS 2
University of Oslo
http://www.dhis2.org

On Tue, Jun 7, 2016 at 11:25 PM, Uwe Wahser <uwe@xxxxxxxxx> wrote:

> Hi Morten,
>
> No, I didn't. What would be the procedure for that? Importing Categories,
> Options and CategoryCombinations via the API and having DHIS2 generate the
> CategoryOptionCombinations? Would that bring about any change at all, or
> does the importer use different libs for generating the COCs?
>
> By the way, is the 23 in the API link valid for future DHIS2 versions? I
> noticed it in a few API descriptions recently ...
>
> Regards, Uwe
>
> > Morten Olav Hansen <morten@xxxxxxxxx> wrote on 7 June 2016 at 18:50:
> >
> >
> > Hi Uwe
> >
> > Did you try out the new importer? It is available as /api/23/metadata in 2.23.
> >
> > On Tuesday, 7 June 2016, Uwe Wahser <uwe@xxxxxxxxx> wrote:
> >
> > > Dear devs,
> > >
> > > I am experiencing problems when handling category combinations. Our
> > > prototype with 5 dimensions went through the process of generating
> > > categoryOptionCombinations (~20,000 records) quite well. 7 dimensions
> > > (~400,000) worked as well, although it took a very long time.
> > >
> > > Now we have defined the next data model with 10 dimensions (expecting
> > > ~5 million categoryOptionCombinations) and the process dies without
> > > further notice. Last words in catalina.out:
> > > * INFO  2016-06-07 13:29:33,783 Building object-bridge maps (preheatCache: true, 3 classes). (DefaultObjectBridge.java [http-bio-8180-exec-15])
> > > * INFO  2016-06-07 13:29:36,779 Building object-bridge maps took 2.99 seconds. (DefaultObjectBridge.java [http-bio-8180-exec-15])
> > > * INFO  2016-06-07 13:29:36,896 'admin' update org.hisp.dhis.dataelement.DataElementCategoryCombo, name: Membership, uid: SCgLXYHqVzz (AuditLogUtil.java [http-bio-8180-exec-15])
> > >
> > > Ten dimensions with not extraordinarily big option sets is actually not
> > > unusual and rather slim for multi-dimensional data models in data
> > > warehouses, so I'd expect DHIS2 to be able to handle this easily.
> > >
> > > It could of course be a memory problem (I tried up to 14g for Tomcat on a
> > > 4-core Ubuntu 14.04 server, DHIS 2.23). Before I start experimenting with
> > > other parameters, I am hoping to get some hints on known limitations or
> > > workarounds from you (not allowed: reducing the number of options or
> > > categories, SQL hacks :-) ). Is there any info on whether optimizations
> > > of this process are being planned in the kernel?
> > >
> > > Some observations on the process:
> > >
> > > * during generation (either when saving the categoryCombination or in
> > > the data maintenance menu):
> > > - long names - cOCs are generated with names that become extremely long,
> > > as they are mere concatenations of the involved categoryOptions. Could
> > > there be an option to just use the codes as the basis, or to leave out
> > > the names completely? This could be one reason for the memory problem
> > > and performance issues.
> > > - long log entries - every single entry is logged in catalina.out with
> > > several lines of text, causing catalina.out to become extremely big.
> > > - during execution a lot of Java memory is used and no DB memory, which
> > > looks to me as if all the logic is happening in the Java machine. It
> > > might be more useful to move more logic into SQL on the DB (e.g. use DB
> > > cross joins for combining options), as the DB will be more efficient.
> > > - because of the log entries I assume that every single combination is
> > > persisted into the DB with a single SQL statement, causing millions of
> > > single SQL requests. Prefer batch SQL over single-record processing.
> > >
> > > * during import/export of categoryOptionCombinations:
> > > - prefer batch SQL instead of single record processing
> > > - huge log entries in catalina.out due to several lines of text per
> > > combination
> > >
> > > I'd be very happy about comments.
> > >
> > > Thanks in advance,
> > >
> > > Uwe
> > >
> >
> >
> > --
> > Morten Olav Hansen
> > Senior Engineer, DHIS 2
> > University of Oslo
> > http://www.dhis2.org
>
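
To make the scale in the original question concrete: the number of categoryOptionCombinations is the product of the option counts of the categories involved, and the "batch instead of single record" suggestion amounts to generating that cross product lazily and handing it on in chunks. The Python sketch below shows both ideas under stated assumptions. The option counts and category values are invented for illustration (the thread gives only the approximate totals), and this is not how DHIS2 itself generates or persists the combinations.

```python
import itertools
import math


def count_combinations(option_counts):
    """Number of combinations = product of the option counts per category."""
    return math.prod(option_counts)


def iter_batches(categories, batch_size):
    """Yield lists of at most batch_size combinations, generated lazily.

    Only one batch is held in memory at a time, so a caller could persist
    each batch with a single bulk statement instead of one per row.
    """
    combos = itertools.product(*categories)
    while True:
        batch = list(itertools.islice(combos, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical option counts chosen to land near the totals in the thread.
five_dimensions = [5, 6, 7, 8, 12]
ten_dimensions = [4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
print(count_combinations(five_dimensions))  # 20160
print(count_combinations(ten_dimensions))   # 5000000

# Tiny hypothetical categories to show the batching.
categories = [["<5", "5+"], ["F", "M"], ["urban", "rural"]]
batch_sizes = [len(batch) for batch in iter_batches(categories, 3)]
print(batch_sizes)  # [3, 3, 2]
```

The point of the generator is that memory use depends on the batch size rather than on the total number of combinations.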
