
dhis2-users team mailing list archive

Re: [Dhis2-devs] Creation of CategoryOptionCombinations

 

Hi Uwe,

I agree that ten dimensions is not much per se, but 10 categories for
disaggregation per data element is arguably a lot. Would it be possible to
redesign the model a bit and rely more on data element group sets and
groups to classify your data elements, as opposed to having everything as
categories / options?

Five million option combos will, I think, take some time to generate and
maintain in any case. If you are willing to share some more info on your
use case, perhaps someone can offer some views.

regards,

Lars


On Tue, Jun 7, 2016 at 12:28 PM, Morten Olav Hansen <morten@xxxxxxxxx>
wrote:

> Hi Uwe
>
> The improvements are mainly for speed and validation. Yes, we are now (in
> 2.24) introducing a versioned web API, so the importer at that endpoint
> will be available until 2.26 (we will support 3 versions). In 2.24, the
> same endpoint is available at /api/24/metadata.
>
> If you are using cURL or another utility, the import part would be the
> same, but the UI in 2.23 cannot be used, as it's hardcoded to the legacy
> importer.
>
> --
> Morten Olav Hansen
> Senior Engineer, DHIS 2
> University of Oslo
> http://www.dhis2.org
>
> On Tue, Jun 7, 2016 at 11:25 PM, Uwe Wahser <uwe@xxxxxxxxx> wrote:
>
>> Hi Morten,
>>
>> No, I didn't. What would be the procedure for that? Importing Categories,
>> Options and CategoryCombinations via the API and having DHIS2 generate the
>> CategoryOptionCombinations? Would that bring about any change at all, or
>> does the importer use different libs for generating the COCs?
>>
>> Btw, is the 23 in the API link valid for future DHIS2 versions? I noticed
>> it in a few API descriptions recently ...
>>
>> Regards, Uwe
>>
>> > On 7 June 2016 at 18:50, Morten Olav Hansen <morten@xxxxxxxxx> wrote:
>> >
>> >
>> > Hi Uwe
>> >
>> > Did you try out the new importer? It is available as /api/23/metadata
>> > in 2.23.
>> >
>> > On Tuesday, 7 June 2016, Uwe Wahser <uwe@xxxxxxxxx> wrote:
>> >
>> > > Dear devs,
>> > >
>> > > I am experiencing problems when handling category combinations. Our
>> > > prototype with 5 dimensions went through the process of generating
>> > > categoryOptionCombinations (~20,000 records) quite well. 7 dimensions
>> > > (~400,000) worked as well, although it took a very long time.
>> > >
>> > > Now we defined the next data model with 10 dimensions (expecting ~5
>> > > million categoryOptionCombinations) and the process dies without
>> > > further notice. Last words in catalina.out:
>> > > * INFO  2016-06-07 13:29:33,783 Building object-bridge maps (preheatCache: true, 3 classes). (DefaultObjectBridge.java [http-bio-8180-exec-15])
>> > > * INFO  2016-06-07 13:29:36,779 Building object-bridge maps took 2.99 seconds. (DefaultObjectBridge.java [http-bio-8180-exec-15])
>> > > * INFO  2016-06-07 13:29:36,896 'admin' update org.hisp.dhis.dataelement.DataElementCategoryCombo, name: Membership, uid: SCgLXYHqVzz (AuditLogUtil.java [http-bio-8180-exec-15])
>> > >
>> > > Ten dimensions with not extraordinarily big option sets is actually not
>> > > unusual and rather slim for multi-dimensional data-models in data
>> > > warehouses, so I'd expect DHIS2 to be able to handle this easily.
>> > >
>> > > It could of course be a memory problem (I tried up to 14g for Tomcat on
>> > > a 4-core Ubuntu 14.04 server, DHIS 2.23). Before I start experimenting
>> > > with other parameters, I am hoping to get some hints on known
>> > > limitations or workarounds from you (not allowed: reducing the number
>> > > of options or categories, SQL hacks :-) ). Is there any info on whether
>> > > optimizations of this process are being planned in the kernel?
>> > >
>> > > Some observations on the process:
>> > >
>> > > * during generation (either when saving the categoryCombination or in
>> > >   the data maintenance menu):
>> > > - long names - cOCs are generated with names that get extremely long,
>> > >   as they are mere concatenations of the involved categoryOptions.
>> > >   Could there be an option to just use the codes as a basis, or to
>> > >   leave out the names completely? This could be one reason for a
>> > >   memory problem and performance issues.
>> > > - long log entries - every single entry is logged in catalina.out with
>> > >   several lines of text, causing catalina.out to become extremely big.
>> > > - during execution a lot of Java memory is used and no DB memory,
>> > >   which looks to me as if all the logic is happening in the Java
>> > >   machine. It might be more useful to move more logic into SQL on the
>> > >   DB (e.g. use DB cross joins for combining options), as the DB will
>> > >   be more efficient.
>> > > - because of the log entries I assume that every single combination is
>> > >   persisted into the DB with a single SQL statement, causing millions
>> > >   of single SQL requests. Prefer batch SQL instead of single-record
>> > >   processing.
>> > >
>> > > * during import/export of categoryOptionCombinations:
>> > > - prefer batch SQL instead of single-record processing
>> > > - huge log entries in catalina.out due to several lines of text per
>> > >   combination
>> > >
>> > > I'd be very happy about comments.
>> > >
>> > > Thanks in advance,
>> > >
>> > > Uwe
>> > >
>> > > _______________________________________________
>> > > Mailing list: https://launchpad.net/~dhis2-users
>> > > Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
>> > > Unsubscribe : https://launchpad.net/~dhis2-users
>> > > More help   : https://help.launchpad.net/ListHelp
>> > >
>> >
>> >
>> > --
>> > Morten Olav Hansen
>> > Senior Engineer, DHIS 2
>> > University of Oslo
>> > http://www.dhis2.org
>>
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-devs
> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-devs
> More help   : https://help.launchpad.net/ListHelp
>
>


-- 
Lars Helge Øverland
Lead developer, DHIS 2
University of Oslo
Skype: larshelgeoverland
lars@xxxxxxxxx
http://www.dhis2.org
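
For readers of the archive, the suggestion in Uwe's quoted message (let the
database combine the options with a cross join, and persist rows in batches
rather than one statement per record) can be sketched as follows. This is
only an illustration using Python's built-in sqlite3 module. The table and
column names are hypothetical and do not reflect the DHIS2 schema or how
DHIS2 actually generates option combos.

```python
import sqlite3

# Hypothetical schema for illustration only; not the DHIS2 data model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sex (name TEXT);
    CREATE TABLE age (name TEXT);
    CREATE TABLE residence (name TEXT);
    CREATE TABLE combo (sex TEXT, age TEXT, residence TEXT);
""")

# Batch insert: one executemany call per table instead of one call per row.
conn.executemany("INSERT INTO sex VALUES (?)", [("female",), ("male",)])
conn.executemany("INSERT INTO age VALUES (?)", [("0-4",), ("5-14",), ("15+",)])
conn.executemany("INSERT INTO residence VALUES (?)", [("urban",), ("rural",)])

# Set-based generation: the database builds every combination in a single
# statement, so no per-combination round trip from the application.
conn.execute("""
    INSERT INTO combo (sex, age, residence)
    SELECT sex.name, age.name, residence.name
    FROM sex CROSS JOIN age CROSS JOIN residence
""")
conn.commit()

combo_total = conn.execute("SELECT COUNT(*) FROM combo").fetchone()[0]
print(combo_total)
```

Whether this is faster than generating the combinations in the application
depends on the database and its configuration; the sketch only shows the
shape of the approach.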
