dhis2-users team mailing list archive

Re: [Dhis2-devs] Creation of CategoryOptionCombinations

 

Hi Jason,

I am importing aggregate data into data sets (see my reply to Lars yesterday
evening: https://lists.launchpad.net/dhis2-users/msg10452.html)

Again: the problem is not the import but the combination of category options.
It would probably already help a lot if those bombastic strings were not
generated as names for the categoryOptionCombinations.
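
To make the growth concrete, here is a minimal sketch of how the number of
combinations and the length of the concatenated names scale. The categories,
the option names and the ", " separator are invented for illustration; they are
not taken from the data model discussed here or from the DHIS2 code.

```python
from itertools import product
from math import prod

# Hypothetical categories and options, invented for this example.
categories = {
    "Sex": ["Female", "Male"],
    "Age group": ["0-4 years", "5-14 years", "15-49 years", "50+ years"],
    "Residence": ["Urban", "Rural", "Unknown"],
}


def combination_count(cats):
    """The number of combinations is the product of the option counts."""
    return prod(len(options) for options in cats.values())


def combination_names(cats, separator=", "):
    """Lazily yield one concatenated name per combination.

    The separator is an assumption made for this sketch.
    """
    for combo in product(*cats.values()):
        yield separator.join(combo)


print(combination_count(categories))  # 2 * 4 * 3 = 24
# Longest generated name; it grows with every category that is added.
print(max(len(name) for name in combination_names(categories)))
```

Every additional category multiplies the count by its number of options and
adds one more option name to every generated string, which is why both the
record count and the name length get out of hand quickly.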

Thanks for the good ideas,

Uwe

---
> Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote on 8 June 2016 at 09:09:
> 
> 
> Hi Uwe,
> 
> Are you importing this as aggregate data or as events?
> 
> Regards,
> Jason
> 
> 
> On Wed, Jun 8, 2016 at 2:27 AM, Morten Olav Hansen <morten@xxxxxxxxx> wrote:
> 
> >> Just to make sure we are talking about the same thing: the problem does
> >> not appear during import, but when generating all possible combinations
> >> (when saving the CategoryCombination or when manually invoking the update
> >> of categoryOptionCombinations).
> >>
> >
> > Ah, sorry.. I was thinking it was the import that was slow.. so that part
> > is ok?
> >
> >
> >> so I can still use /api/metadata without version to call the current
> >> api-version?
> >>
> >
> > That will give you the legacy importer, so going forward you would need to
> > use /api/{version}/{endpoint}. We will have more info about it in the
> > release notes.
> >
> > And no, the UI is not switched to the new importer yet (in 2.24), not 100%
> > sure it will...
> >
> >
> >>
> >> Thanks for your replies at this time of the day :-)
> >>
> >> Regards, Uwe
> >>
> >> ---
> >>
> >>
> >> > Morten Olav Hansen <morten@xxxxxxxxx> wrote on 7 June 2016 at 19:28:
> >> >
> >> >
> >> > Hi Uwe
> >> >
> >> > The improvements are mainly for speed and validation. Yes, we are now
> >> > (in 2.24) introducing a versioned web API, so that importer endpoint
> >> > will be available until 2.26 (we will support 3 versions). In 2.24, the
> >> > same endpoint is available at /api/24/metadata.
> >> >
> >> > If you are using cURL or another utility, the import part would be the
> >> > same, but the UI in 2.23 cannot be used, as it is hardcoded to the
> >> > legacy importer.
> >> >
> >> > --
> >> > Morten Olav Hansen
> >> > Senior Engineer, DHIS 2
> >> > University of Oslo
> >> > http://www.dhis2.org
> >> >
> >> > On Tue, Jun 7, 2016 at 11:25 PM, Uwe Wahser <uwe@xxxxxxxxx> wrote:
> >> >
> >> > > Hi Morten,
> >> > >
> >> > > No, I didn't. What would be the procedure for that? Importing
> >> > > Categories, Options and CategoryCombinations via the API and having
> >> > > DHIS2 generate the CategoryOptionCombinations? Would that bring about
> >> > > any change at all, or does the importer use different libs for
> >> > > generating the COCs?
> >> > >
> >> > > Btw, is the 23 in the API link valid for future DHIS2 versions? I
> >> > > noticed it in a few API descriptions recently ...
> >> > >
> >> > > Regards, Uwe
> >> > >
> >> > > > Morten Olav Hansen <morten@xxxxxxxxx> wrote on 7 June 2016 at 18:50:
> >> > > >
> >> > > >
> >> > > > Hi Uwe
> >> > > >
> >> > > > Did you try out the new importer? Available as /api/23/metadata in 2.23
> >> > > >
> >> > > > On Tuesday, 7 June 2016, Uwe Wahser <uwe@xxxxxxxxx> wrote:
> >> > > >
> >> > > > > Dear devs,
> >> > > > >
> >> > > > > I am experiencing problems when handling category combinations. Our
> >> > > > > prototype with 5 dimensions went through the process of generating
> >> > > > > categoryOptionCombinations (~20,000 records) quite well. 7
> >> > > > > dimensions (~400,000) worked as well, although it took a very long
> >> > > > > time.
> >> > > > >
> >> > > > > Now we defined the next data model with 10 dimensions (expecting
> >> > > > > ~5 million categoryOptionCombinations) and the process dies without
> >> > > > > further notice. Last words in catalina.out:
> >> > > > > * INFO  2016-06-07 13:29:33,783 Building object-bridge maps (preheatCache: true, 3 classes). (DefaultObjectBridge.java [http-bio-8180-exec-15])
> >> > > > > * INFO  2016-06-07 13:29:36,779 Building object-bridge maps took 2.99 seconds. (DefaultObjectBridge.java [http-bio-8180-exec-15])
> >> > > > > * INFO  2016-06-07 13:29:36,896 'admin' update org.hisp.dhis.dataelement.DataElementCategoryCombo, name: Membership, uid: SCgLXYHqVzz (AuditLogUtil.java [http-bio-8180-exec-15])
> >> > > > >
> >> > > > > Ten dimensions with not extraordinarily big option sets is actually
> >> > > > > not unusual and rather slim for multi-dimensional data models in
> >> > > > > data warehouses, so I'd expect DHIS2 to be able to handle this
> >> > > > > easily.
> >> > > > >
> >> > > > > It could of course be a memory problem (I tried up to 14g for Tomcat
> >> > > > > on a 4-core Ubuntu 14.04 server, DHIS 2.23). Before I start
> >> > > > > experimenting with other parameters, I am hoping to get some hints
> >> > > > > on known limitations or workarounds from you (not allowed: reducing
> >> > > > > the number of options or categories, SQL hacks :-) ). Is there any
> >> > > > > info on whether optimizations of this process are being planned in
> >> > > > > the kernel?
> >> > > > >
> >> > > > > Some observations on the process:
> >> > > > >
> >> > > > > * during generation (either when saving the categoryCombination or
> >> > > > > in the data maintenance menu):
> >> > > > > - long names - cOCs are generated with names that become extremely
> >> > > > > long, as they are mere concatenations of the involved
> >> > > > > categoryOptions. Could there be an option to just use the codes as
> >> > > > > a basis or to leave out the names completely? This could be one
> >> > > > > reason for a memory problem and performance issues.
> >> > > > > - long log entries - every single entry is logged in catalina.out
> >> > > > > with several lines of text, causing catalina.out to become
> >> > > > > extremely big.
> >> > > > > - during execution a lot of Java memory is used and no DB memory,
> >> > > > > which looks to me as if all the logic is happening in the Java
> >> > > > > machine. It might be more useful to move more logic into SQL on
> >> > > > > the DB (e.g. use DB cross joins for combining options), as the DB
> >> > > > > will be more efficient.
> >> > > > > - because of the log entries I assume that every single combination
> >> > > > > is being persisted into the DB with a single SQL statement, causing
> >> > > > > millions of single SQL requests. Prefer batch SQL instead of
> >> > > > > single-record processing.
> >> > > > >
> >> > > > > * during import/export of categoryOptionCombinations:
> >> > > > > - prefer batch SQL instead of single-record processing
> >> > > > > - huge log entries in catalina.out due to several lines of text per
> >> > > > > combination
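
The cross-join and batch suggestions in the list above can be sketched roughly
as follows. This is only an illustration of the idea, not how DHIS2 generates
or stores categoryOptionCombinations: the table and column names are invented,
and an in-memory SQLite database stands in for the real one.

```python
import sqlite3

# In-memory SQLite as a stand-in database; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE category_option (category TEXT, code TEXT)")
cur.execute("CREATE TABLE option_combo (sex TEXT, age TEXT, residence TEXT)")

# Invented option codes for three invented categories.
options = [
    ("sex", "F"), ("sex", "M"),
    ("age", "A1"), ("age", "A2"), ("age", "A3"),
    ("residence", "U"), ("residence", "R"),
]

# Batch insert: one prepared statement executed for many rows,
# instead of building and sending one statement per record.
cur.executemany("INSERT INTO category_option VALUES (?, ?)", options)

# Let the database build the combinations with cross joins and
# persist all of them in a single INSERT ... SELECT statement.
cur.execute(
    """
    INSERT INTO option_combo (sex, age, residence)
    SELECT s.code, a.code, r.code
    FROM category_option AS s
    CROSS JOIN category_option AS a
    CROSS JOIN category_option AS r
    WHERE s.category = 'sex'
      AND a.category = 'age'
      AND r.category = 'residence'
    """
)
conn.commit()

combo_count = cur.execute("SELECT COUNT(*) FROM option_combo").fetchone()[0]
print(combo_count)  # 2 * 3 * 2 = 12
```

With this approach the combinations never have to be materialised in
application memory; whether that is actually faster for a given database and
data model would need to be measured.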
> >> > > > >
> >> > > > > I'd be very happy about comments.
> >> > > > >
> >> > > > > Thanks in advance,
> >> > > > >
> >> > > > > Uwe
> >> > > > >
> >> > > > > _______________________________________________
> >> > > > > Mailing list: https://launchpad.net/~dhis2-users
> >> > > > > Post to     : dhis2-users@xxxxxxxxxxxxxxxxxxx
> >> > > > > Unsubscribe : https://launchpad.net/~dhis2-users
> >> > > > > More help   : https://help.launchpad.net/ListHelp
> >> > > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Morten Olav Hansen
> >> > > > Senior Engineer, DHIS 2
> >> > > > University of Oslo
> >> > > > http://www.dhis2.org
> >> > >
> >>
> >
> >
> > _______________________________________________
> > Mailing list: https://launchpad.net/~dhis2-devs
> > Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> > Unsubscribe : https://launchpad.net/~dhis2-devs
> > More help   : https://help.launchpad.net/ListHelp
> >
> >
> 
> 
> -- 
> Jason P. Pickering
> email: jason.p.pickering@xxxxxxxxx
> tel:+46764147049

