
dhis2-devs-core team mailing list archive

Re: ADX data import proposal

 

Hi Bob,

As you say, this creates a hard limit on memory. All it will now take to
bring down a DHIS 2 instance is to submit a sufficiently large import
file. This seems likely to cause headaches for server admins ;) Can we
find a stream-based solution which scales well?

Lars


On Thu, Jun 18, 2015 at 2:49 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:

> WIP committed and slight adjustment of strategy ...
>
> I was not comfortable with creating a new thread just to pipe from adx to
> dxf.
>
> So instead, for each adx group corresponding to a dataValueSet with
> orgUnit, period (and potentially attributeOptionCombo), I create a
> dataValueSet DOM document and present that to the dxf2 stream importer
> as a stream.  Given that this data is bound by a single orgunit and
> period I don't think the DOM document is going to break the memory
> bank.
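
The per-group strategy described above can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical, the dataValueSet is simplified (no namespace, only a few attributes), and the real dxf2 importer is not called. It builds one small DOM document for a single orgUnit/period group and exposes it as an InputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of DHIS 2.
final class GroupToStreamSketch {

    /** Builds a simplified dataValueSet DOM for one orgUnit/period group. */
    static Document buildDataValueSet(String orgUnit, String period, String[][] values) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("dataValueSet");
            root.setAttribute("orgUnit", orgUnit);
            root.setAttribute("period", period);
            doc.appendChild(root);
            for (String[] v : values) {
                // Each entry is {dataElement, value}.
                Element dv = doc.createElement("dataValue");
                dv.setAttribute("dataElement", v[0]);
                dv.setAttribute("value", v[1]);
                root.appendChild(dv);
            }
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the DOM document to an XML string. */
    static String toXml(Document doc) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Presents the serialized document as a stream; the whole group is held in memory. */
    static InputStream asStream(Document doc) {
        return new ByteArrayInputStream(toXml(doc).getBytes(StandardCharsets.UTF_8));
    }
}
```

In this sketch memory use grows with the size of one group rather than with the size of the whole import file, since each group is serialized in full before it is handed on.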
>
> Basic conversion to dxf2 is working fine.
>
> Next task is to "implode" the categories.
>
> A luta Continua.
>
> On 12 June 2015 at 13:40, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
> > Hi
> >
> > As you have seen I have already started to commit a few bits of code
> > in support of the ADX implementation.  I hadn't been planning to do
> > this so will proceed quite slowly, but let me outline the approach I
> > am considering for your comment and suggestion.
> >
> > 1.  Currently we have a datavalueset service which can import dxf2 data
> > from an inputstream.
> >
> > 2.  I would like to use that existing service and place the adx
> > service as a thin veneer above it rather than create a lot of
> > duplicated code.
> >
> > 3.  The adx data importer would read its adx input from a stream and
> > convert that into a dxf2 stream.  The main tasks it would need to
> > perform are:
> > (i)  convert periods into dxf2 format
> > (ii) lookup catoptcombos and attributeoptioncombos for the dimensions
> > in the adx message
> > All other attributes and ImportOptions would be passed through
> > directly to the dxf2 datavalueset service.
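
As an illustration of task (i), here is a minimal sketch of the period conversion. It assumes that ADX periods arrive as an ISO 8601 start date plus duration (for example 2015-06-01/P1M) and that the target is the usual DHIS 2 period identifier (201506, 2015Q2, 2015, 20150601). The class name is hypothetical, only four period types are handled, and the start date is not checked against period boundaries.

```java
import java.time.LocalDate;
import java.time.Period;

// Hypothetical helper, not the actual DHIS 2 implementation.
final class AdxPeriodSketch {

    /** Converts "start/duration" (e.g. "2015-06-01/P1M") into a DHIS 2 style period id. */
    static String toDxf2(String adxPeriod) {
        String[] parts = adxPeriod.split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected start/duration: " + adxPeriod);
        }
        LocalDate start = LocalDate.parse(parts[0]);   // ISO date, e.g. 2015-06-01
        Period duration = Period.parse(parts[1]);      // ISO duration, e.g. P1M

        if (duration.equals(Period.ofYears(1))) {
            return String.format("%04d", start.getYear());
        }
        if (duration.equals(Period.ofMonths(3))) {
            int quarter = (start.getMonthValue() - 1) / 3 + 1;
            return String.format("%04dQ%d", start.getYear(), quarter);
        }
        if (duration.equals(Period.ofMonths(1))) {
            return String.format("%04d%02d", start.getYear(), start.getMonthValue());
        }
        if (duration.equals(Period.ofDays(1))) {
            return String.format("%04d%02d%02d",
                    start.getYear(), start.getMonthValue(), start.getDayOfMonth());
        }
        throw new IllegalArgumentException("Unsupported duration: " + parts[1]);
    }
}
```

Task (ii), the lookup of category option combos, depends on DHIS 2 metadata and is not sketched here.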
> >
> > 4.  In order to present the resulting dxf2 to the service as an
> > InputStream it would have to use a PipedReader/PipedWriter combination
> > (something Lars will recall from earlier dxf1 code).  The equivalent
> > alternative would be to post the dxf2 datasets back out to the REST
> > endpoint but that seems wasteful and more awkward.
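
The piped approach in point 4 might look roughly like the sketch below. It uses the byte-oriented PipedInputStream/PipedOutputStream pair from java.io, with a writer thread standing in for the adx-to-dxf2 converter and the calling thread standing in for the dxf2 importer. The class and method names are hypothetical and no DHIS 2 service is involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a producer thread feeding a consumer through a pipe.
final class PipeSketch {

    /** Writes the payload on one thread and reads it back from the pipe on the caller's thread. */
    static String pipeThrough(String payload) {
        try (PipedInputStream in = new PipedInputStream()) {
            PipedOutputStream out = new PipedOutputStream(in);

            // Producer: stands in for the converter that emits dxf2.
            Thread writer = new Thread(() -> {
                try (OutputStream o = out) {
                    o.write(payload.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            writer.start();

            // Consumer: stands in for the importer reading from an InputStream.
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                collected.write(buf, 0, n);
            }
            writer.join();
            return new String(collected.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The pipe itself only holds a small fixed buffer, so the producer blocks until the consumer catches up. The price is the extra thread. The sketch collects the output into a string only so that the result can be inspected; a real consumer would process the bytes as they arrive.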
> >
> > Does that approach sound reasonable?
> >
> > I have some lingering uncertainty about the best way to deal with
> > ImportSummary.  The adx data is naturally grouped by orgunit/period.
> > So I would likely split the stream and post each as a separate dxf2
> > datavalueset.  So probably this would imply collecting the results
> > into an <ImportSummaries ... /> element.  ADX is currently silent on
> > the result message as it deliberately does not define the transaction
> > (just the message) so we have some latitude here to do whatever is
> > best.  The above is my best suggestion.
> >
> > Cheers
> > Bob



-- 
Lars Helge Øverland
Lead developer, DHIS 2
University of Oslo
Skype: larshelgeoverland
https://www.dhis2.org
