dhis2-devs-core team mailing list archive

Re: ADX data import proposal

 

Hi Lars

The problem is that the dataValueSetService requires an InputStream to
feed off.  There are only two ways to provide an InputStream that I can
think of: either create a pipe or buffer the data (e.g. in a string).

Creating a pipe is doable, but then you also need a separate thread to
read it, which is another resource to manage (e.g. with a pool).  That
seemed like more effort than it is worth.
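
To make the two options concrete, here is a rough Java sketch of both.
It is only an illustration, not the committed code: the class and
method names (StreamOptionsSketch, buffered, piped, drain) are
hypothetical, and drain merely stands in for whatever consumes the
stream (the real consumer would be the dataValueSetService).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;

// Hypothetical sketch: two ways to hand converted dxf2 XML to a consumer
// that only accepts an InputStream.
class StreamOptionsSketch {

    // Option 1: buffer. The whole document is held in memory at once.
    static InputStream buffered(String dxf2Xml) {
        return new ByteArrayInputStream(dxf2Xml.getBytes(StandardCharsets.UTF_8));
    }

    // Option 2: pipe. A pool thread writes into one end while the caller
    // reads from the other, so only the pipe's small buffer is held.
    static InputStream piped(String dxf2Xml, ExecutorService pool) {
        try {
            PipedInputStream readEnd = new PipedInputStream();
            PipedOutputStream writeEnd = new PipedOutputStream(readEnd);
            pool.submit(() -> {
                // Closing the write end is what signals end-of-stream to the reader.
                try (PipedOutputStream out = writeEnd) {
                    out.write(dxf2Xml.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            return readEnd;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for the consumer: reads the stream to the end.
    static String drain(InputStream in) {
        try (InputStream source = in) {
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] chunk = new byte[512];
            int n;
            while ((n = source.read(chunk)) != -1) {
                collected.write(chunk, 0, n);
            }
            return new String(collected.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The pipe version needs the executor to be created, sized and shut down
somewhere, which is the extra resource management mentioned above.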

What I can do short term as a defensive measure is to place a limit on
the number of datavalues which can be buffered for a single
datavalueset.  That way it should not be possible to explode the
memory.  I'll do that soon.
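
Roughly what I have in mind for the limit, as a sketch only.  The class
name BoundedGroupBuffer and the idea of holding each data value as a
string are assumptions for illustration; the actual limit and the type
being buffered are still to be decided.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: buffers the data values of one adx group and
// refuses to grow past a fixed limit, so a single oversized group
// fails fast instead of exhausting the heap.
class BoundedGroupBuffer {
    private final int maxValues;
    private final List<String> values = new ArrayList<>();

    BoundedGroupBuffer(int maxValues) {
        this.maxValues = maxValues;
    }

    // Adds one data value, or throws once the group is already at the limit.
    void add(String dataValue) {
        if (values.size() >= maxValues) {
            throw new IllegalStateException(
                "adx group exceeds limit of " + maxValues + " data values");
        }
        values.add(dataValue);
    }

    int size() {
        return values.size();
    }
}
```

The importer would create one such buffer per adx group and report the
exception as a failed import for that group.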

Note that in "normal" use this should not be a problem as a single adx
group corresponds to the data for one orgunit, for one period - what
is envisaged typically is a single dataset's worth.

The other "alternative" is not to use the dataValueSetService at all
but just duplicate the code.

Bob

On 18 June 2015 at 15:22, Lars Helge Øverland <larshelge@xxxxxxxxx> wrote:
> Hi Bob,
>
> as you say this creates a hard limit on memory. Now all it will take to
> bring down a DHIS 2 instance is to submit a sufficiently large import
> file. Seems like this will provide headaches for server admins ;) Can we
> find a stream-based solution which scales well?
>
> Lars
>
>
> On Thu, Jun 18, 2015 at 2:49 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>>
>> WIP committed and slight adjustment of strategy ...
>>
>> I was not comfortable with creating a new thread just to pipe from adx to
>> dxf.
>>
>> So instead, for each adx group corresponding to a dataValueSet with
>> orgUnit, period (and potentially attributeOptionCombo), I create a
>> dataValueSet DOM document and present that to the dxf2 stream importer
>> as a stream.  Given that this data is bound by a single orgunit and
>> period I don't think the DOM document is going to break the memory
>> bank.
>>
>> Basic conversion to dxf2 is working fine.
>>
>> Next task is to "implode" the categories.
>>
>> A luta Continua.
>>
>> On 12 June 2015 at 13:40, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>> > Hi
>> >
>> > As you have seen I have already started to commit a few bits of code
>> > in support of the ADX implementation.  I hadn't been planning to do
>> > this so will proceed quite slowly, but let me outline the approach I
>> > am considering for your comment and suggestion.
>> >
>> > 1.  Currently we have a datavalueset service which can import dxf2 data
>> > from an inputstream.
>> >
>> > 2.  I would like to use that existing service and place the adx
>> > service as a thin veneer above it rather than create a lot of
>> > duplicated code.
>> >
>> > 3.  The adx data importer would read its adx input from a stream and
>> > convert that into a dxf2 stream.  The main tasks it would need to
>> > perform are:
>> > (i)  convert periods into dxf2 format
>> > (ii) lookup catoptcombos and attributeoptioncombos for the dimensions
>> > in the adx message
>> > All other attributes and ImportOptions would be passed through
>> > directly to the dxf2 datavalueset service.
>> >
>> > 4.  In order to present the resulting dxf2 to the service as an
>> > InputStream it would have to use a PipedReader/PipedWriter combination
>> > (something Lars will recall from earlier dxf1 code).  The equivalent
>> > alternative would be to post the dxf2 datasets back out to the REST
>> > endpoint but that seems wasteful and more awkward.
>> >
>> > Does that approach sound reasonable?
>> >
>> > I have some lingering uncertainty about the best way to deal with
>> > ImportSummary.  The adx data is naturally grouped by orgunit/period.
>> > So I would likely split the stream and post each as a separate dxf2
>> > datavalueset.  So probably this would imply collecting the results
>> > into an <ImportSummaries ... /> element.  ADX is currently silent on
>> > the result message as it deliberately does not define the transaction
>> > (just the message) so we have some latitude here to do whatever is
>> > best.  The above is my best suggestion.
>> >
>> > Cheers
>> > Bob
>>
>> --
>> Mailing list: https://launchpad.net/~dhis2-devs-core
>> Post to     : dhis2-devs-core@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~dhis2-devs-core
>> More help   : https://help.launchpad.net/ListHelp
>
>
>
>
> --
> Lars Helge Øverland
> Lead developer, DHIS 2
> University of Oslo
> Skype: larshelgeoverland
> http://www.dhis2.org
>

