dhis2-devs team mailing list archive

Re: import thread

 

Hi Jason

Thanks for the detailed response.  Right now I'm looking at
incremental improvement of the current process by providing better
feedback to users - running the import in a single thread instead of
in the background worker thread can provide a quick win here.
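
To make the single-thread idea concrete, here is a minimal sketch in
Java.  It is not taken from the DHIS2 code base: the class names, the
"uid=value" record format and the numeric check are all assumptions
made for illustration.  The point is only that when the import runs in
the caller's thread, the caller gets a complete report back as a return
value and needs no inter-thread status message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: none of these names come from the DHIS2 code base.
class SynchronousImportSketch {

    /** Collects per-record outcomes so they can be shown once the import returns. */
    static class ImportReport {
        private final List<String> messages = new ArrayList<>();
        private int imported;
        private int rejected;

        void accepted() {
            imported++;
        }

        void rejected(String record, String reason) {
            rejected++;
            messages.add("Rejected '" + record + "': " + reason);
        }

        int getImported() { return imported; }
        int getRejected() { return rejected; }
        List<String> getMessages() { return Collections.unmodifiableList(messages); }
    }

    /** Runs in the caller's thread; the "uid=value" record format is an assumption. */
    static ImportReport importRecords(List<String> records) {
        ImportReport report = new ImportReport();
        for (String record : records) {
            String[] parts = record.split("=", 2);
            if (parts.length != 2 || parts[0].isEmpty()) {
                report.rejected(record, "expected uid=value");
                continue;
            }
            try {
                Double.parseDouble(parts[1]); // stand-in for real validation
                report.accepted();
            } catch (NumberFormatException e) {
                report.rejected(record, "value is not numeric");
            }
        }
        return report;
    }

    public static void main(String[] args) {
        ImportReport report = importRecords(Arrays.asList("abc=1", "bad", "def=x"));
        System.out.println(report.getImported() + " imported, "
                + report.getRejected() + " rejected");
        report.getMessages().forEach(System.out::println);
    }
}
```

The trade-off is the one discussed further down the thread: the user
has to wait for the call to return, but the feedback is complete when
it does.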

More generally, I would like to move completely to a dxf2 way of
importing data where the importing of metadata is separate as per the
blueprint.  Currently there are two issues to solve:
(i) dimensions of data - short term we can continue to use
categoryoptioncombo between dhis instances or take the plunge with
concepts
(ii) harmonisation of uids - this is a process issue.  For the data
exchange to work, the sending system (typically lower in the reporting
hierarchy) will have to import metadata from the higher (national?)
system in order to populate its uids.

I've also given some consideration to a dedicated log ... haven't
decided yet.  Having an import log file would be fine but I am
wobbling over two questions: (i) in general we would prefer to use the
database for logging and (ii) it's not clear whether we want to
persist these import logs or whether the requirement is just for the
results of the most recent import.  Possibly the solution here is to
configure some sort of custom jdbc appender.
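
As a rough illustration of the "most recent import only" option, the
sketch below routes import messages to a dedicated logger whose handler
keeps just the latest run in memory.  Note the assumptions: it uses
java.util.logging from the JDK rather than log4j or a JDBC appender,
purely so the example is self-contained, and the logger name and class
names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch using java.util.logging, not log4j.
class ImportLogSketch {

    /** Keeps only the records of the most recent import in memory. */
    static class LatestImportHandler extends Handler {
        private final List<String> lines = new ArrayList<>();

        /** Call at the start of each import to discard the previous run. */
        synchronized void startNewImport() {
            lines.clear();
        }

        @Override
        public synchronized void publish(LogRecord record) {
            if (isLoggable(record)) {
                lines.add(record.getLevel().getName() + ": " + record.getMessage());
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }

        /** Returns a copy so callers cannot modify the handler's state. */
        synchronized List<String> snapshot() {
            return new ArrayList<>(lines);
        }
    }

    static final LatestImportHandler HANDLER = new LatestImportHandler();
    static final Logger IMPORT_LOG = Logger.getLogger("sketch.import"); // assumed name

    static {
        IMPORT_LOG.setUseParentHandlers(false); // keep import messages out of the main log
        IMPORT_LOG.addHandler(HANDLER);
    }
}
```

Persisting every import instead would mean replacing the in-memory
list with writes to a file or a database table, which is where an
appender of the kind mentioned above would come in.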

Thinking again more generally, there are also scenarios of
applications potentially pushing data (rather than pulling a file
through the UI) as well as any number of other endpoint types if we
ever use an integration framework such as Apache Camel.

In answer to your other question, the semantics around failure are
generally not well defined and need to be addressed.
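
One possible semantics, sketched below, is all-or-nothing: either
every record of an import is applied or none is.  This is an in-memory
illustration only, with hypothetical names and an assumed "uid=value"
record format.  It does not describe how the current importer behaves,
and a real implementation would more likely rely on a database
transaction than on copying a map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory sketch of all-or-nothing import semantics.
class AllOrNothingImportSketch {
    private Map<String, Double> store = new HashMap<>();

    /** Applies every record or none. Returns true if the import was committed. */
    boolean importAll(List<String> records) {
        Map<String, Double> staged = new HashMap<>(store); // work on a copy
        for (String record : records) {
            String[] parts = record.split("=", 2);
            if (parts.length != 2) {
                return false; // "rollback": the staged copy is simply discarded
            }
            try {
                staged.put(parts[0], Double.parseDouble(parts[1]));
            } catch (NumberFormatException e) {
                return false; // same as above, nothing reaches the store
            }
        }
        store = staged; // "commit": swap in the fully populated copy
        return true;
    }

    /** Returns a copy of the committed values. */
    Map<String, Double> snapshot() {
        return new HashMap<>(store);
    }
}
```

The alternative semantics, committing whatever was valid and reporting
the rest, is equally defensible; the point is that one of them needs
to be chosen and documented.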

Bob


On 7 February 2012 04:45, Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote:
> Hi Bob,
>
> As usual I have an opinion.
>
> First and foremost, the import process, as I think everyone knows, is rather
> fragile. I am glad to see a few of the issues being fixed (thrown, uncaught
> exceptions!). In distributed systems, we do not have full control over the
> metadata of course as has been outlined in the blueprint for
> 2.7 https://blueprints.launchpad.net/dhis2/+spec/separation-of-meta-data-and-data-values.
> This is of course even more the case in data warehousing scenarios, where we
> may have many slightly different versions of DHIS2 out there, with similar,
> but perhaps slightly different metadata. The situation in Nigeria is a good
> example of this, where we have multiple parties running DHIS2. The metadata
> is quite similar in many of these systems, but not 100% the same. One of
> course would need to be very careful about importing data in this situation,
> and clear user feedback would be very important to try and understand
> exactly what is going to happen before and during an import.
>
> Not really knowing the background of why separate threads were spawned from
> the beginning, it is hard for me to comment here, however, I really like the
> option in DHIS 1.4 which provides an option of viewing a report after an
> import. In 1.4 of course, the operation is synchronous, and can take an
> exceedingly long time, so there can be a reason for the user not to wait for
> this process to finish. However, given the fragility of the process in
> DHIS2, I usually sit and monitor the log in real time to see what is
> happening. Of course, this may not be appropriate or useful for most users,
> but regardless of how it is done, I think having the option to view a report
> would be a very useful piece of functionality. Because so little
> information is provided to the user, and given the noted fragility of the
> process, I normally end up doing the import of data, although eventually, this
> operation should be delegated to the actual owners of the system, and not an
> external consultant. At least in this scenario, the users could attempt to
> do the imports and then provide the detailed log to "tech support", which
> might be an administrator, consultant, or the mailing list when something
> goes awry.
>
> One (possible) easy solution would be a dedicated log, which we could
> configure using log4j, similar to the audit log. At least this
> would separate the import process away from the main log, and might make
> things a bit easier to diagnose. Of course, having some sort of log reading
> module like OpenMRS (as we discussed in our chat the other day) would make
> the retrieval of such a log a lot easier.
>
> One final question for me is what state the database is left in after an
> aborted import. I assume there is not a BEGIN/COMMIT or SAVEPOINT block on
> the database which is started at the beginning of the import process? If we
> get halfway through the import and something fails, is everything rolled back
> to the state the database was in to begin with, or is the partial import
> committed?
>
> Thanks for looking into this.
>
> Best regards,
> Jason
>
>
> On Mon, Feb 6, 2012 at 10:32 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>>
>> One of the difficulties of providing good user feedback on import is
>> that we spawn an extra thread to do the actual importing and we
>> don't have very sophisticated inter-thread communication with that
>> worker beyond the "status message" which is a transient thing.  And
>> better logging is not a substitute for user feedback.
>>
>> Is there a really compelling reason to spawn this extra thread?  Doing
>> the import synchronously (in the same thread as the action) would make
>> it much simpler to provide progressive and useful feedback to the
>> user.  There is a general principle in UI design that you want to keep
>> the UI responsive during long-running operations but I am not sure
>> that should necessarily be the case here.  It's more important to have
>> better feedback and you actually want the user to wait until the
>> process is complete.  Of course this can be done between two threads
>> but it seems kind of unnecessary - and anyway the assumption would
>> still be that the user does not navigate away from the page while the
>> import is continuing.  Does anyone have an opinion?
>>
>> The other alternative would be to progressively build up (and store?)
>> a report of happenings during the import process and allow the user to
>> browse back through previous imports.  This can be nice but more
>> complex than just running a synchronous thread.
>>
>> Bob
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~dhis2-devs
>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~dhis2-devs
>> More help   : https://help.launchpad.net/ListHelp
>
>

