dhis2-devs team mailing list archive

Re: dhis14 import

 

Hi Jason

On 24 July 2010 15:25, Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote:
> Hi Bob,
>
> I think this is indeed interesting, but is it really worth spending
> any time on? We currently have something that works, and for our 10
> million row database in Zambia it took a few hours to import. OK,
> fine, it took time, but it was a one off deal.  Is there much to be
> gained from a full refactor? I think the typical use case would be a
> one time bulk import, which more or less works. There are still some
> hiccups, but these are more a function of the differences in the
> metadata models between 1.4 and 2.0.
>
> IMHO, what does need to work is bidirectional synchronization with XML
> between the two systems. Although it is theoretically possible to
> export data from 2.0 into 1.4, the last time I checked, a few weeks
> back, it did not work. This functionality, for me at least, is much,
> much more important than simply being able to import DHIS 1.4 data
> without Windows (which I would need in the first place to create 1.4
> data!).
>
> Not to be a party pooper or anything, but just trying to give some perspective.

I do agree; I don't think it's a priority.  Which is why I don't intend
to spend much time on it :-)

In general I'm not in a position to do much with dhis14 .. at least
now I do have a way of looking at dhis14 databases.  And of course
there is the famous modulo basico access database.

But I think the biggest challenge to get right is synchronization
(data and metadata exchange, versioning, governance etc) rather than
bulk import.  As I think you've mentioned elsewhere, this can always
be done by hook or by crook with the right tools and skills in hand.

So poop away .. fine by me :-)

Bob

>
> Regards,
> Jason
>
>
> On Fri, Jul 23, 2010 at 10:05 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>> Hi Ime
>>
>> On 23 July 2010 19:23, Ime Asangansi <asangansi@xxxxxxxxx> wrote:
>>> Hi Bob,
>>>
>>> Mmmm..interesting.
>>>
>>> Just to be sure.
>>> Were you reading from 1.4 and writing to 2.0? mysql or postgres?
>>>
>>> Were you writing the other way (to access)?
>>
>> Nothing so grand as any of the above.  Though I am reading and writing
>> Access.  Reading/writing MySQL and Postgres will be much more of a
>> bottleneck than reading via jackcess 'coz of the TCP requirement of JDBC.
>>
>> In this scenario I'm just running scratch code in the background to
>> keep my laptop warm while I try to write :-)  See attached.  jackcess
>> is very maven friendly.
>>
>> mvn assembly:assembly
>> java -jar target/dhaccess-jar-with-dependencies.jar
>>
>> Drink lots of coffee, take a walk, have lunch .... etc
>>
>> Cheers.
>> Bob
>>
>>
>> package org.hisp.dhis;
>>
>>
>> import com.healthmarketscience.jackcess.ColumnBuilder;
>> import com.healthmarketscience.jackcess.DataType;
>> import java.io.File;
>> import java.io.IOException;
>> import java.util.Map;
>>
>> import org.apache.commons.logging.Log;
>> import org.apache.commons.logging.LogFactory;
>>
>> import com.healthmarketscience.jackcess.Database;
>> import com.healthmarketscience.jackcess.Table;
>> import com.healthmarketscience.jackcess.TableBuilder;
>>
>> /**
>>  * CPU warmer
>>  *
>>  */
>> public class App
>> {
>>
>>    private static final Log log = LogFactory.getLog( App.class );
>>
>>    private static long MAXROWS = 10000000;
>>
>>    public static void main( String[] args )
>>    {
>>        try
>>        {
>>            File dbFile = new File( "/home/bobj/src/dhaccess/test.mdb" );
>>            Database db = Database.create( dbFile );
>>            log.info( "Opened " + dbFile.getName() );
>>
>>            Map<String, Object> row = null;
>>            Table table = new TableBuilder( "test" )
>>                .addColumn( new ColumnBuilder( "id", DataType.LONG ).toColumn() )
>>                .addColumn( new ColumnBuilder( "value", DataType.TEXT ).toColumn() )
>>                .toTable( db );
>>
>>
>>            long i = 0;
>>            for ( i = 0; i < MAXROWS; i++ )
>>            {
>>                table.addRow( i, "some text" );
>>            }
>>
>>            log.info( "Reading data" );
>>            int count = 0;
>>            while ( ( row = table.getNextRow() ) != null )
>>            {
>>                count++;
>>                log.debug( row );
>>                Integer id = (Integer) row.get( "id" );
>>                String value = (String) row.get( "value" );
>>
>>                System.err.println( id + " : " + value );
>>            }
>>            log.info( count + " datavalues" );
>>        } catch ( IOException ex )
>>        {
>>            log.info("Ouch: "+ex);
>>        }
>>
>>    }
>> }
>>
>>
>>
>>
>>>
>>> Thanks.
>>>
>>> Ime
>>>
>>>
>>> --- On Fri, 7/23/10, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>>>
>>>> From: Bob Jolliffe <bobjolliffe@xxxxxxxxx>
>>>> Subject: Re: [Dhis2-devs] dhis14 import
>>>> To: "Lars Helge Øverland" <larshelge@xxxxxxxxx>
>>>> Cc: "Ime Asangansi" <asangansi@xxxxxxxxx>, "Knut Staring" <knutst@xxxxxxxxx>, "dhis2-devs" <dhis2-devs@xxxxxxxxxxxxxxxxxxx>
>>>> Date: Friday, July 23, 2010, 3:53 PM
>>>> Alright, writing is pretty slow.  Writing 10 million records takes
>>>> two hours.
>>>>
>>>> But reading is pretty darn quick.  Iterating through 10 million rows
>>>> and dumping each row to stdout (redirected to /dev/null) takes about
>>>> 1 minute +- 5 seconds.  Throughout which memory use is constant and
>>>> low: less than 1.5%.  That's got to be quicker than anything using
>>>> jdbc, which is obliged to push everything up and down the tcp/ip
>>>> stack.
>>>>
>>>> Not looking any more at this now - just wanted to see the numbers.
>>>> Looks like a good addition to Knut's student project list.  Refactor
>>>> dhis14 file import to inject a jackcess backend in place of the
>>>> hibernate one.
>>>>
>>>> Cheers
>>>> Bob
>>>>
>>>> 2010/7/23 Bob Jolliffe <bobjolliffe@xxxxxxxxx>:
>>>> > 2010/7/22 Lars Helge Øverland <larshelge@xxxxxxxxx>:
>>>> >>
>>>> >> No doubt this looks much simpler.
>>>> >> Would be interesting to do a test with a large table (>10 mill) and
>>>> >> see how it performs in terms of memory usage.
>>>> >
>>>> > 10 million records is a lot of records!  I have it whirring away in
>>>> > the background as I get on with other stuff.  Started 1.5 hours ago
>>>> > and still writing records ... db file up to 250M ... hasn't started
>>>> > reading back yet but thus far memory usage is low and constant.  I'll
>>>> > let you know when/if it finishes :-)
>>>> >
>>>> >> Lars
>>>> >> On Wed, Jul 21, 2010 at 3:42 AM, Ime Asangansi <asangansi@xxxxxxxxx> wrote:
>>>> >>>
>>>> >>> Impressive!
>>>> >>> First time seeing that clean functionality!
>>>> >>> I see potential there to move data between both systems :)
>>>> >>>
>>>> >>> Ime
>>>> >>>
>>>> >>>
>>>> >>> --- On Tue, 7/20/10, Knut Staring <knutst@xxxxxxxxx> wrote:
>>>> >>>
>>>> >>> > From: Knut Staring <knutst@xxxxxxxxx>
>>>> >>> > Subject: Re: [Dhis2-devs] dhis14 import
>>>> >>> > To: "Bob Jolliffe" <bobjolliffe@xxxxxxxxx>
>>>> >>> > Cc: "dhis2-devs" <dhis2-devs@xxxxxxxxxxxxxxxxxxx>
>>>> >>> > Date: Tuesday, July 20, 2010, 3:32 PM
>>>> >>> > That sounds really great - it has been problematic to require
>>>> >>> > Windows for this.
>>>> >>> >
>>>> >>> > k
>>>> >>> >
>>>> >>> > On Tue, Jul 20, 2010 at 3:23 PM, Bob Jolliffe <bobjolliffe@xxxxxxxxx> wrote:
>>>> >>> > > Just some throwaway code testing out jackcess for reading dhis14
>>>> >>> > > (and potentially modulo basico files):
>>>> >>> > >
>>>> >>> > > http://pastebin.com/wMv1SZqq
>>>> >>> > >
>>>> >>> > > I'm pretty impressed.  It works well and I suspect also much
>>>> >>> > > faster than accessing via odbc/ibatis or whatever it is.  Never
>>>> >>> > > mind the nonsense of what this code actually does - the point is
>>>> >>> > > that it can iterate over access tables using java (on ubuntu).
>>>> >>> > > Kind of nice.
>>>> >>> > >
>>>> >>> > > Cheers
>>>> >>> > > Bob
>>>> >>> > >
>>>> >>> > > _______________________________________________
>>>> >>> > > Mailing list: https://launchpad.net/~dhis2-devs
>>>> >>> > > Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>> >>> > > Unsubscribe : https://launchpad.net/~dhis2-devs
>>>> >>> > > More help   : https://help.launchpad.net/ListHelp
>>>> >>> > >
>>>> >>> >
>>>> >>> >
>>>> >>> >
>>>> >>> > --
>>>> >>> > Cheers,
>>>> >>> > Knut Staring
>>>> >>> >
>>>> >>> >
>>>> >>> >
>>>> >>>
>>>> >>>
>>>> >>
>>>> >>
>>>> >
>>>>
>>>
>>
>>
>>
>
>
>
> --
> Jason P. Pickering
> email: jason.p.pickering@xxxxxxxxx
> tel:+17069260025
>
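
As a rough back-of-the-envelope check on the figures quoted above (10 million rows, about two hours to write, about one minute to read back), the throughput works out as in the sketch below. The class and method names are made up for illustration, and the durations are the approximate values reported in the thread, not fresh measurements.

```java
// Hypothetical helper, not part of the dhaccess test program quoted above.
class Throughput
{
    // Rows per second for a run that handled `rows` rows in `seconds` seconds.
    static double rowsPerSecond( long rows, double seconds )
    {
        return rows / seconds;
    }

    public static void main( String[] args )
    {
        long rows = 10_000_000L;            // table size used in the thread
        double writeSeconds = 2 * 60 * 60;  // "two hours" of writing (approximate)
        double readSeconds = 60;            // "about 1 minute" of reading (approximate)

        double write = rowsPerSecond( rows, writeSeconds );
        double read = rowsPerSecond( rows, readSeconds );

        System.out.printf( "write: %.0f rows/s%n", write );
        System.out.printf( "read:  %.0f rows/s%n", read );
        System.out.printf( "read/write ratio: %.0f%n", read / write );
    }
}
```

On those numbers, writing runs at roughly 1,400 rows per second and reading at roughly 167,000 rows per second, so the read path is about two orders of magnitude faster than the write path in this test.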


