← Back to team overview

dhis2-devs team mailing list archive

Re: Can we bump organisationunitid from int to bigint?

 

I agree, this is probably the best approach - especially if the ETL
tool can handle both the generalization, geojsonificaton and dress it
up as DXF (which I'm sure it can)

camp2camp's SpatialDataIntegrator on top of Talend seems promising:
http://www.talendforge.org/wiki/doku.php?id=sdi:version1_3_0

On Fri, May 21, 2010 at 12:16 PM, Jason Pickering
<jason.p.pickering@xxxxxxxxx> wrote:
> I would think that the recommended procedure would be to transform the
> source data to DXF. That way, we do not need to worry about the
> internals of the database. Is this not an option Knut? Using Talend or
> (Geo)Kettle for this purpose would certainly seem possible and
> relatively straightforward.
>
> I would think that this would be a better staging mechanism than a
> temporary table followed by direct insertion into the DB? It is more
> work, but it would be worth it, if this process would be documented.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On 5/21/10, Knut Staring <knutst@xxxxxxxxx> wrote:
>> You are quite right that we need to coordinate this with bulk orgunit
>> import - to some extent the hope is that if the admin boundaries we
>> have access to are reasonably current, the map import would obviate
>> the need for bulk orgunit import.
>>
>> It does seem best to route everything through the import functionality
>> more than the ad-hoc student code (if we have it), though perhaps we
>> can use good pieces of it. However, because of the particular issues
>> of simplification and conversion to geojson, I'm not sure it makes
>> sense to go via Excel. What is the current status, Bob? Does it make
>> sense to try and splice these processes now, or should we defer this?
>>
>> Knut
>>
>> On Fri, May 21, 2010 at 11:48 AM, Ola Hodne Titlestad
>> <olatitle@xxxxxxxxx> wrote:
>>> We have already a blueprint on importing orgunit hierarchies off
>>> structured
>>> excel files and this seems very related.
>>> I am sure Bob can do some transformation magic to the shapefile
>>> information
>>> using level ids and orgunit names to create orgunit hierarchies off these,
>>> either directly or we find a way to convert the shapefile info to the
>>> format
>>> of the structured excel format we recommend for bulk orgunit import.
>>>
>>> Ola Hodne Titlestad |Technical Officer|
>>> Health Metrics Network (HMN) | World Health Organization
>>> Avenue Appia 20 |1211 Geneva 27, Switzerland | Email:
>>> titlestado@xxxxxxx|Tel: +41 788216897
>>> Website: www.healthmetricsnetwork.org
>>>
>>> Better Information. Better Decisions. Better Health.
>>>
>>>
>>> On 21 May 2010 11:42, Jason Pickering <jason.p.pickering@xxxxxxxxx> wrote:
>>>>
>>>> I have been looking for this code for years, but apparently it was
>>>> "lost" when the server crashed a few years back.
>>>>
>>>> Would agree though doing it in Java would be a better idea.
>>>> Manipulating the DB is tricky and you need to be very careful, (for
>>>> instance, using the hibernate_sequence values for primary keys).  But
>>>> in the absence of this code, we may have to document situations where
>>>> bulk importation /conversion of external hierarchies is going to be
>>>> required.
>>>>
>>>> Regards,
>>>> JPP
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 5/21/10, Ola Hodne Titlestad <olatitle@xxxxxxxxx> wrote:
>>>> > Knut,
>>>> >
>>>> > Remember that we some years back had a student group working on
>>>> > converting
>>>> > shapefiles (basically a country set of level ids) to a DHIS orgunit
>>>> > hierarchy.
>>>> > Not sure where that code is now, but they certainly went through a
>>>> > similar
>>>> > process like the one you are describing.
>>>> > To me it seems it would be more robust to include this in the import
>>>> > module
>>>> > and take care of this hierarchy generation in the java code rather than
>>>> > manually manipulating the database.
>>>> >
>>>> > Ola
>>>> > --------
>>>> >
>>>> > On 21 May 2010 11:17, Jason Pickering <jason.p.pickering@xxxxxxxxx>
>>>> > wrote:
>>>> >
>>>> >> OK, you are all over the place on this one. Lets take it one at a
>>>> >> time.
>>>> >>
>>>> >> Use of the hibernate_sequence is a very good idea (as I found out the
>>>> >> hard way) as it is easy to use external tools to generate ids, but you
>>>> >> have no guarentee that this will not clash with something that DHIS2
>>>> >> inserts into the DB.
>>>> >>
>>>> >> I think you will need a staged approach. Dump everything into a
>>>> >> temporary table, and use the hibernate sequence to get a new primary
>>>> >> key. Use the parentID  (mapped to the code field)  to then update the
>>>> >> parentID field of the child. It should be pretty easily done, as the
>>>> >> parent/child relationship is implicit in the WHO levelid. If you
>>>> >> already have a parentID in the WHO information, then it should be even
>>>> >> easier.
>>>> >>
>>>> >> You do not need to use FME to generate the ID. Just use something
>>>> >> like..
>>>> >>
>>>> >> INSERT INTO organisationunit_temp (organisationunitid....)
>>>> >> VALUES (nextval('hibernate_sequence',....)
>>>> >>
>>>> >> You may need to remove the parentid primary key constraint during the
>>>> >> initial insert and then reconstruct them using an update statement. I
>>>> >> do not know exactly what the statement would be, but I was almost
>>>> >> certain I had written this before at some point in time .
>>>> >>
>>>> >> I do not think that this is enough of a justification to increase the
>>>> >> size of the organisationunitid field, as it should be big enough to
>>>> >> accommodate any realistic orgunit hierarchy.
>>>> >>
>>>> >> In general, I would suggest the use of a view to present to PostGIS
>>>> >> instead of directly linking to the table itself. Of course, there are
>>>> >> other problems with persisting views in DHIS2 which we are aware of,
>>>> >> but I do not anticipate that this table would ever be deleted, so it
>>>> >> should be pretty safe.
>>>> >>
>>>> >> Also, you may want to consider GeoKettle or Talend, as something that
>>>> >> could be integrated into DHIS2 for processing of the Geodata.
>>>> >>
>>>> >> Regards,
>>>> >> Jason
>>>> >>
>>>> >>
>>>> >> On 5/21/10, Knut Staring <knutst@xxxxxxxxx> wrote:
>>>> >> > Using the hibernate_sequence seems like a good idea in most cases,
>>>> >> > but
>>>> >> > for Orgunits it's really crucial to populate the parentid field
>>>> >> > (which
>>>> >> > of course would also have to change to bigint for this to make any
>>>> >> > sense).
>>>> >> >
>>>> >> > So while I agree that the original alphanumeric/string LVLID would
>>>> >> > fit
>>>> >> > well in the Code field, I need to be able to populate the hierarchy
>>>> >> > for the whole world from the database. I could conceivably come up
>>>> >> > with a script in FME to generate sequential IDs, but that seems
>>>> >> > quite
>>>> >> > complicated, and would also not use hibernate_sequence (possibly I
>>>> >> > just don't know enough about how to use that). I use FME mainly
>>>> >> > because I have not found a good alternative for simplifying polygons
>>>> >> > without causing cracks between them. It would in some ways be nice
>>>> >> > to
>>>> >> > be able to do everything in PostGIS which has functions like
>>>> >> > ST_AsGeoJSON and Simplify, but as you can see from the below link,
>>>> >> > the
>>>> >> > results are not quite satisfying:
>>>> >> >
>>>> >> > http://bostongis.org/PrinterFriendly.aspx?content_name=postgis_simplify
>>>> >> >
>>>> >> > mapshaper.org seem to have some of the same problems, which have
>>>> >> > been
>>>> >> > avoided by Bjørn Sandvik when he made these world datasets:
>>>> >> > http://thematicmapping.org/downloads/world_borders.php. The tool he
>>>> >> > used for simplfying is ArcToolbox Simplify Polygon tool (see page 19
>>>> >> > of this master thesis:
>>>> >> > http://thematicmapping.org/downloads/Thematic_Mapping_Engine.pdf).
>>>> >> > Unfortunately, FME and ArcToolbox are not integratable to DHIS2.
>>>> >> >
>>>> >> > While on this topic, I do think we perhaps need to add a LEVEL field
>>>> >> > to the ORGANISATIONUNIT table. That would make it quite
>>>> >> > corresponding
>>>> >> > to a PostGIS table (separatable on the LEVEL field in order to
>>>> >> > generate layers for Provinces, Districts etc). This is sort of
>>>> >> > available in the generated ORGUNITSTRUCTURE table, but that a) needs
>>>> >> > to be generated and b) seems a bit inefficient to have to join to
>>>> >> > another big table just to get the level. And perhaps we might want
>>>> >> > to
>>>> >> > have a separate table in DHIS2 with the full precision technologies
>>>> >> > and a link to the orgunit table.
>>>> >> >
>>>> >> > Knut
>>>> >> >
>>>> >> > On Fri, May 21, 2010 at 10:28 AM, Jason Pickering
>>>> >> > <jason.p.pickering@xxxxxxxxx> wrote:
>>>> >> >> I do not really have a problem with this, but shouldn't this
>>>> >> >> information go in the "code" field? Or is it a problem with the
>>>> >> >> number
>>>> >> >> of orgunits? It would seem unlikely that we would ever have more
>>>> >> >> than
>>>> >> >> 2,147,483,647 orgunits.
>>>> >> >>
>>>> >> >> Are you inserting the ID as the organisationunitid? This seems this
>>>> >> >> might cause problems with possible clashes with the
>>>> >> >> hibernate_sequence
>>>> >> >> which is used to generate IDs?
>>>> >> >>
>>>> >> >> I have run into this issue only once, but since then, I always use
>>>> >> >> the
>>>> >> >> hibernate_sequence to generate IDs when I directly insert data into
>>>> >> >> the DB.
>>>> >> >>
>>>> >> >> Regards,
>>>> >> >> Jason
>>>> >> >>
>>>> >> >>
>>>> >> >> On 5/20/10, Knut Staring <knutst@xxxxxxxxx> wrote:
>>>> >> >>> Hello,
>>>> >> >>>
>>>> >> >>> In the process of converting WHO identifiers for administrative
>>>> >> >>> boundaries (LVLID) to ids usable for DHIS2, I've run into the
>>>> >> >>> limits
>>>> >> >>> of the integer datatype we use in DHIS2.
>>>> >> >>>
>>>> >> >>> The LVLID is a three letter ISO code followed by 18 digits. We are
>>>> >> >>> converting the alphabetical ISO for the country to an ISO numeric
>>>> >> >>> code
>>>> >> >>> (preceeded by 1 to make it numeric).
>>>> >> >>>
>>>> >> >>> Would it be problematic to change the datatype for
>>>> >> >>> organisationunitid
>>>> >> >>> from integer to bigint?
>>>> >> >>>
>>>> >> >>> Knut
>>>> >> >>>
>>>> >> >>> _______________________________________________
>>>> >> >>> Mailing list:
>>>> >> >>>
>>>> >> >>> https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
>>>> >> >>> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>> >> >>> Unsubscribe :
>>>> >> >>>
>>>> >> >>> https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
>>>> >> >>> More help   : https://help.launchpad.net/ListHelp
>>>> >> >>>
>>>> >> >>
>>>> >> >>
>>>> >> >> --
>>>> >> >> --
>>>> >> >> Jason P. Pickering
>>>> >> >> email: jason.p.pickering@xxxxxxxxx
>>>> >> >> tel:+260968395190
>>>> >> >>
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > --
>>>> >> > Cheers,
>>>> >> > Knut Staring
>>>> >> >
>>>> >>
>>>> >>
>>>> >> --
>>>> >> --
>>>> >> Jason P. Pickering
>>>> >> email: jason.p.pickering@xxxxxxxxx
>>>> >> tel:+260968395190
>>>> >>
>>>> >> _______________________________________________
>>>> >> Mailing list:
>>>> >> https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
>>>> >> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
>>>> >> Unsubscribe :
>>>> >> https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs>
>>>> >> More help   : https://help.launchpad.net/ListHelp
>>>> >>
>>>> >
>>>>
>>>>
>>>> --
>>>> --
>>>> Jason P. Pickering
>>>> email: jason.p.pickering@xxxxxxxxx
>>>> tel:+260968395190
>>>
>>>
>>
>>
>>
>> --
>> Cheers,
>> Knut Staring
>>
>
>
> --
> --
> Jason P. Pickering
> email: jason.p.pickering@xxxxxxxxx
> tel:+260968395190
>



-- 
Cheers,
Knut Staring



Follow ups

References