← Back to team overview

dhis2-devs team mailing list archive

Re: Migrating from MySQL to PostgreSQL

 

Thanks Saptarshi, this is very helpful.


On Tue, May 7, 2013 at 6:23 PM, Saptarshi Purkayastha <sunbiz@xxxxxxxxx>wrote:

> Hello DHIS users and devs,
>
> Apologies for the long email... Should have probably been multiple
> blogpost to reduce its length...
>
> I recently encountered a situation with a very large implementation of
> DHIS2 having problems generating data mart.
> Thus no reports were generated and only data entry was being done. I
> thought I'd share some of the experiences to solve these issues, so that it
> might be useful to other implementers.
> Some changes will be needed in the DHIS2 source, so sending this to the
> dev list also, where dev-related discussions can follow-up.
>
> While PostgreSQL is our recommended database, many implementations have
> also used MySQL.
> My findings clearly highlight that DHIS2 performs much better on
> PostgreSQL and there are also some bugs related to MySQL dialect.
> Total org units - 26303
> Total Monthly datasets - 9
> Total Daily dataset - 1
> Total Yearly dataset - 3
>
> The implementation has about 34 million datavalues (non-zero)... but I
> pruned it for my benchmarking. I added 1-million datavalues and ran the
> data mart.
> The results are from my fairly good laptop (quad-core i7; 8GB RAM; tuned
> JVM; tuned MySQL 5.5 (4GB RAM); tuned PostgreSQL 9.2 (4GB RAM); 240GB SSD)
> Using DHIS2 2.11. When doing MySQL benchmark turned off all services
> including postgres and vice versa.
> Java Opts:  -Xmx3G -Xms768m -XX:MaxPermSize=512m
> Java version: 1.7.0_21 x64
> Java vendor: Oracle Corporation OS name: WindowsMySQL = datamart
> completed in 3hrs 46min 12sec
> PostgreSQL = datamart completed in 2hrs 5min 16sec
>
> So, it is obvious that PostgreSQL is doing datamart much faster. The
> advantages might scale better if larger number of datavalues
> One could argue MySQL 5.6 has many performance improvements, I didn't have
> time to explore that.
> The migration to PostgreSQL has some challenges. Following are steps I
> followed:
>
>  - Take the mysqldump
>  - replace bit(1) to tinyint(1) in the SQL file
>  - You'll see that column names are camelCase. This is an issue because
> postgres will added a double quotes around to get case-sensitivity, which
> MySQL by default nicely excludes.
> So you'll have to make all column names to lowercase and remove the quote
> characters. I did this with a simple java program. There are 150-odd column
> names that need changes.
>  - Used Navcat premium (trial version or SQLSquirrel also has this
> feature). "Data transfer" is the name of the feature that will move data
> from MySQL to Postgres
>  - In MySQL non-standard use of boolean (which came only a few yrs back),
> its converted to smallint in Postgres. I wrote a JDBC program to change
> column type from smallint to boolean. A single table example is as follows
> that can be made into a looping procedure as well in pure PSQL.
> ALTER TABLE indicator ALTER COLUMN annualized TYPE boolean
>     USING CASE WHEN annualized = 0 THEN FALSE
>            WHEN annualized = 1 THEN TRUE
>            ELSE NULL
>     END;
>  - remove NULL values from minimumvalue column of minmaxdatalement table
> DELETE from minmaxdatalement WHERE minimumvalue=NULL
>  - remove NULL values from maximumvalue column of minmaxdatalement table
> DELETE from minmaxdatalement WHERE maximumvalue=NULL
>  - remove NULL values from name column of relationshiptype table
> DELETE from relationshiptype WHERE name=NULL
>  - blobs to bytea conversion is a mess and I had to truncate. Probably a
> JDBC based connector program will do better conversion, but I just
> truncated it and accepted the data loss to systemsetting and usersetting :-)
> *
> Devs*:
> We need to make all column names lowercase in hbm.xml files in code. This
> will ensure portability and is generally a good practice.
> We should also have a convention of using last_updated instead of
> lastUpdated in column names, as is the common practice.
>
> ---
> Regards,
> Saptarshi PURKAYASTHA
>
> My Tech Blog:  http://sunnytalkstech.blogspot.com
> You Live by CHOICE, Not by CHANCE
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-devs
> Post to     : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-devs
> More help   : https://help.launchpad.net/ListHelp
>
>


-- 

Regards,
Bharath Kumar. Ch

References