dhis2-devs team mailing list archive
-
dhis2-devs team
-
Mailing list archive
-
Message #22447
Re: Migrating from MySQL to PostgreSQL
Thanks Saptarshi, this is very helpful.
On Tue, May 7, 2013 at 6:23 PM, Saptarshi Purkayastha <sunbiz@xxxxxxxxx>wrote:
> Hello DHIS users and devs,
>
> Apologies for the long email... Should have probably been multiple
> blogpost to reduce its length...
>
> I recently encountered a situation with a very large implementation of
> DHIS2 having problems generating data mart.
> Thus no reports were generated and only data entry was being done. I
> thought I'd share some of the experiences to solve these issues, so that it
> might be useful to other implementers.
> Some changes will be needed in the DHIS2 source, so sending this to the
> dev list also, where dev-related discussions can follow-up.
>
> While PostgreSQL is our recommended database, many implementations have
> also used MySQL.
> My findings clearly highlight that DHIS2 performs much better on
> PostgreSQL and there are also some bugs related to MySQL dialect.
> Total org units - 26303
> Total Monthly datasets - 9
> Total Daily dataset - 1
> Total Yearly dataset - 3
>
> The implementation has about 34 million datavalues (non-zero)... but I
> pruned it for my benchmarking. I added 1-million datavalues and ran the
> data mart.
> The results are from my fairly good laptop (quad-core i7; 8GB RAM; tuned
> JVM; tuned MySQL 5.5 (4GB RAM); tuned PostgreSQL 9.2 (4GB RAM); 240GB SSD)
> Using DHIS2 2.11. When doing MySQL benchmark turned off all services
> including postgres and vice versa.
> Java Opts: -Xmx3G -Xms768m -XX:MaxPermSize=512m
> Java version: 1.7.0_21 x64
> Java vendor: Oracle Corporation OS name: WindowsMySQL = datamart
> completed in 3hrs 46min 12sec
> PostgreSQL = datamart completed in 2hrs 5min 16sec
>
> So, it is obvious that PostgreSQL is doing datamart much faster. The
> advantages might scale better if larger number of datavalues
> One could argue MySQL 5.6 has many performance improvements, I didn't have
> time to explore that.
> The migration to PostgreSQL has some challenges. Following are steps I
> followed:
>
> - Take the mysqldump
> - replace bit(1) to tinyint(1) in the SQL file
> - You'll see that column names are camelCase. This is an issue because
> postgres will added a double quotes around to get case-sensitivity, which
> MySQL by default nicely excludes.
> So you'll have to make all column names to lowercase and remove the quote
> characters. I did this with a simple java program. There are 150-odd column
> names that need changes.
> - Used Navcat premium (trial version or SQLSquirrel also has this
> feature). "Data transfer" is the name of the feature that will move data
> from MySQL to Postgres
> - In MySQL non-standard use of boolean (which came only a few yrs back),
> its converted to smallint in Postgres. I wrote a JDBC program to change
> column type from smallint to boolean. A single table example is as follows
> that can be made into a looping procedure as well in pure PSQL.
> ALTER TABLE indicator ALTER COLUMN annualized TYPE boolean
> USING CASE WHEN annualized = 0 THEN FALSE
> WHEN annualized = 1 THEN TRUE
> ELSE NULL
> END;
> - remove NULL values from minimumvalue column of minmaxdatalement table
> DELETE from minmaxdatalement WHERE minimumvalue=NULL
> - remove NULL values from maximumvalue column of minmaxdatalement table
> DELETE from minmaxdatalement WHERE maximumvalue=NULL
> - remove NULL values from name column of relationshiptype table
> DELETE from relationshiptype WHERE name=NULL
> - blobs to bytea conversion is a mess and I had to truncate. Probably a
> JDBC based connector program will do better conversion, but I just
> truncated it and accepted the data loss to systemsetting and usersetting :-)
> *
> Devs*:
> We need to make all column names lowercase in hbm.xml files in code. This
> will ensure portability and is generally a good practice.
> We should also have a convention of using last_updated instead of
> lastUpdated in column names, as is the common practice.
>
> ---
> Regards,
> Saptarshi PURKAYASTHA
>
> My Tech Blog: http://sunnytalkstech.blogspot.com
> You Live by CHOICE, Not by CHANCE
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-devs
> Post to : dhis2-devs@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~dhis2-devs
> More help : https://help.launchpad.net/ListHelp
>
>
--
Regards,
Bharath Kumar. Ch
References