dhis2-devs team mailing list archive

[Branch ~dhis2-documenters/dhis2/dhis2-docbook-docs] Rev 330: Added section on postgres performance tuning and section on database backups

 

------------------------------------------------------------
revno: 330
committer: Lars Helge Overland <larshelge@xxxxxxxxx>
branch nick: dhis2-docbook-docs
timestamp: Mon 2011-06-13 23:29:51 +0200
message:
  Added section on postgres performance tuning and section on database backups
modified:
  src/docbkx/en/dhis2_implementation_guide_installation.xml


--
lp:~dhis2-documenters/dhis2/dhis2-docbook-docs
https://code.launchpad.net/~dhis2-documenters/dhis2/dhis2-docbook-docs

=== modified file 'src/docbkx/en/dhis2_implementation_guide_installation.xml'
--- src/docbkx/en/dhis2_implementation_guide_installation.xml	2011-04-28 10:51:50 +0000
+++ src/docbkx/en/dhis2_implementation_guide_installation.xml	2011-06-13 21:29:51 +0000
@@ -4,7 +4,7 @@
   <title>Installation</title>
   <para>The installation chapter provides information on how to install DHIS 2 in various contexts, including online central server, offline local network, standalone application and self-contained package called DHIS 2 Live.</para>
   <para>DHIS 2 runs on all platforms for which there exists a Java Runtime Environment version 6 or higher, which includes most popular operating systems such as Windows, Linux and Mac. DHIS 2 also runs on many relational database systems such as PostgreSQL, MySQL, H2 and Derby. DHIS 2 is packaged as a standard Java Web Archive (WAR-file) and thus runs on any Servlet containers such as Tomcat and Jetty.</para>
-  <para>The DHIS 2 team recommends Ubuntu 10.1.0 operating system, PostgreSQL database system and Tomcat Servlet container as the preferred environment for server installations. The mentioned frameworks can be regarded as market leaders within their domain and is heavily field tested over many years.</para>
+  <para>The DHIS 2 team recommends the Ubuntu 10.10 operating system, the PostgreSQL database system and the Tomcat Servlet container as the preferred environment for server installations. The mentioned frameworks can be regarded as market leaders within their domains and have been heavily field tested over many years.</para>
   <para>This chapter provides a guide for setting up the above technology stack. It should however be read as a guide for getting up and running and not as an exhaustive documentation for the mentioned environment. We refer to the offical Ubuntu, PostgreSQL and Tomcat documentation for in-depth reading.</para>
   <section>
     <title>Server setup</title>
@@ -17,7 +17,21 @@
     <para>Install PostgreSQL by invoking <code>sudo apt-get install postgresql-8.4</code></para>
     <para>Set the password for the postgres Unix user by invoking <code>sudo passwd postgres</code> and following the instructions. Switch to the postgres user by invoking <code>su postgres</code> and entering the password when prompted.</para>
     <para>Log into psql by invoking <code>psql</code> Create a user called <emphasis role="italic">dhis</emphasis>  by invoking <code>create user dhis with password &lt;dhis&gt;</code> Replace the password <emphasis role="italic">&lt;dhis&gt;</emphasis> with something secure. Create a database by invoking <code>create database dhis2 with owner dhis encoding &apos;utf&apos;</code> Exit psql by invoking <code>\q</code> Return to your session by invoking <code>exit</code> You now have a PostgreSQL user called <emphasis role="italic">dhis</emphasis> and a database called <emphasis role="italic">dhis2</emphasis>.</para>
-    <para>Do basic performance tuning by increasing the operating system kernel shared memory by opening file /etc/sysctl.conf and adding the line <emphasis role="italic">kernel.shmmax = 1073741824</emphasis> at the end of it. Make the change take effect by invoking <code>sysctl -p</code>  Then open  file <emphasis role="italic">/etc/postgresql/8.4/main/postgresql.conf</emphasis> and set the following properties: <code/><code>shared_buffers = 512MB </code><code>| effective_cache_size = 3750MB | checkpoint_segments = 15 | checkpoint_completion_target = 0.8 | wal_buffers = 4MB | synchronous_commit = off | wal_writer_delay = 10000ms </code></para>
+    <para>Do basic performance tuning by increasing the operating system kernel shared memory by opening file /etc/sysctl.conf and adding the line <emphasis role="italic">kernel.shmmax = 1073741824</emphasis> at the end of it. Make the change take effect by invoking <code>sysctl -p</code>  Then open  file <emphasis role="italic">/etc/postgresql/8.4/main/postgresql.conf</emphasis> and set the following properties: </para>
+    <para><code>shared_buffers = 512MB</code></para>
+    <para>Determines how much memory PostgreSQL can use for caching of query data. The default is too low because it depends on kernel shared memory, which is low on some operating systems.</para>
+    <para><code>effective_cache_size = 3500MB</code></para>
+    <para>An estimate of how much memory is available for caching (not an allocation). PostgreSQL uses it to determine whether a query plan will fit into memory or not; setting it too high might result in unpredictable and slow behavior.</para>
+    <para><code>checkpoint_segments = 32</code></para>
+    <para>PostgreSQL writes new transactions to log files called WAL segments, which are 16MB in size. When a number of segments have been written a checkpoint occurs. Setting this number to a larger value makes checkpoints less frequent and will thus improve performance for write-heavy systems such as DHIS 2.</para>
+    <para><code>checkpoint_completion_target = 0.8</code></para>
+    <para>Determines the fraction of the interval between checkpoints over which the checkpoint writes are spread. Setting this to a high value will thus spread the writes out and lower the average write overhead.</para>
+    <para><code>wal_buffers = 4MB</code></para>
+    <para>Sets the memory used for buffering during the WAL write process. Increasing this value might improve throughput in write-heavy systems.</para>
+    <para><code>synchronous_commit = off</code></para>
+    <para>Specifies whether transaction commits will wait for WAL records to be written to the disk before returning to the client or not. Setting this to off will improve performance considerably. It also implies that there is a slight delay between the transaction being reported successful to the client and it actually being safe. The database state cannot be corrupted, however, so this is a good alternative for performance-intensive and write-heavy systems like DHIS 2.</para>
+    <para><code>wal_writer_delay = 10000ms</code></para>
+    <para>Specifies the delay between WAL write operations. Setting this to a high value will improve performance on write-heavy systems since potentially many write operations can be executed within a single flush to disk.</para>
     <para>Restart PostgreSQL by invoking <code>sudo /etc/init.d/postgresql restart</code></para>
     <para><emphasis role="bold">Set the database configuration</emphasis></para>
     <para>The database connection information is provided to DHIS 2 through a configuration file called <emphasis role="italic">hibernate.properties</emphasis>. Create this file and save it in a convenient location. A  file corresponding to the above setup has these properties: </para>
@@ -28,6 +42,7 @@
     <para>Open file <emphasis role="italic">bin/setclasspath.sh</emphasis> and add the lines below. The first will set the location of your Java Runtime Environment, the second will dedicate memory to Tomcat and the third will set the location for where DHIS 2 will search for the <emphasis role="italic">hibernate.properties</emphasis> configuration file, note that you should adjust this to your environment:</para>
     <para><code>JAVA_HOME=&apos;/usr/lib/jvm/java-6-sun&apos; | JAVA_OPTS=&apos;-Xmx6000m -XX:MaxPermSize=1000m&apos; | DHIS2_HOME=&apos;/home/dhis/config&apos;</code></para>
     <para>To do basic performance tuning you can install the native  <emphasis role="italic">APR</emphasis> library by invoking <code>sudo apt-get install libtcnative-1</code> Then open file<emphasis role="italic"> bin/setclasspath.sh</emphasis> and add this line at the end of the file: <emphasis role="italic">LD_LIBRARY_PATH=/usr/lib:$LD_LIBRARY_PATH</emphasis></para>
+    <para>If you need to change the <emphasis role="italic">port</emphasis> on which Tomcat listens for requests, open the Tomcat configuration file <emphasis role="italic">/conf/server.xml</emphasis>, locate the <emphasis role="italic">&lt;Connector&gt;</emphasis> element which is not commented out and change the <emphasis role="italic">port</emphasis> attribute value to the desired port number.</para>
     <para><emphasis role="bold">Run DHIS 2</emphasis></para>
     <para>Make the startup script executable by invoking <code>chmod 755 bin/*</code> DHIS 2 can now be started by invoking <code>bin/startup.sh</code> The log can be monitored by invoking <code>tail -f logs/catalina.out</code> DHIS 2 can be stopped by invoking <code>bin/shutdown.sh</code></para>
   </section>
@@ -37,4 +52,12 @@
     <para>To install start by downloading DHIS 2 Live from <emphasis role="italic">http://dhis2.org</emphasis> and extract the archive to any location. On Windows click the executable archive. On Linux invoke the startup.sh script. After the startup process is done your default web browser will automtically be pointed to  <emphasis role="italic">http://localhost:8082</emphasis> where the application is accessible. A system tray menu is accessible on most operating systems where you can start and stop the server and start new browser sesssions. Please note that if you have the server running there is no need to start it again, simply open the application from the tray  menu.</para>
     <para>DHIS 2 Live is running on an embedded Jetty servlet container and an embedded H2 database. However it can easily be configured to run on other database systems such as PostgreSQL. Please read the section above about server installations for an explanation of the database configuration. The <emphasis role="italic">hibernate.properties</emphasis> configuration file is located in the <emphasis role="italic">conf</emphasis> folder. Remember to restart the Live package for your changes to take effect. The server port is 8082 by default. This can be changed by modifying the value in the<emphasis role="italic"> jetty.port</emphasis> configuration file located in the <emphasis role="italic">conf</emphasis> directory.</para>
   </section>
+  <section>
+    <title>Backup</title>
+    <para>Automated database backups are an absolute must for information systems in production, and neglecting them might have uncomfortable consequences. Backups have two main purposes: the primary purpose is data recovery in case data is lost; the secondary purpose is archiving of data for a historical period of time.</para>
+    <para>Backup should be central in a disaster recovery plan. Even though such a plan should cover additional subjects, the database is the key component to consider since this is where all data used in the DHIS 2 application is stored. Most other parts of the IT infrastructure surrounding the application can be restored based on standard components.</para>
+    <para>There are of course many ways to set up backup; however the following describes a setup where the database is copied into a dump file and saved on the file system. This can be considered a <emphasis role="italic">full</emphasis> backup. The backup is done with a <emphasis role="italic">cron job</emphasis>, which is a time-based scheduler in Unix/Linux operating systems.</para>
+    <remark>You can download both files from http://dhis2.com/download/pg_backup.zip</remark>
+    <para>The cron job is set up with two files. The first is a <emphasis role="italic">script</emphasis> which performs the actual task of backing up the database. It uses a PostgreSQL program called <emphasis role="italic">pg_dump</emphasis> for creating the database copy. The second is a crontab file which runs the backup script every day at 23:00.</para>
+  </section>
 </chapter>
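
For readers of this revision: the Backup section describes a pg_dump script driven by a crontab entry, but the two files themselves are not part of the diff. The following is only a minimal sketch of what such a pair could look like, not the contents of the referenced pg_backup.zip. The backup directory, the script path in the crontab line, and the function names are assumptions made for illustration; the database and user names are taken from the installation section above. It also assumes the password is supplied non-interactively, for example through a ~/.pgpass file.

```shell
#!/bin/sh
# Hypothetical backup script sketch; paths and names are assumptions,
# not the contents of the pg_backup.zip mentioned in the documentation.
set -eu

DB_NAME="${DB_NAME:-dhis2}"                      # database created in the server setup section
DB_USER="${DB_USER:-dhis}"                       # PostgreSQL user created in the server setup section
BACKUP_DIR="${BACKUP_DIR:-/var/backups/dhis2}"   # assumed location for dump files

# Build the dump file path from a database name ($1) and a date stamp ($2).
backup_file_name() {
    printf '%s/%s-%s.sql\n' "$BACKUP_DIR" "$1" "$2"
}

# Write a full plain-SQL dump of the database into today's dump file.
run_backup() {
    mkdir -p "$BACKUP_DIR"
    target="$(backup_file_name "$DB_NAME" "$(date +%Y-%m-%d)")"
    pg_dump -U "$DB_USER" -f "$target" "$DB_NAME"
    echo "$target"
}

# Only perform the dump when invoked with the argument "run".
if [ "${1:-}" = "run" ]; then
    run_backup
fi

# Matching crontab entry (assumed install path), every day at 23:00:
# 0 23 * * * /usr/local/bin/pg_backup.sh run
```

Because each dump file carries a date stamp, every nightly run produces a separate full backup; pruning old files is left out of this sketch.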