dhis2-devs team mailing list archive
Message #12723
[Branch ~dhis2-documenters/dhis2/dhis2-docbook-docs] Rev 361: Improved data warehouse chapter
------------------------------------------------------------
revno: 361
committer: Lars Helge Overland <larshelge@xxxxxxxxx>
branch nick: dhis2-docbook-docs
timestamp: Tue 2011-06-21 23:38:28 +0200
message:
Improved data warehouse chapter
added:
src/docbkx/en/resources/images/implementation_guide/data_warehouse.png
src/docbkx/en/resources/images/implementation_guide/dhis_data_warehouse.png
src/docbkx/en/resources/images/implementation_guide/dimensional_approach.png
modified:
src/docbkx/en/dhis2_implementation_guide_data_warehouse.xml
--
lp:~dhis2-documenters/dhis2/dhis2-docbook-docs
https://code.launchpad.net/~dhis2-documenters/dhis2/dhis2-docbook-docs
Your team DHIS 2 developers is subscribed to branch lp:~dhis2-documenters/dhis2/dhis2-docbook-docs.
To unsubscribe from this branch go to https://code.launchpad.net/~dhis2-documenters/dhis2/dhis2-docbook-docs/+edit-subscription
=== modified file 'src/docbkx/en/dhis2_implementation_guide_data_warehouse.xml'
--- src/docbkx/en/dhis2_implementation_guide_data_warehouse.xml 2011-06-18 20:20:19 +0000
+++ src/docbkx/en/dhis2_implementation_guide_data_warehouse.xml 2011-06-21 21:38:28 +0000
@@ -1,6 +1,5 @@
<?xml version='1.0' encoding='UTF-8'?>
-<!-- This document was created with Syntext Serna Free. -->
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" []>
+<!-- This document was created with Syntext Serna Free. --><!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" []>
<chapter>
<title>DHIS 2 as Data Warehouse</title>
<para>This chapter will discuss the role and place of the DHIS 2 application in a system architecture context. It will show that DHIS 2 can serve the purpose of both a data warehouse and an operational system.</para>
@@ -8,6 +7,7 @@
<title>Data warehouses and operational systems</title>
<para>A <emphasis role="italic">data warehouse</emphasis> is commonly understood as a database used for analysis. Typically data is uploaded from various operational / transactional systems. Before data is loaded into the data warehouse it usually goes through various stages where it is cleaned of anomalies and redundancy and transformed to conform with the overall structure of the integrated database. Data is then made available for analysis, also known under terms such as <emphasis role="italic">data mining</emphasis> and <emphasis role="italic">online analytical processing</emphasis>. The data warehouse design is optimized for speed of data retrieval and analysis. To improve performance the data storage is often redundant in the sense that the data is stored both in its most granular form and in an aggregated (summarized) form.</para>
<para>A <emphasis role="italic">transactional system</emphasis> (or <emphasis role="italic">operational system</emphasis> from a data warehouse perspective) is a system that collects, stores and modifies low level data. This system is typically used on a day-to-day basis for data entry and validation. The design is optimized for fast insert and update performance.</para>
+ <graphic fileref="resources/images/implementation_guide/data_warehouse.png" width="80%" format="PNG" align="center"/>
<para>There are several benefits of maintaining a data warehouse, some of them being:</para>
<itemizedlist>
<listitem>
@@ -30,7 +30,20 @@
</listitem>
</itemizedlist>
<para>Due to the mentioned challenges it has lately become increasingly popular to merge the functions of the data warehouse and operational system, either into a single system which performs both tasks or with tightly integrated systems hosted together. With this approach the system provides functionality for data capture and validation as well as data analysis, and manages the process of converting low-level atomic data into aggregate data suitable for analysis. This sets high standards for the system and its design as it must provide appropriate performance for both of those functions; however, advances in hardware and parallel processing are increasingly making such an approach feasible.</para>
- <para>In this regard, the DHIS 2 application is designed to serve as a tool for both data capture, validation, analysis and presentation of data. It provides modules for all of the mentioned aspects, including data entry functionality and a wide array of analysis tools such as reports, charts, maps, pivot tables and dashboard. </para>
+ <para>In this regard, the DHIS 2 application is designed to serve as a tool for data capture, validation, analysis and presentation of data. It provides modules for all of the mentioned aspects, including data entry functionality and a wide array of analysis tools such as reports, charts, maps, pivot tables and dashboards.</para>
+ <para>In addition, DHIS 2 is part of a suite of interoperable health information systems which cover a wide range of needs and are all open-source software. DHIS 2 implements the standard for data and meta-data exchange in the health domain called SDMX-HD. There are many examples of operational systems which also implement this standard and can potentially feed data into DHIS 2:</para>
+ <itemizedlist>
+ <listitem>
+ <para>iHRIS: System for management of human resource data. Examples of data captured by this system which are relevant for a national data warehouse are "number of doctors", "number of nurses" and "total number of staff". This data is interesting to compare, for instance, to district performance.</para>
+ </listitem>
+ <listitem>
+ <para>OpenMRS: Medical record system used in hospitals. This system can potentially aggregate and export data on inpatient diseases to a national data warehouse.</para>
+ </listitem>
+ <listitem>
+ <para>OpenELIS: Laboratory enterprise information system. This system can generate and export data on number and outcome of laboratory tests.</para>
+ </listitem>
+ </itemizedlist>
+ <graphic fileref="resources/images/implementation_guide/dhis_data_warehouse.png" format="PNG" width="80%" align="center"/>
</section>
<section>
<title>Aggregation strategies in DHIS 2</title>
@@ -43,5 +56,6 @@
<para>There are two leading approaches for storing data in a data warehouse, namely the <emphasis role="italic">normalized</emphasis> and <emphasis role="italic">dimensional</emphasis> approach. DHIS 2 borrows a bit from the former but mostly from the latter. In the dimensional approach the data is partitioned into <emphasis role="italic">dimensions</emphasis> and <emphasis role="italic">facts</emphasis>. Facts generally refer to transactional numeric data while dimensions are the reference data that give context and meaning to the facts. The strict rules of this approach make it easy for users to understand the data warehouse structure and provide for good performance since few tables must be combined to produce meaningful analysis, while on the other hand they might make the system less flexible and harder to change.</para>
<para>
In DHIS 2 the facts correspond to the data value object in the data model. The data value captures data as numbers, yes/no or text. The <emphasis role="italic">compulsory dimensions</emphasis> which give meaning to the facts are the <emphasis role="italic">data element</emphasis>, <emphasis role="italic">organisation unit hierarchy</emphasis> and <emphasis role="italic">period</emphasis> dimensions. These dimensions are referred to as compulsory since they must be provided for all stored data records. DHIS 2 also has a custom dimensional model which makes it possible to represent any kind of dimensionality. This model must be defined prior to data capture. DHIS 2 also has a flexible model of groups and group sets which makes it possible to add custom dimensionality to the compulsory dimensions after data capture has taken place. You can read more about dimensionality in DHIS 2 in the chapter by the same name.</para>
+ <graphic fileref="resources/images/implementation_guide/dimensional_approach.png" width="80%" format="PNG" align="center"/>
</section>
</chapter>
=== added file 'src/docbkx/en/resources/images/implementation_guide/data_warehouse.png'
Binary files src/docbkx/en/resources/images/implementation_guide/data_warehouse.png 1970-01-01 00:00:00 +0000 and src/docbkx/en/resources/images/implementation_guide/data_warehouse.png 2011-06-21 21:38:28 +0000 differ
=== added file 'src/docbkx/en/resources/images/implementation_guide/dhis_data_warehouse.png'
Binary files src/docbkx/en/resources/images/implementation_guide/dhis_data_warehouse.png 1970-01-01 00:00:00 +0000 and src/docbkx/en/resources/images/implementation_guide/dhis_data_warehouse.png 2011-06-21 21:38:28 +0000 differ
=== added file 'src/docbkx/en/resources/images/implementation_guide/dimensional_approach.png'
Binary files src/docbkx/en/resources/images/implementation_guide/dimensional_approach.png 1970-01-01 00:00:00 +0000 and src/docbkx/en/resources/images/implementation_guide/dimensional_approach.png 2011-06-21 21:38:28 +0000 differ
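
Illustrative note on the dimensional approach described in the chapter above. The following is a minimal Python sketch, not DHIS 2 code: it models a fact (data value) keyed by the three compulsory dimensions (data element, organisation unit, period) and sums facts up an organisation unit hierarchy. The class name, the function names, the example units, the period format and the choice of summation as the aggregation operator are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class DataValue(NamedTuple):
    """One fact, keyed by the three compulsory dimensions."""
    data_element: str
    org_unit: str
    period: str  # e.g. "2011-05"; this period format is an assumption
    value: float


# Hypothetical organisation unit hierarchy, stored as child -> parent.
ORG_UNIT_PARENT = {
    "Clinic A": "District 1",
    "Clinic B": "District 1",
    "Clinic C": "District 2",
    "District 1": "Country",
    "District 2": "Country",
}


def ancestors_and_self(org_unit):
    """Yield the unit itself and every unit above it in the hierarchy."""
    while org_unit is not None:
        yield org_unit
        org_unit = ORG_UNIT_PARENT.get(org_unit)


def aggregate(values):
    """Sum facts up the organisation unit hierarchy.

    Returns a dict keyed by (data_element, org_unit, period), holding
    both the granular values and the aggregated (summarized) values.
    """
    totals = defaultdict(float)
    for v in values:
        for unit in ancestors_and_self(v.org_unit):
            totals[(v.data_element, unit, v.period)] += v.value
    return dict(totals)


facts = [
    DataValue("Malaria cases", "Clinic A", "2011-05", 12),
    DataValue("Malaria cases", "Clinic B", "2011-05", 7),
    DataValue("Malaria cases", "Clinic C", "2011-05", 5),
]
totals = aggregate(facts)
print(totals[("Malaria cases", "District 1", "2011-05")])  # 19.0
print(totals[("Malaria cases", "Country", "2011-05")])     # 24.0
```

The resulting table keeps the clinic-level rows alongside the district and country totals, which mirrors the redundant granular-plus-aggregated storage the chapter attributes to data warehouses.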