← Back to team overview

dhis2-devs team mailing list archive

Unpivot with Postgres

 

Hi. Just wanted to share a bit of code with everyone, especially
report developers. It is quite common to need to "unpivot" data. For
instance, you may have a report table presented in one way, but you
might need to feed it to a report or Excel pivot table in normalized
format. Access/Microsoft SQL server has a very handy function for this
called "UNPIVOT", but AFAIK, there is not a corresponding function in
Postgresql. I have put together a few line of procedural SQL to
perform this purpose, albeit in not an generalized manner. Any
comments/suggestions would be welcome.

 I need to get some legacy  data into DHIS2, which first needs to be
first normalized, and then transformed into something DHIS2 can
understand (like DXF).

The source table looks like this.

area_code text,
  "year" double precision,
  quarter double precision,
  facility_type_code text,
  antenatal_first double precision,
  antenatal_reattendance double precision,
  pregnancies_tt double precision,
...

There are a lot more columns, each of which corresponds to a data
element. The columns need to be transformed to rows, with the column
name becoming an attribute, which will later be mapped to a
dataelementid.

I first created a normalized version of the table

CREATE TABLE hc_service_del_norm
(
  area_code text,
  "year" double precision,
  quarter double precision,
  facility_type_code text,
  attribute text,
  "value" text
)
WITH (OIDS=FALSE);
ALTER TABLE hc_service_del_norm OWNER TO postgres;

So, I will map the column names to attributes and place the
corresponding value in the "value" column.

Here is the function that does that.

-- Function: unpivot_hc_service_delivery()

-- DROP FUNCTION unpivot_hc_service_delivery();

CREATE OR REPLACE FUNCTION unpivot_hc_service_delivery()
  RETURNS integer AS
$BODY$
DECLARE rec record;
BEGIN
FOR  rec in

SELECT a.attname AS column_name, a.attname::text as this_attribute
FROM pg_class c, pg_attribute a, pg_type t
WHERE c.relname = 'hc_service_delivery'
    AND a.attnum > 0
    AND a.attrelid = c.oid
    AND a.atttypid = t.oid
    and a.attname NOT IN ('area_code','quarter'
,'facility_type_code','year', 'primkey')
LOOP

EXECUTE 'INSERT INTO hc_service_del_norm
(area_code,"year",quarter,facility_type_code, attribute, value)

 SELECT area_code, "year", quarter, facility_type_code,'''
	|| rec.this_attribute  || ''',' || rec.column_name ||
	' FROM hc_service_delivery';
RAISE NOTICE 'Column = %', rec.column_name;

END LOOP;
RETURN 1;
END;
  $BODY$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100;
ALTER FUNCTION unpivot_hc_service_delivery() OWNER TO postgres;


Note that I choose to ignore certain columns like area_code (will map
to organisationalunit), quarter and year (will be mapped to period).
facility_type_code and primkey are not required, so I will remove them
from the list of columns to be transformed. The function will loop
through each table, slicing out a single column, placing the column
name in the "attribute" field, and the value in the "value" field.

The function is called with "SELECT unpivot_hc_service_delivery()".

It would be nice to have a general function to do this, but have not
progressed this far yet. Anyway, this could be modified pretty easily
to perform these "unpivot" types of operations from data that has
already been crosstabbed. Maybe someone will find it useful.

Regards,
Jason


---
Jason P. Pickering
email: jason.p.pickering@xxxxxxxxx
tel:+260968395190