oship-dev team mailing list archive
Message #00293
Re: Automatic Python code generation [Re: adl2py fundamentals]
Hi, Tim:
For reference, I did all my tests in Linux Mint 7 Gloria (a distro
that is based on Ubuntu 9.04). I downloaded the latest version of
generateDS.py (1.17d) from sourceforge
(http://sourceforge.net/projects/generateds/) and installed it "the old
way" (sudo python setup.py install) without any problems.
Archetype.xsd depends on (i.e. <xs:include>s) Resource.xsd and, in
turn, Resource.xsd depends on BaseTypes.xsd. So I merged these 3 files
into a single one (using the "process_includes.py" utility supplied with
generateDS.py, which depends on the "python-lxml" package) and I also
put the line <xs:element name="archetype" type="ARCHETYPE"/> at the top
of the file, to ensure that "archetype" would be correctly taken as
the root object (by the generateDS.py executable).
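To illustrate what flattening an <xs:include> chain involves, here is a minimal sketch using only the standard library. It is not how process_includes.py works internally; the three schema bodies are drastically reduced, hypothetical stand-ins for the real files, and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def _schema(body):
    """Wrap a schema body in an xs:schema element (helper for the stand-ins)."""
    return '<xs:schema xmlns:xs="{0}">{1}</xs:schema>'.format(XS, body)


# Hypothetical, heavily simplified stand-ins for the three openEHR schema files.
SCHEMAS = {
    "Archetype.xsd": _schema(
        '<xs:include schemaLocation="Resource.xsd"/>'
        '<xs:complexType name="ARCHETYPE"/>'
    ),
    "Resource.xsd": _schema(
        '<xs:include schemaLocation="BaseTypes.xsd"/>'
        '<xs:complexType name="RESOURCE_DESCRIPTION"/>'
    ),
    "BaseTypes.xsd": _schema('<xs:complexType name="CODE_PHRASE"/>'),
}


def inline_includes(name, seen=None):
    """Return the top-level declarations of `name`, with includes expanded."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against circular or repeated includes
        return []
    seen.add(name)
    merged = []
    for child in ET.fromstring(SCHEMAS[name]):
        if child.tag == "{%s}include" % XS:
            merged.extend(inline_includes(child.get("schemaLocation"), seen))
        else:
            merged.append(child)
    return merged


def build_single_schema(name, root_element, root_type):
    """Build one schema: the root element declaration first, then all types."""
    ET.register_namespace("xs", XS)
    schema = ET.Element("{%s}schema" % XS)
    ET.SubElement(schema, "{%s}element" % XS, name=root_element, type=root_type)
    schema.extend(inline_includes(name))
    return schema


combined = build_single_schema("Archetype.xsd", "archetype", "ARCHETYPE")
names = [child.get("name") for child in combined]
print(ET.tostring(combined, encoding="unicode"))
```

The root element declaration is added first, mirroring the manual step described above; the included types follow in dependency order.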
Let's recall that the XML version of openEHR (either 1.0.1 or 1.0.2)
is less thoroughly tested than the ADL one, so it's normal to find some
small typos and inconsistencies here and there. After a few minor
corrections to the .xsd files (BaseTypes.xsd is wrongly referenced as
"basetypes.xsd" in Resource.xsd; maxOccurs="unbounded" had to be added
to the "parent_resource" element), a huge (~400kB) "classes" file was
quickly generated (in less than 1 sec.) by the program. This file already
includes 2 important utility functions: "parse" (which processes a .xml
file compatible with the .xsd schema and returns an "XML copy" of it) and
"parseLiteral" (which processes the same .xml file and returns a "Python
literal" version of it -- see example below).
I called this output file "test.py" (not very creative, I know...). I
edited it to use the "parseLiteral" function (instead of the default
"parse") in its main() function and then I tried to parse our openEHR XML
files with it. After (again) some minor corrections, this time to the .xml
files ("C_CODE_PHRASE" was replaced by "CODE_PHRASE"; "C_DV_QUANTITY"
was replaced by "DV_QUANTITY"), I finally got (e.g.) the following
output from the "openEHR-EHR-COMPOSITION.encounter.v1.xml" file:
-----snip-----
from test import *
rootObj = archetype(
original_language=model_.CODE_PHRASE(
terminology_id=model_.TERMINOLOGY_ID(
),
code_string='en',
),
is_controlled=None,
description=model_.RESOURCE_DESCRIPTION(
original_author=[
model_.original_author(
id = name,
valueOf_ = "Thomas Beale",
),
model_.original_author(
id = organisation,
valueOf_ = "Ocean Informatics",
),
model_.original_author(
id = date,
valueOf_ = "2005-10-10",
),
],
other_contributors=[
],
lifecycle_state='AuthorDraft',
resource_package_uri='None',
other_details=[
],
details=[
model_.details(
language=model_.CODE_PHRASE(
terminology_id=model_.TERMINOLOGY_ID(
),
code_string='en',
),
purpose='Record of encounter as a progress note.',
keywords=[
'progress',
'note',
'encounter',
],
use='',
misuse='',
copyright='None',
original_resource_uri=[
],
other_details=[
],
),
],
parent_resource=[
],
),
translations=[
],
archetype_id=model_.ARCHETYPE_ID(
),
adl_version='1.4',
concept='at0000',
definition=model_.C_COMPLEX_OBJECT(
valueOf_ = "",
rm_type_name='COMPOSITION',
occurrences=model_.IntervalOfInteger(
lower_included=True,
upper_included=True,
lower_unbounded=False,
upper_unbounded=False,
lower=1,
upper=1,
),
node_id='at0000',
valueOf_ = "",
attributes=[
model_.attributes(
),
],
),
invariants=[
],
ontology=model_.ARCHETYPE_ONTOLOGY(
term_definitions=[
model_.term_definitions(
language = en,
items=[
model_.items(
code = at0000,
items=[
model_.items(
id = description,
valueOf_ = "Generic encounter or progress note composition",
),
model_.items(
id = text,
valueOf_ = "Encounter",
),
],
),
],
),
],
constraint_definitions=[
],
term_bindings=[
],
constraint_bindings=[
],
),
)
-----/snip-----
Please note that if one tries to "run" this code in Python, it will
complain that "valueOf_" is passed twice inside "definition". I
suppose that this is also a minor problem with the XML schema
("definition" is a C_COMPLEX_OBJECT, C_COMPLEX_OBJECT extends
C_DEFINED_OBJECT, C_DEFINED_OBJECT extends C_OBJECT etc). In any case,
the error messages given by generateDS.py during all these tests were
informative enough to help me find these "minor" errors, even
without any prior knowledge of the schema's details.
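The "valueOf_" complaint can be reproduced in isolation. The sketch below assumes nothing about generateDS.py itself: it uses the built-in dict() as a stand-in for the generated class constructor, and check() is a hypothetical helper. CPython rejects a repeated keyword argument at compile time, before any code runs.

```python
def check(source):
    """Compile `source` and return the SyntaxError message, or None if it is fine."""
    try:
        compile(source, "<generated>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


# dict() stands in for the generated C_COMPLEX_OBJECT constructor.
bad = 'obj = dict(valueOf_="", rm_type_name="COMPOSITION", valueOf_="")'
good = 'obj = dict(valueOf_="", rm_type_name="COMPOSITION")'

print(check(bad))   # a "keyword argument repeated" message
print(check(good))  # None
```

Because the error is raised while compiling, a check like this can flag a generated literal as invalid without importing the generated classes at all.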
My opinion is that this approach deserves to be further investigated
(maybe on a new Blueprint?), at least as a way to cross-check the XML
against the ADL. The only downside I found was that Unicode strings
(like the German definitions in
openEHR-EHR-OBSERVATION.blood_pressure.v1.xml) were not properly
handled, but maybe this is because I am doing something wrong -- I am
still "discovering" the program and the schemas (schemata?).
Cheers,
Roberto.
Tim Cook wrote:
Hi Roberto,
If you have time, could you please install this app, run it
against the blood pressure XML files[1], and then send me the Python
output so I can compare it to the ADL?
I used generateDS several years ago without much success.
[1] The schemas are here:
http://www.openehr.org/releases/1.0.2/its/XML-schema/index.html
The archetypes are:
openEHR-EHR-CLUSTER.device.v1.adl
openEHR-EHR-CLUSTER.level_of_exertion.v1.adl
openEHR-EHR-COMPOSITION.encounter.v1.adl
openEHR-EHR-OBSERVATION.blood_pressure.v1.adl
Their XML representations can be found in their categories at:
http://www.openehr.org/svn/knowledge/archetypes/dev-uk-nhs/gen/xml/openehr/ehr/
Thanks,
Tim
On Mon, 2009-07-06 at 15:13 +0200, Roberto Siqueira wrote:
Hi, all:
I was checking the state of the art of the openEHR XML representation
(http://www.openehr.org/releases/1.0.1/its/XML-schema/index.html) and
also reviewing the different XML modules available in Python to
represent data (DOM, objectify, ElementTree, lxml etc) when I found
this: http://www.rexx.com/~dkuhlman/generateDS.html -- a module that
generates Python classes from XML schemas (XSD files). It's not exactly
what we were looking for back then in April, but it may be useful anyway.
Please have a look at it when you have some time.
Best regards,
Roberto.
On 23.04.2009 21:55, Roberto Siqueira wrote:
[...] By the way: generation of Python code using Python itself is
what is called metaprogramming
(http://en.wikipedia.org/wiki/Metaprogramming). It would be wonderful
to find some sort of Python "metaprogrammer" ("disassembler" or
"decompiler") library ready to use, don't you think? Unfortunately,
the only ones I've found (up to now) are "low level" bytecode
disassemblers like: http://docs.python.org/library/dis.html , which
cannot decompile "high level" objects like classes, mixins
etc. In any case, I suppose that the small "helper class" described
in: http://effbot.org/zone/python-code-generator.htm will have some
utility here, as handling indentation can be very cumbersome,
sometimes. [...]
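As a rough illustration of the kind of indentation helper mentioned above, here is a minimal sketch. It is not the class from the effbot page; CodeWriter and its methods are hypothetical names chosen for this example, and the generated CodePhrase class is only a toy.

```python
class CodeWriter:
    """Hypothetical helper that tracks the indentation level while emitting code."""

    def __init__(self, tab="    "):
        self.tab = tab
        self.level = 0
        self.lines = []

    def write(self, line):
        # Prefix each line with the current indentation.
        self.lines.append(self.tab * self.level + line)

    def indent(self):
        self.level += 1

    def dedent(self):
        if self.level == 0:
            raise ValueError("cannot dedent below column 0")
        self.level -= 1

    def getvalue(self):
        return "\n".join(self.lines) + "\n"


out = CodeWriter()
out.write("class CodePhrase(object):")
out.indent()
out.write("def __init__(self, code_string):")
out.indent()
out.write("self.code_string = code_string")
out.dedent()
out.dedent()

source = out.getvalue()
namespace = {}
exec(source, namespace)  # run the generated code to confirm it is valid Python
phrase = namespace["CodePhrase"]("en")
print(source)
```

Keeping the indentation level in one place means the code that walks the model never has to build whitespace by hand.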