mlhim-specs-dev team mailing list archive
Message #00743
[Branch ~cdd-dev/cdd/trunk] Rev 263: Added tool to upload CCDs to HKCR. Working on creating CareEntry CCDs in xls2ccd.py
------------------------------------------------------------
revno: 263
committer: Timothy W. Cook <timothywayne.cook@xxxxxxxxx>
branch nick: cdd
timestamp: Sun 2012-09-23 08:57:38 -0300
message:
Added tool to upload CCDs to HKCR. Working on creating CareEntry CCDs in xls2ccd.py
removed:
src/xls2ccd/examples/Demography/Demography_NCI_Standard_Template 06_29_2012.xls
modified:
src/xls2ccd/README.txt
src/xls2ccd/examples/Demography/MLHIM_Std_Template-NCI-Demographics.xls
src/xls2ccd/examples/MLHIM_Std_Template/MLHIM_Std_Template-xls2ccd.xls
src/xls2ccd/examples/TestTemplate/MLHIM_Std_Template-Tests.xls
src/xls2ccd/mlhim2RM.py
src/xls2ccd/xls2ccd.py
templates/CDD-2.4.0.xmt
--
lp:cdd
https://code.launchpad.net/~cdd-dev/cdd/trunk
=== modified file 'src/xls2ccd/README.txt'
--- src/xls2ccd/README.txt 2012-09-14 12:44:18 +0000
+++ src/xls2ccd/README.txt 2012-09-23 11:57:38 +0000
@@ -1,27 +1,63 @@
xls2ccd.py
+
+********* Please read all of this short document *********
+
+
REQUIRES: Python 2.6/2.7 and xlrd
-
-This utility is used to create MLHIM CCDs from standard template, XLS downloads from the NCI CDE.
-https://cdebrowser.nci.nih.gov/CDEBrowser/
-
-Some pre-processing is required.
-
-Download a set of CDEs in .xls format using the "Available Downloads" link near the top-right of the page.
+Once you have installed Python:
+Download xlrd from http://pypi.python.org/pypi/xlrd
+Extract the archive and change to that directory.
+Execute: python setup.py install
+
+If you have problems installing xlrd, this may help.
+https://groups.google.com/forum/?fromgroups=#!topic/python-excel/jm536Kt7v90
+
+======================================= The Stuff You Came For =======================================
+
+xls2ccd.py is used to create simple MLHIM CCDs from a standard .xls template. It is based on XLS downloads from the NCI CDE:
+https://cdebrowser.nci.nih.gov/CDEBrowser/ and has since been modified and expanded.
+
+See the 'Information' sheet in the template.
+Do not be concerned with the numbers in row 2. They are used for development purposes.
+
+The NCI CDE information can still be used as follows:
+ Download a set of CDEs in .xls format using the "Available Downloads" link near the top-right of the page.
Select the set you want to convert from the links available on the "caDSR CDE Downloads" page.
Be sure that the filename contains "NCI_Standard_Template (MM)_(DD)_2012". Other formats have not been tested. Please let us know which dates of the Standard Template work for you.
Open the spreadsheet.
-Delete rows 1 - 10.
-Delete columns A & B.
-
-Save (in .xls format) the modified spreadsheet into the same directory as this utility.
+In the NCI template, copy from row 11 down, across columns C through M. Then paste this block into a copy of
+MLHIM_Std_Template-xls2ccd.xls, starting at row 2 and spanning columns J - T.
+
+Save the modified spreadsheet (in .xls format).
+
+Now change the Datatype column to reflect the MLHIM datatype you want for each CCD. Again, these are simple CCDs. I will show you later
+how to use them when creating larger, more common concept definitions.
Execute the tool with this commandline:
-python xls2ccd.py <filename>
-
-All CCDs created that are not of the CHARACTER or ALPHANUMERIC datatype will be flagged with an "R" as the first character of their filename.
-These must be reviewed manually before use. When all corrections are made, remove the "R" from the CCD element name and then save the file w/o the "R".
+python [path/to/]xls2ccd.py <filename>
+
+Replace [path/to/] with the path to where you extracted xls2ccd.py
+and <filename> with the name you used to save your spreadsheet.
+
+========= EXAMPLES ===========
+
+There is a folder labeled TestTemplate. This is a test template used in development of the tool.
+You can generate these CCDs to see how the generated schemas compare with the spreadsheet data.
+
+In the Demography folder there is a copy of the Demography_NCI_Standard_Template 06_29_2012.xls file that has been renamed
+to original-Demography_NCI_Standard_Template 06_29_2012.xls. The area to be copied is highlighted with a yellow background.
+The file MLHIM_Std_Template-NCI-Demographics.xls shows the results of pasting the original data into the MLHIM template and
+changing the datatype column 'O' to MLHIM datatypes.
+
+Direct all questions to:
+Preferred: https://launchpad.net/~mlhim-owners mailing list.
+
+Tim Cook timothywayne.cook@xxxxxxxxx
+Dr. Luciana Cavalini lutricav@xxxxxxxxxxxxxxx
+NOTE: replies to questions sent directly to Tim or Luciana will be CC'd to the mailing list
+in order to inform and promote project documentation.
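
The copy-and-paste step described in the README above (NCI rows 11 and down, columns C through M, into the MLHIM template at row 2 and down, columns J through T) can be pictured as a block copy between two grids. What follows is only a minimal sketch, assuming each sheet has already been loaded as a list of rows, each row a list of cell values. The function `paste_nci_block` and its constants are hypothetical names used for illustration; they are not part of xls2ccd.py, and the sketch does not read or write .xls files.

```python
# Hypothetical helper: not part of xls2ccd.py. Sheets are modelled as
# lists of rows, each row a list of cell values (0-based indexes).

NCI_FIRST_ROW = 10           # spreadsheet row 11
NCI_COLS = range(2, 13)      # spreadsheet columns C..M (11 columns)
TPL_FIRST_ROW = 1            # spreadsheet row 2
TPL_FIRST_COL = 9            # spreadsheet column J (block ends at T)


def paste_nci_block(nci_rows, template_rows):
    """Return a copy of template_rows with the NCI block pasted in."""
    result = [list(r) for r in template_rows]
    width = TPL_FIRST_COL + len(NCI_COLS)
    for offset, src in enumerate(nci_rows[NCI_FIRST_ROW:]):
        target = TPL_FIRST_ROW + offset
        # Grow the template downwards and sideways as needed.
        while len(result) <= target:
            result.append([])
        row = result[target]
        row.extend([''] * (width - len(row)))
        for i, col in enumerate(NCI_COLS):
            row[TPL_FIRST_COL + i] = src[col] if col < len(src) else ''
    return result
```

Both ranges are 11 columns wide, which is why the C - M block fits exactly into J - T.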
=== removed file 'src/xls2ccd/examples/Demography/Demography_NCI_Standard_Template 06_29_2012.xls'
Binary files src/xls2ccd/examples/Demography/Demography_NCI_Standard_Template 06_29_2012.xls 2012-09-15 22:10:45 +0000 and src/xls2ccd/examples/Demography/Demography_NCI_Standard_Template 06_29_2012.xls 1970-01-01 00:00:00 +0000 differ
=== modified file 'src/xls2ccd/examples/Demography/MLHIM_Std_Template-NCI-Demographics.xls'
Binary files src/xls2ccd/examples/Demography/MLHIM_Std_Template-NCI-Demographics.xls 2012-09-15 22:10:45 +0000 and src/xls2ccd/examples/Demography/MLHIM_Std_Template-NCI-Demographics.xls 2012-09-23 11:57:38 +0000 differ
=== modified file 'src/xls2ccd/examples/MLHIM_Std_Template/MLHIM_Std_Template-xls2ccd.xls'
Binary files src/xls2ccd/examples/MLHIM_Std_Template/MLHIM_Std_Template-xls2ccd.xls 2012-09-18 17:09:07 +0000 and src/xls2ccd/examples/MLHIM_Std_Template/MLHIM_Std_Template-xls2ccd.xls 2012-09-23 11:57:38 +0000 differ
=== modified file 'src/xls2ccd/examples/TestTemplate/MLHIM_Std_Template-Tests.xls'
Binary files src/xls2ccd/examples/TestTemplate/MLHIM_Std_Template-Tests.xls 2012-09-18 17:09:07 +0000 and src/xls2ccd/examples/TestTemplate/MLHIM_Std_Template-Tests.xls 2012-09-23 11:57:38 +0000 differ
=== modified file 'src/xls2ccd/mlhim2RM.py'
--- src/xls2ccd/mlhim2RM.py 2012-09-20 00:38:24 +0000
+++ src/xls2ccd/mlhim2RM.py 2012-09-23 11:57:38 +0000
@@ -196,13 +196,19 @@
#====================================================================
-def getCareEntryType(data_name, ct_name, docs, d_id, e_data_type, indent=0):
+def getCareEntryType(data_name, ct_name, docs, dt_id, e_data_type, indent=0):
"""
data_name - string to use for data_name
ct_name - uuid string for complexType.name
docs - string for documentation
+ dt_id - the uuid for the datatype name
+ e_data_type - ClusterType, ElementType or SlotType - in xls2ccd this is only ClusterType
indent - number of spaces to indent first line
"""
+
+ e_subject_type = "PartySelfType" #TODO for CDD add options for PartyIdentifiedType
+ es_id = str(uuid.uuid4()) #entry-subject id
+ ed_id = str(uuid.uuid4()) #entry-data id
defin_str = ''
padding = ('').rjust(indent)
@@ -216,21 +222,39 @@
#Entry
defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' name='language' type='xs:language' default='en-US'/>\n")
defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' name='encoding' type='xs:string' default='utf-8'/>\n")
- defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' ref='mlhim2:entry-subject'/>\n")
+ defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' ref='mlhim2:el-"+es_id+"'/>\n")
defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='0' ref='mlhim2:entry-provider'/>\n")
defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='unbounded' minOccurs='0' ref='mlhim2:other-participations'/>\n")
defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='0' ref='mlhim2:protocol-id'/>\n")
defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='0' name='current-state' type='xs:string'/>\n")
defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='0' ref='mlhim2:workflow-id'/>\n")
defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='0' ref='mlhim2:attestation'/>\n")
- defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' ref='mlhim2:entry-data'/>\n")
+ defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' ref='mlhim2:el-"+ed_id+"'/>\n")
defin_str += padding.rjust(indent+6) + ("</xs:sequence>\n")
defin_str += padding.rjust(indent+4) + ("</xs:restriction>\n")
defin_str += padding.rjust(indent+2) + ("</xs:complexContent>\n")
defin_str += padding.rjust(indent) + ("</xs:complexType>\n")
-
- return defin_str
+ defin_str += padding.rjust(indent+2) + ("<xs:element name='el-"+ed_id+"' substitutionGroup='mlhim2:entry-data' type='mlhim2:ct-"+ed_id+"'/>\n")
+ defin_str += padding.rjust(indent+2) + ("<xs:element name='el-"+es_id+"' substitutionGroup='mlhim2:entry-subject' type='mlhim2:ct-"+es_id+"'/>\n")
+
+ if e_subject_type == "PartyIdentifiedType":
+ pass #TODO for CDD
+ elif e_subject_type == "PartySelfType":
+ defin_str += getPartSelfType(indent)
+ else:
+ raise TypeError(e_subject_type + ": is not a valid CareEntry.entry-subject type.")
+
+ if e_data_type == "ElementType":
+ pass #TODO for CDD
+ elif e_data_type == "SlotType":
+ pass #TODO for CDD
+ elif e_data_type == "ClusterType":
+ defin_str += getClusterType(data_name, ed_id, dt_id, indent)
+ else:
+ raise TypeError(e_data_type + ": is not a valid CCD.definition type.")
+
+ return(defin_str)
def getDvBooleanType(data_name, ct_name, bool_values, docs, indent=0):
@@ -715,10 +739,10 @@
return dt_str
-def getElementType(e_id, d_id, indent):
+def getElementType(e_id, dt_id, indent):
defin_str = ''
padding = ('').rjust(indent)
- defin_str += padding.rjust(indent) + ("<xs:complexType name='ct-"+e_id+"'>\n")
+ defin_str += '\n\n'+padding.rjust(indent) + ("<xs:complexType name='ct-"+e_id+"'>\n")
defin_str += padding.rjust(indent+2) + ("<xs:complexContent>\n")
defin_str += padding.rjust(indent+4) + ("<xs:restriction base='mlhim2:ElementType'>\n")
defin_str += padding.rjust(indent+6) + ("<xs:sequence>\n")
@@ -727,7 +751,7 @@
defin_str += padding.rjust(indent+4) + ("</xs:restriction>\n")
defin_str += padding.rjust(indent+2) + ("</xs:complexContent>\n")
defin_str += padding.rjust(indent) + ("</xs:complexType>\n")
- defin_str += padding.rjust(indent) + ("<xs:element name='el-"+d_id+"' substitutionGroup='mlhim2:Element-dv' type='mlhim2:ct-"+d_id+"'/>\n")
+ defin_str += padding.rjust(indent) + ("<xs:element name='el-"+dt_id+"' substitutionGroup='mlhim2:Element-dv' type='mlhim2:ct-"+dt_id+"'/>\n")
return defin_str
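
The mlhim2RM.py changes above repeat one pattern several times: a UUID-named `el-` element is declared as a member of an `mlhim2` substitution group and typed by the `ct-` complexType with the same UUID. As a rough illustration only, that pattern could be factored into a single helper. The name `substitution_element` is hypothetical and does not exist in mlhim2RM.py; the sketch is written to run on Python 3 as well as the Python 2.6/2.7 the tool requires.

```python
import uuid


def substitution_element(el_id, group, indent=0):
    """Build one xs:element line that substitutes for mlhim2:<group>.

    Hypothetical helper, not part of mlhim2RM.py. The same id is used
    for the element name (el-<id>) and its complexType (ct-<id>).
    """
    return (' ' * indent
            + "<xs:element name='el-" + el_id + "' "
            + "substitutionGroup='mlhim2:" + group + "' "
            + "type='mlhim2:ct-" + el_id + "'/>\n")


if __name__ == '__main__':
    # Fresh ids for the entry-subject and entry-data elements.
    es_id = str(uuid.uuid4())
    ed_id = str(uuid.uuid4())
    print(substitution_element(ed_id, 'entry-data', 2), end='')
    print(substitution_element(es_id, 'entry-subject', 2), end='')
```

Keeping the string building in one place would make it harder for the element name and the type name to drift apart when ids are renamed, as they were in this revision (`d_id` to `dt_id`).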
=== modified file 'src/xls2ccd/xls2ccd.py'
--- src/xls2ccd/xls2ccd.py 2012-09-20 00:38:24 +0000
+++ src/xls2ccd/xls2ccd.py 2012-09-23 11:57:38 +0000
@@ -52,6 +52,8 @@
print 'Rows: ', s.nrows
bool_vals = []
+defin_type = None
+
for row in range(1,s.nrows):
values = []
@@ -59,8 +61,8 @@
values.append(s.cell(row,col).value)
if values[9]: #title, must be unique
currkey = values[9]
-
- ccd_dict[currkey] = [values[0],values[1],values[2],values[3],values[4],values[5],values[6],values[7],values[8],values[9],values[10],values[11],values[12],values[13],values[14],values[15],values[16],[],[]] # New Entry values
+ print values[9], values[27]
+ ccd_dict[currkey] = [values[0],values[1],values[2],values[3],values[4],values[5],values[6],values[7],values[8],values[9],values[10],values[11],values[12],values[13],values[14],values[15],values[16],[],[],[]] # New Entry values
else:
if values[17]: #DvString Enumerations
ccd_dict[currkey][17].append((values[17],values[18],values[19]))
@@ -70,6 +72,8 @@
elif values[25] != '': #DvBoolean true/false values
bool_vals.append((values[25],values[26]))
+ if values[27]:
+ ccd_dict[currkey][19].append(values[27])
for k in ccd_dict.keys():
size = len(ccd_dict[k])
@@ -105,23 +109,28 @@
if not ccd_dict[k][16]:
ccd_dict[k][16] = " Not Defined"
- #MLHIM ElementType
- e_id = str(uuid.uuid4())
+ #MLHIM CCD.definition type
+ defin_id = str(uuid.uuid4())
#MLHIM DataType
- d_id = str(uuid.uuid4())
-
- defin_type = "ElementType"
+ dt_id = str(uuid.uuid4())
+
+ if ccd_dict[k][19]:
+ defin_type = ccd_dict[k][19]
+ else:
+ defin_type = ["ElementType"]
+
defin_str = ''
# if the defin_type is an Entry type this is the entry-data type
+ # in the CDD this could also be an ElementType or ClusterType
e_data_type = "ClusterType"
schema = ccd_id + ".xsd"
#/ccd setup
title = ccd_dict[k][9]
- print ""
+ print defin_type
print "Generating: "+ title +" -- CCD ID = " + ccd_id
print ccd_dict[k][14]
ccd_catalog.write('{0:.<40}'.format(title) + ccd_id +'\n')
@@ -200,20 +209,20 @@
<xs:complexContent>
<xs:restriction base="mlhim2:CCDType">
<xs:sequence>
- <xs:element maxOccurs="1" minOccurs="1" ref="mlhim2:el-"""+e_id+""""/>
+ <xs:element maxOccurs="1" minOccurs="1" ref="mlhim2:el-"""+defin_id+""""/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
- <xs:element name='el-"""+e_id+"""' substitutionGroup="mlhim2:definition" type='mlhim2:ct-"""+e_id+"""'/>""")
-
-
- if defin_type == "ElementType":
- indent = 4
- xsd_file.write(mlhim2RM.getElementType(e_id, d_id, indent))
- elif defin_type == "CareEntryType":
- indent = 4
- xsd_file.write(mlhim2RM.getCareEntryType(data_name, e_id, docs, d_id, e_data_type, indent))
+ <xs:element name='el-"""+defin_id+"""' substitutionGroup="mlhim2:definition" type='mlhim2:ct-"""+defin_id+"""'/>""")
+
+
+ if defin_type[0] == "ElementType":
+ indent = 4
+ xsd_file.write(mlhim2RM.getElementType(defin_id, dt_id, indent))
+ elif defin_type[0] == "CareEntryType":
+ indent = 4
+ xsd_file.write(mlhim2RM.getCareEntryType(data_name, defin_id, dt_docs, dt_id, e_data_type, indent))
if dt == "DvBooleanType":
=== modified file 'templates/CDD-2.4.0.xmt'
Binary files templates/CDD-2.4.0.xmt 2012-09-14 12:44:18 +0000 and templates/CDD-2.4.0.xmt 2012-09-23 11:57:38 +0000 differ
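
The xls2ccd.py changes in this revision collect an optional definition type from column index 27 for each titled row group and fall back to "ElementType" when none is given. The sketch below shows that idea in isolation, under stated assumptions: rows are plain lists with at least 28 cells, a non-empty cell at index 9 starts a new group, and a definition type may appear on the title row or on any continuation row (the indentation in the diff above does not make this clear). The names `group_definition_types` and `definition_type` are hypothetical and are not taken from xls2ccd.py.

```python
def group_definition_types(rows, title_col=9, defin_col=27):
    """Map each title to the list of definition types found in its rows.

    Hypothetical sketch: a non-empty title cell starts a new group;
    rows without a title belong to the most recent group.
    """
    grouped = {}
    current = None
    for values in rows:
        if values[title_col]:
            current = values[title_col]
            grouped[current] = []
        if current is not None and values[defin_col]:
            grouped[current].append(values[defin_col])
    return grouped


def definition_type(collected):
    """Return the first collected type, defaulting to 'ElementType'."""
    return collected[0] if collected else "ElementType"
```

Returning a single string from `definition_type` avoids having to index `defin_type[0]` at every comparison, which is what the committed code does after storing the type as a list.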