mlhim-specs-dev team mailing list archive

[Branch ~cdd-dev/cdd/trunk] Rev 263: Added tool to upload CCDs to HKCR. Working on creating CareEntry CCDs in xls2ccd.py

To: MLHIM Specifications Developers <mlhim-specs-dev@xxxxxxxxxxxxxxxxxxx>
From: noreply@xxxxxxxxxxxxx
Date: Sun, 23 Sep 2012 11:58:13 -0000
Reply-to: noreply@xxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx

------------------------------------------------------------
revno: 263
committer: Timothy W. Cook <timothywayne.cook@xxxxxxxxx>
branch nick: cdd
timestamp: Sun 2012-09-23 08:57:38 -0300
message:
  Added tool to upload CCDs to HKCR. Working on creating CareEntry CCDs in xls2ccd.py
removed:
  src/xls2ccd/examples/Demography/Demography_NCI_Standard_Template 06_29_2012.xls
modified:
  src/xls2ccd/README.txt
  src/xls2ccd/examples/Demography/MLHIM_Std_Template-NCI-Demographics.xls
  src/xls2ccd/examples/MLHIM_Std_Template/MLHIM_Std_Template-xls2ccd.xls
  src/xls2ccd/examples/TestTemplate/MLHIM_Std_Template-Tests.xls
  src/xls2ccd/mlhim2RM.py
  src/xls2ccd/xls2ccd.py
  templates/CDD-2.4.0.xmt


--
lp:cdd
https://code.launchpad.net/~cdd-dev/cdd/trunk

Your team MLHIM Specifications Developers is subscribed to branch lp:cdd.
To unsubscribe from this branch go to https://code.launchpad.net/~cdd-dev/cdd/trunk/+edit-subscription

=== modified file 'src/xls2ccd/README.txt'
--- src/xls2ccd/README.txt	2012-09-14 12:44:18 +0000
+++ src/xls2ccd/README.txt	2012-09-23 11:57:38 +0000
@@ -1,27 +1,63 @@
 xls2ccd.py
+
+********* Please read all of this short document *********
+
+
 REQUIRES: Python 2.6/2.7 and xlrd  
-
-This utility is used to create MLHIM CCDs  from standard template, XLS downloads from the NCI CDE.
-https://cdebrowser.nci.nih.gov/CDEBrowser/  
-
-Some pre-processing is required. 
-
-Download a set of CDEs in .xls format using the "Available Downloads" link near the top-right of the page. 
+Once you have installed Python:
+Download http://pypi.python.org/pypi/xlrd 
+Extract the archive and change to that directory.
+Execute: python setup.py install
+
+If you have problems installing xlrd, this may help.
+https://groups.google.com/forum/?fromgroups=#!topic/python-excel/jm536Kt7v90
+
+======================================= The Stuff You Came For  =======================================
+
+xls2ccd.py is used to create simple MLHIM CCDs from a standard .xls template. The template is based on XLS downloads from the NCI CDE browser
+(https://cdebrowser.nci.nih.gov/CDEBrowser/) and has since been modified and expanded.
+
+See the 'Information' sheet in the template.  
+Do not be concerned with the numbers in row 2.  They are used for development purposes.
+
+The NCI CDE information can still be used as follows:
+ Download a set of CDEs in .xls format using the "Available Downloads" link near the top-right of the page. 
 Select the set you want to convert from the links available on the "caDSR CDE Downloads" page.
 
 Be sure that the filename contains "NCI_Standard_Template (MM)_(DD)_2012".  Other formats have not been tested.  Please let us know which dates of the Standard Template works for you.
 
 Open the spreadsheet.  
-Delete rows 1 - 10.
-Delete columns A & B.
-
-Save (in .xls format) the modified spreadsheet into the same directory as this utility. 
+Copy the block from row 11 down and from column C across to column M of the NCI template.  Then paste it into a copy of
+MLHIM_Std_Template-xls2ccd.xls, starting at row 2 and spanning columns J - T.
+
+Save the modified spreadsheet (in .xls format).
+
+Now change the Datatype column to reflect your desired MLHIM datatype CCD.   Again these are simple CCDs.  I will show you later
+how to use these in the context of creating larger, more common concept definitions. 
 
 Execute the tool with this commandline:
-python xls2ccd.py <filename> 
-
-All CCDs created that are not of the CHARACTER or ALPHANUMERIC datatype will be flagged with an "R" as the first character of their filename.  
-These must be reviewed manually before use.  When all corrections are made, remove the "R" from the CCD element name and then save the file w/o the "R".
+python [path/to/]xls2ccd.py <filename>
+
+Replace [path/to/] with the path to where you extracted xls2ccd.py
+and <filename> with the name you used to save your spreadsheet.
+
+=========  EXAMPLES ===========
+
+There is a folder labeled TestTemplate.  It contains a test template used in development of the tool.
+You can generate these CCDs to see how the generated schemas compare with the spreadsheet data. 
+
+In the Demography folder there is a copy of the Demography_NCI_Standard_Template 06_29_2012.xls file that has been renamed
+to original-Demography_NCI_Standard_Template 06_29_2012.xls.  The area to be copied is highlighted with a yellow background.
+The file MLHIM_Std_Template-NCI-Demographics.xls shows the results of pasting the original data into the MLHIM template and
+changing the datatype column 'O' to MLHIM datatypes.
+
+Direct all questions to: 
+Preferred: https://launchpad.net/~mlhim-owners mailing list.
+
+Tim Cook timothywayne.cook@xxxxxxxxx
+Dr. Luciana Cavalini  lutricav@xxxxxxxxxxxxxxx
+NOTE: replies to questions sent directly to Tim or Luciana will be CC'd to the mailing list
+in order to inform and promote project documentation.
 
 
 

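The copy-and-paste step described in the README above (NCI rows 11 and down, columns C to M, into MLHIM template row 2 and down, columns J to T) can be sketched in Python. This is only an illustration, not part of the commit: remap_nci_rows is a hypothetical helper, plain lists of lists stand in for worksheet rows, and the 20-column output width (A to T) is an assumption.

```python
# Hypothetical helper, not part of xls2ccd.py.
NCI_FIRST_ROW = 10        # spreadsheet row 11, zero-based
NCI_COLS = slice(2, 13)   # columns C..M, zero-based
MLHIM_FIRST_COL = 9       # column J, zero-based
MLHIM_WIDTH = 20          # columns A..T (assumed width for this sketch)


def remap_nci_rows(nci_rows):
    """Return MLHIM-template data rows (row 2 onward) built from NCI rows."""
    out = []
    for row in nci_rows[NCI_FIRST_ROW:]:
        block = list(row[NCI_COLS])
        new_row = [''] * MLHIM_WIDTH
        # Place the copied block so that NCI column C lands in column J.
        new_row[MLHIM_FIRST_COL:MLHIM_FIRST_COL + len(block)] = block
        out.append(new_row)
    return out
```

The returned rows would still need their Datatype column set by hand, as the README describes.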
=== removed file 'src/xls2ccd/examples/Demography/Demography_NCI_Standard_Template 06_29_2012.xls'
Binary files src/xls2ccd/examples/Demography/Demography_NCI_Standard_Template 06_29_2012.xls	2012-09-15 22:10:45 +0000 and src/xls2ccd/examples/Demography/Demography_NCI_Standard_Template 06_29_2012.xls	1970-01-01 00:00:00 +0000 differ
=== modified file 'src/xls2ccd/examples/Demography/MLHIM_Std_Template-NCI-Demographics.xls'
Binary files src/xls2ccd/examples/Demography/MLHIM_Std_Template-NCI-Demographics.xls	2012-09-15 22:10:45 +0000 and src/xls2ccd/examples/Demography/MLHIM_Std_Template-NCI-Demographics.xls	2012-09-23 11:57:38 +0000 differ
=== modified file 'src/xls2ccd/examples/MLHIM_Std_Template/MLHIM_Std_Template-xls2ccd.xls'
Binary files src/xls2ccd/examples/MLHIM_Std_Template/MLHIM_Std_Template-xls2ccd.xls	2012-09-18 17:09:07 +0000 and src/xls2ccd/examples/MLHIM_Std_Template/MLHIM_Std_Template-xls2ccd.xls	2012-09-23 11:57:38 +0000 differ
=== modified file 'src/xls2ccd/examples/TestTemplate/MLHIM_Std_Template-Tests.xls'
Binary files src/xls2ccd/examples/TestTemplate/MLHIM_Std_Template-Tests.xls	2012-09-18 17:09:07 +0000 and src/xls2ccd/examples/TestTemplate/MLHIM_Std_Template-Tests.xls	2012-09-23 11:57:38 +0000 differ
=== modified file 'src/xls2ccd/mlhim2RM.py'
--- src/xls2ccd/mlhim2RM.py	2012-09-20 00:38:24 +0000
+++ src/xls2ccd/mlhim2RM.py	2012-09-23 11:57:38 +0000
@@ -196,13 +196,19 @@
 #====================================================================
 
 
-def getCareEntryType(data_name, ct_name, docs, d_id, e_data_type, indent=0):
+def getCareEntryType(data_name, ct_name, docs, dt_id, e_data_type, indent=0):
     """
     data_name - string to use for data_name
     ct_name - uuid string for complexType.name
     docs - string for documentation
+    dt_id - the uuid for the datatype name
+    e_data_type - ClusterType, ElementType or SlotType - in xls2ccd this is only ClusterType
     indent - number of spaces to indent first line
     """
+
+    e_subject_type = "PartySelfType" #TODO for CDD add options for PartyIdentifiedType
+    es_id = str(uuid.uuid4()) #entry-subject id
+    ed_id = str(uuid.uuid4()) #entry-data id
     defin_str = ''
     padding = ('').rjust(indent)
 
@@ -216,21 +222,39 @@
     #Entry
     defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' name='language' type='xs:language' default='en-US'/>\n")
     defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' name='encoding' type='xs:string' default='utf-8'/>\n")
-    defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' ref='mlhim2:entry-subject'/>\n")
+    defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' ref='mlhim2:el-"+es_id+"'/>\n")
     defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='0' ref='mlhim2:entry-provider'/>\n")
     defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='unbounded' minOccurs='0' ref='mlhim2:other-participations'/>\n")
     defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='0' ref='mlhim2:protocol-id'/>\n")
     defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='0' name='current-state' type='xs:string'/>\n")
     defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='0' ref='mlhim2:workflow-id'/>\n")
     defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='0' ref='mlhim2:attestation'/>\n")
-    defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' ref='mlhim2:entry-data'/>\n")
+    defin_str += padding.rjust(indent+8) + ("<xs:element maxOccurs='1' minOccurs='1' ref='mlhim2:el-"+ed_id+"'/>\n")
 
     defin_str += padding.rjust(indent+6) + ("</xs:sequence>\n")
     defin_str += padding.rjust(indent+4) + ("</xs:restriction>\n")
     defin_str += padding.rjust(indent+2) + ("</xs:complexContent>\n")
     defin_str += padding.rjust(indent) + ("</xs:complexType>\n")
-
-    return defin_str
+    defin_str += padding.rjust(indent+2) + ("<xs:element name='el-"+ed_id+"' substitutionGroup='mlhim2:entry-data' type='mlhim2:ct-"+ed_id+"'/>\n")
+    defin_str += padding.rjust(indent+2) + ("<xs:element name='el-"+es_id+"' substitutionGroup='mlhim2:entry-subject' type='mlhim2:ct-"+es_id+"'/>\n")
+
+    if e_subject_type == "PartyIdentifiedType":
+        pass #TODO for CDD
+    elif e_subject_type == "PartySelfType":
+        defin_str += getPartSelfType(indent)
+    else:
+        raise TypeError(e_subject_type + ": is not a valid CareEntry.entry-subject type.")
+
+    if e_data_type == "ElementType":
+        pass #TODO for CDD
+    elif e_data_type == "SlotType":
+        pass #TODO for CDD
+    elif e_data_type == "ClusterType":
+        defin_str += getClusterType(data_name, ed_id, dt_id, indent)
+    else:
+        raise TypeError(e_data_type + ": is not a valid CCD.definition type.")
+
+    return(defin_str)
 
 
 def getDvBooleanType(data_name, ct_name, bool_values, docs, indent=0):
@@ -715,10 +739,10 @@
 
     return dt_str
 
-def getElementType(e_id, d_id, indent):
+def getElementType(e_id, dt_id, indent):
     defin_str = ''
     padding = ('').rjust(indent)
-    defin_str += padding.rjust(indent) + ("<xs:complexType name='ct-"+e_id+"'>\n")
+    defin_str += '\n\n'+padding.rjust(indent) + ("<xs:complexType name='ct-"+e_id+"'>\n")
     defin_str += padding.rjust(indent+2) + ("<xs:complexContent>\n")
     defin_str += padding.rjust(indent+4) + ("<xs:restriction base='mlhim2:ElementType'>\n")
     defin_str += padding.rjust(indent+6) + ("<xs:sequence>\n")
@@ -727,7 +751,7 @@
     defin_str += padding.rjust(indent+4) + ("</xs:restriction>\n")
     defin_str += padding.rjust(indent+2) + ("</xs:complexContent>\n")
     defin_str += padding.rjust(indent) + ("</xs:complexType>\n")
-    defin_str += padding.rjust(indent) + ("<xs:element name='el-"+d_id+"' substitutionGroup='mlhim2:Element-dv' type='mlhim2:ct-"+d_id+"'/>\n")
+    defin_str += padding.rjust(indent) + ("<xs:element name='el-"+dt_id+"' substitutionGroup='mlhim2:Element-dv' type='mlhim2:ct-"+dt_id+"'/>\n")
 
     return defin_str
 

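The pattern used in the mlhim2RM.py changes above (a uuid-named complexType that restricts a reference model type, paired with an element that joins a substitution group) can be sketched as follows. This is a minimal illustration, not the project's code: restriction_stub is a hypothetical name, the sequence contents are left empty, and the lines are joined from a list rather than built with repeated rjust calls.

```python
import uuid


def restriction_stub(base_type, subst_group, indent=0):
    """Build a complexType restricting base_type plus its substitutable element.

    Hypothetical helper for illustration; returns (type_id, xsd_fragment).
    """
    type_id = str(uuid.uuid4())
    pad = ' ' * indent
    lines = [
        pad + "<xs:complexType name='ct-" + type_id + "'>",
        pad + "  <xs:complexContent>",
        pad + "    <xs:restriction base='mlhim2:" + base_type + "'>",
        pad + "      <xs:sequence/>",  # real CCDs list child elements here
        pad + "    </xs:restriction>",
        pad + "  </xs:complexContent>",
        pad + "</xs:complexType>",
        pad + "<xs:element name='el-" + type_id + "' substitutionGroup='mlhim2:"
            + subst_group + "' type='mlhim2:ct-" + type_id + "'/>",
    ]
    return type_id, '\n'.join(lines) + '\n'
```

For example, restriction_stub('ClusterType', 'entry-data', indent=4) returns the new id together with a fragment whose element can stand in wherever mlhim2:entry-data is referenced.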
=== modified file 'src/xls2ccd/xls2ccd.py'
--- src/xls2ccd/xls2ccd.py	2012-09-20 00:38:24 +0000
+++ src/xls2ccd/xls2ccd.py	2012-09-23 11:57:38 +0000
@@ -52,6 +52,8 @@
 print 'Rows: ', s.nrows
 
 bool_vals = []
+defin_type = None
+
 
 for row in range(1,s.nrows):
     values = []
@@ -59,8 +61,8 @@
         values.append(s.cell(row,col).value)
     if values[9]:  #title, must be unique
         currkey = values[9]
-
-        ccd_dict[currkey] = [values[0],values[1],values[2],values[3],values[4],values[5],values[6],values[7],values[8],values[9],values[10],values[11],values[12],values[13],values[14],values[15],values[16],[],[]] # New Entry values
+        print values[9], values[27]
+        ccd_dict[currkey] = [values[0],values[1],values[2],values[3],values[4],values[5],values[6],values[7],values[8],values[9],values[10],values[11],values[12],values[13],values[14],values[15],values[16],[],[],[]] # New Entry values
     else:
         if values[17]: #DvString Enumerations
             ccd_dict[currkey][17].append((values[17],values[18],values[19]))
@@ -70,6 +72,8 @@
         elif values[25] != '': #DvBoolean true/false values
             bool_vals.append((values[25],values[26]))
 
+    if values[27]:
+        ccd_dict[currkey][19].append(values[27])
 
 for k in ccd_dict.keys():
     size = len(ccd_dict[k])
@@ -105,23 +109,28 @@
     if not ccd_dict[k][16]:
         ccd_dict[k][16] = " Not Defined"
 
-    #MLHIM ElementType
-    e_id = str(uuid.uuid4())
+    #MLHIM CCD.definition type
+    defin_id = str(uuid.uuid4())
 
     #MLHIM DataType
-    d_id = str(uuid.uuid4())
-
-    defin_type = "ElementType"
+    dt_id = str(uuid.uuid4())
+
+    if ccd_dict[k][19]:
+        defin_type = ccd_dict[k][19]
+    else:
+        defin_type = ["ElementType"]
+
     defin_str = ''
 
     # if the defin_type is an Entry type this is the entry-data type
+    # in the CDD this could also be an ElementType or ClusterType
     e_data_type = "ClusterType"
 
     schema = ccd_id + ".xsd"
 
     #/ccd setup
     title = ccd_dict[k][9]
-    print ""
+    print defin_type
     print "Generating: "+ title +" --  CCD ID = " + ccd_id
     print ccd_dict[k][14]
     ccd_catalog.write('{0:.<40}'.format(title) + ccd_id +'\n')
@@ -200,20 +209,20 @@
       <xs:complexContent>
         <xs:restriction base="mlhim2:CCDType">
         <xs:sequence>
-          <xs:element maxOccurs="1" minOccurs="1" ref="mlhim2:el-"""+e_id+""""/>
+          <xs:element maxOccurs="1" minOccurs="1" ref="mlhim2:el-"""+defin_id+""""/>
         </xs:sequence>
         </xs:restriction>
       </xs:complexContent>
     </xs:complexType>
-    <xs:element name='el-"""+e_id+"""' substitutionGroup="mlhim2:definition" type='mlhim2:ct-"""+e_id+"""'/>""")
-
-
-    if defin_type == "ElementType":
-        indent = 4
-        xsd_file.write(mlhim2RM.getElementType(e_id, d_id, indent))
-    elif defin_type == "CareEntryType":
-        indent = 4
-        xsd_file.write(mlhim2RM.getCareEntryType(data_name, e_id, docs, d_id, e_data_type, indent))
+    <xs:element name='el-"""+defin_id+"""' substitutionGroup="mlhim2:definition" type='mlhim2:ct-"""+defin_id+"""'/>""")
+
+
+    if defin_type[0] == "ElementType":
+        indent = 4
+        xsd_file.write(mlhim2RM.getElementType(defin_id, dt_id, indent))
+    elif defin_type[0] == "CareEntryType":
+        indent = 4
+        xsd_file.write(mlhim2RM.getCareEntryType(data_name, defin_id, dt_docs, dt_id, e_data_type, indent))
 
 
     if dt == "DvBooleanType":

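The row-grouping logic touched by the xls2ccd.py changes above (a row with a title starts a new entry, rows without a title continue the previous one, and column 27 is collected per entry) can be sketched without xlrd. This is an assumption-laden illustration: group_rows and the dictionary layout are hypothetical, and reading column 27 as the CCD definition type is inferred from the diff rather than stated in it.

```python
def group_rows(rows, title_col=9, type_col=27):
    """Group spreadsheet rows by title.

    Hypothetical sketch: a row with a non-empty title starts a new entry;
    rows with a blank title belong to the most recent entry.
    """
    entries = {}
    current = None
    for values in rows:
        if values[title_col]:
            current = values[title_col]
            entries[current] = {'fields': values[:17], 'defin_types': []}
        # Guard against data rows that appear before any title row.
        if current is not None and values[type_col]:
            entries[current]['defin_types'].append(values[type_col])
    return entries
```

An entry whose 'defin_types' list is empty would fall back to "ElementType", mirroring the default in the diff.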
=== modified file 'templates/CDD-2.4.0.xmt'
Binary files templates/CDD-2.4.0.xmt	2012-09-14 12:44:18 +0000 and templates/CDD-2.4.0.xmt	2012-09-23 11:57:38 +0000 differ