nova team mailing list archive

How to serialize a dict to XML

 

Hi Eric/Dietz/anyone else who cares to chime in,

I'd like to get feedback on the approach I am planning to take for
serializing controller results to XML/JSON/whatever.  I'm going to start
coding based on this approach, but if anyone has objections/suggestions I'm
more than happy to rework the code as needed.

*Background*

The idea is that if a request comes in for /servers/4.xml, we detect the
requested content type from the Accept: header on the request or, as in
this case, from the suffix on the URL.  We then take some generic data in a
dict and convert it to that format to send back in the response.  This is a
convenience for methods that handle HTTP requests, so that rather than
looking like this:

import json

def show(self, id):
  result = {'id': 4, 'name': 'my server', 'ram': 512}
  is_json = ...  # placeholder: check accept headers/suffix/etc in the right order
  if is_json:
    return json.dumps(result)
  else:  # assume xml until we add other types
    return ('<server id="%(id)d"><name>%(name)s</name>'
            '<ram>%(ram)d</ram></server>') % result

they can look like this:

def show(self, id):
  result = {'id': 4, 'name': 'my server', 'ram': 512}
  return Serializer(self.request).to_content_type(result)

Then to_content_type() is the one place where we determine the content
type correctly and translate the dictionary properly.
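
A rough sketch of that detection step, for illustration only.  The name
detect_content_type, the SUFFIX_TYPES table, letting the URL suffix win over
the Accept: header, and the JSON fallback are all assumptions rather than
settled design, and the Accept parsing deliberately ignores q-values.

```python
import posixpath

# Hypothetical mapping from URL suffix to content type.
SUFFIX_TYPES = {
    ".json": "application/json",
    ".xml": "application/xml",
}


def detect_content_type(path, accept_header="", default="application/json"):
    """Guess the response content type for a request.

    Assumption: an explicit suffix on the URL wins over the Accept: header.
    """
    suffix = posixpath.splitext(path)[1].lower()
    if suffix in SUFFIX_TYPES:
        return SUFFIX_TYPES[suffix]
    # Naive Accept parsing: take the first listed type we know about.
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in SUFFIX_TYPES.values():
            return media_type
    return default


print(detect_content_type("/servers/4.xml"))
print(detect_content_type("/servers/4", "text/html, application/xml;q=0.9"))
```

A real implementation would probably want proper q-value handling, but the
point is that this logic lives in exactly one place.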


*Problems*

Serializing a dict to XML is not a straightforward mapping, for two reasons:

   1. data atoms can arbitrarily be placed in an attribute or in a subtag
      - e.g. <foo id="4"/> or <foo><id>4</id></foo>
   2. list items have to have names
      - e.g. { "boys": ["carl", "david"] } ->
        <boys><boy>carl</boy><boy>david</boy></boys>

I want to provide some translation metadata to to_content_type() to solve
these problems.


*Solution*

To solve #1, we specify the keys that should be attributes.  To solve #2, we
provide a map from plurals to singulars, unless the plural can just drop an
"s" to become singular.

This info is provided as metadata to Serializer, like
Serializer(request, metadata).to_content_type(data).
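
The plural-to-singular rule could be sketched as below.  The function name
singularize is hypothetical, the "mice" entry is a made-up irregular plural
for illustration, and reusing the list name when no rule applies is an
assumption, since that case isn't specified above.

```python
def singularize(plural, plurals=None):
    """Return the tag name to use for each item of the list named `plural`."""
    plurals = plurals or {}
    if plural in plurals:
        # An explicit mapping always wins.
        return plurals[plural]
    if plural.endswith("s"):
        # Default rule: drop the trailing "s".
        return plural[:-1]
    # Assumption: with no mapping and no trailing "s", reuse the name as-is.
    return plural


example_plurals = {"mice": "mouse"}  # made-up mapping for illustration
print(singularize("boys", example_plurals))
print(singularize("mice", example_plurals))
```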

Let's say we had this dictionary that we wanted to convert to XML:

{
  "boy": {
    "age": 8,
    "id": 4,
    "name": "Michael",
    "neck": { "width": "14in" },
    "eyes": { "color": "blue" },
    "hands": [ "Mr. Lefty", "Mr. Righty" ],
    "feet": [
      {
        "toecount": 5,
        "fungi": [ "tinea" ]
      },
      {
        "toecount": 6,
        "fungi": [ "truffles", "crud" ]
      }
    ]
  }
}

Here's some metadata that could be applied to this dictionary:

metadata = {
  "application/xml": {
    "attributes": {
      "boy": [ "id", "name" ],
      "neck": [ "width" ]
    },
    "plurals": {
      "feet": "foot",
      "fungi": "fungus"
    }
  }
}

The resulting XML would look like

<boy id="4" name="Michael">
  <age>8</age>
  <eyes>
    <color>blue</color>
  </eyes>
  <neck width="14in"/>
  <hands><hand>Mr. Lefty</hand><hand>Mr. Righty</hand></hands>
  <feet>
    <foot>
      <toecount>5</toecount>
      <fungi><fungus>tinea</fungus></fungi>
    </foot>
    <foot>
      <toecount>6</toecount>
      <fungi><fungus>truffles</fungus><fungus>crud</fungus></fungi>
    </foot>
  </feet>
</boy>

Note that Serializer knew how to convert "hands" to "hand" without being
told.
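
One way the translation could work is sketched below, using xml.dom.minidom
from the standard library.  The helper names to_xml_node and dict_to_xml,
the "item" fallback for lists with no usable singular, and the requirement
that the dict has a single root key are all assumptions made for this sketch,
not part of the proposal.  It emits compact XML rather than the indented
form shown above.

```python
from xml.dom import minidom


def to_xml_node(doc, name, data, attributes, plurals):
    """Build a DOM element called `name` from a dict, list, or atom."""
    node = doc.createElement(name)
    if isinstance(data, dict):
        attr_keys = attributes.get(name, [])
        for key, value in data.items():
            if key in attr_keys:
                # Keys listed in the metadata become attributes.
                node.setAttribute(key, str(value))
            else:
                node.appendChild(
                    to_xml_node(doc, key, value, attributes, plurals))
    elif isinstance(data, list):
        # Explicit singular if given, else drop a trailing "s".
        singular = plurals.get(name)
        if singular is None:
            # "item" is an assumed fallback when neither rule applies.
            singular = name[:-1] if name.endswith("s") else "item"
        for item in data:
            node.appendChild(
                to_xml_node(doc, singular, item, attributes, plurals))
    else:
        node.appendChild(doc.createTextNode(str(data)))
    return node


def dict_to_xml(data, metadata):
    """Serialize a single-rooted dict such as {"boy": {...}} to XML."""
    xml_meta = metadata.get("application/xml", {})
    root_name, root_data = next(iter(data.items()))
    doc = minidom.Document()
    root = to_xml_node(doc, root_name, root_data,
                       xml_meta.get("attributes", {}),
                       xml_meta.get("plurals", {}))
    return root.toxml()


example_metadata = {"application/xml": {"attributes": {"boy": ["id"]}}}
print(dict_to_xml({"boy": {"id": 4, "hands": ["Mr. Lefty", "Mr. Righty"]}},
                  example_metadata))
```

Going through a DOM rather than string formatting also means text and
attribute values get escaped for free.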


*Pros and cons*

Pro: We don't have to muck up the request dictionary with metadata -- it's
nicely separated.  So JSON serialization is still just json.dumps(data).

Pro: If we add more serialization types in the future, we don't have to
change Serializer()'s signature.
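
To illustrate why the signature can stay fixed, here is a sketch in which
the metadata is keyed by content type and each type has its own handler.
The handlers registry and the "text/plain" type with its "prefix" option are
invented for the example, and to keep it runnable this version takes an
already-detected content type where the proposal passes the request.

```python
import json


def _to_json(data, type_metadata):
    # JSON needs no translation hints, so type_metadata is ignored.
    return json.dumps(data)


class Serializer(object):
    # Hypothetical registry: content type -> function(data, type_metadata).
    handlers = {"application/json": _to_json}

    def __init__(self, content_type, metadata=None):
        self.content_type = content_type
        self.metadata = metadata or {}

    def to_content_type(self, data):
        handler = self.handlers[self.content_type]
        # Each handler only sees the metadata filed under its own type.
        return handler(data, self.metadata.get(self.content_type, {}))


# Adding a type later touches only the registry and the metadata dict.
def _to_text(data, type_metadata):
    return type_metadata.get("prefix", "") + repr(data)


Serializer.handlers["text/plain"] = _to_text

print(Serializer("application/json").to_content_type({"id": 4}))
```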

Pro: A Controller class can specify its metadata in one place, and make a
helper method like

def serialize(self, data):
  return Serializer(self.request, self.my_metadata).to_content_type(data)

Con: Lumping all the attribute data into one map ignores nesting, so the
metadata can't support a complex dictionary where "boy" appears at multiple
levels and has different attributes at those levels.  I doubt this will
happen to us, and if it does we can either make Serializer more complicated
or just not support it (and build the XML by hand in that one case).
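
A tiny illustration of that limitation, assuming the lookup is done by tag
name alone.  The attribute_keys helper and the "family"/"children" path are
made up for the example.

```python
attributes = {"boy": ["id", "name"]}


def attribute_keys(tag_path, attributes):
    """Look up attribute keys by the last tag name only, as a flat map does."""
    return attributes.get(tag_path[-1], [])


top_level = attribute_keys(["boy"], attributes)
nested = attribute_keys(["family", "children", "boy"], attributes)

# Both lookups give the same answer: the flat map cannot tell a top-level
# "boy" from one nested deeper in the dictionary.
print(top_level == nested)
```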


Please give me feedback.  I'm going to start coding down this route, but if
anyone has objections/improvements I'm happy to rework the code.

Michael



