dolfin team mailing list archive

Re: XML parsing

 

On Fri, Aug 06, 2010 at 11:53:39PM +0100, Garth N. Wells wrote:
> On Sat, 2010-08-07 at 00:45 +0200, Anders Logg wrote:
> > On Fri, Aug 06, 2010 at 09:02:00PM +0100, Garth N. Wells wrote:
> > > I have trouble following the logic in the DOLFIN XML io code, which I
> > > suspect is because of the pointers to callback functions being passed to
> > > libxml2. Has anyone looked at libxml++
> > > (http://library.gnome.org/devel/libxml++-tutorial/stable/) which
> > > provides C++ bindings to libxml2?
> >
> > No, and I'm not sure if it's worth the trouble.
> >
> > I think the libxml2 interface is not much of a problem, at least not
> > since we realized how to only set the callbacks we are interested in:
> >
>
> The problem is that the few times I've looked at it to fix or extend
> something, I've had trouble following how it works. Part of this is
> the lack of comments in the code.

That's probably true. But in this case, I'd say the quality of the
code is fairly good and the design is well thought through, in spite
of the lack of comments.

> >   XMLFile::XMLFile(std::ostream& s)
> >   : GenericFile(""), sax(0), outstream(&s)
> >   {
> >     // Set up the sax handler.
> >     sax = new xmlSAXHandler();
> >
> >     // Set up handlers for parser events
> >     sax->startDocument = sax_start_document;
> >     sax->endDocument   = sax_end_document;
> >     sax->startElement  = sax_start_element;
> >     sax->endElement    = sax_end_element;
> >     sax->warning       = sax_warning;
> >     sax->error         = sax_error;
> >     sax->fatalError    = sax_fatal_error;
> >   }
> >
> > Ola did a very good job at redesigning the parsing to allow reuse of
> > code. For example, we can reuse the parsing of MeshFunction data
> > inside the <data> tag inside the <mesh> tag. Previously, every top
> > level tag had its own implementation which prevented reuse of code for
> > parsing nested data.
> >
> > This is what makes the implementation difficult to follow, but it's
> > essentially a stack of parsers where we push and pop the handler
> > currently responsible for accepting the callbacks from libxml2 when it
> > reads data.
> >
> > I don't think it would be easier if we used the C++ interface. It
> > would probably be less transparent. Right now, our interaction with
> > libxml2 is very minimal: setting the callbacks and receiving the data
> > parsed by libxml2.
> >
>
> My first impression is that it would be more transparent and simpler -
> we just pass the file name to libxml++ and implement the functions
> 'on_start_element' and 'on_end_element'.

That's essentially what we do now, but instead of subclassing and
overriding some member functions, we assign our callback functions to
the corresponding fields of the SAX handler:

  sax->startElement  = sax_start_element;
  sax->endElement    = sax_end_element;

and then pass the file name to begin the parsing:

  xmlSAXUserParseFile(sax, (void *) this, filename.c_str());
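
To make the pattern concrete without depending on libxml2, here is a
minimal sketch of the same registration style. Everything in it is a
hypothetical stand-in: ToySaxHandler plays the role of the handler
table, toy_parse plays the role of the parse driver, and ElementLog is
an illustrative receiver. It is not DOLFIN's code and not libxml2's
API. It only shows how a table of function pointers plus a void*
context routes events back into a C++ object, and how callbacks that
are left unset are simply skipped.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a C-style SAX handler table
// (NOT libxml2's xmlSAXHandler). Unset callbacks stay null.
struct ToySaxHandler {
  void (*start_element)(void* ctx, const char* name);
  void (*end_element)(void* ctx, const char* name);
};

// Hypothetical stand-in for the parse driver. Each entry in `events`
// is an element name; a leading '/' marks a closing tag.
void toy_parse(const ToySaxHandler* sax, void* ctx,
               const std::vector<std::string>& events) {
  assert(sax != nullptr);
  for (const std::string& ev : events) {
    if (!ev.empty() && ev[0] == '/') {
      // Only call the callback if the user registered one.
      if (sax->end_element) sax->end_element(ctx, ev.c_str() + 1);
    } else {
      if (sax->start_element) sax->start_element(ctx, ev.c_str());
    }
  }
}

// Illustrative receiver: static functions recover the object from the
// void* context that was handed to the driver.
class ElementLog {
public:
  std::vector<std::string> log;

  static void on_start(void* ctx, const char* name) {
    static_cast<ElementLog*>(ctx)->log.push_back(std::string("start:") + name);
  }

  static void on_end(void* ctx, const char* name) {
    static_cast<ElementLog*>(ctx)->log.push_back(std::string("end:") + name);
  }
};
```

A caller would value-initialise a ToySaxHandler so that every callback
is null, assign only the callbacks it cares about, and pass a pointer
to the receiving object as the context argument.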

I agree the C++ interface is a nicer way of doing it, but I'm not sure
it is worth the effort. It might even complicate the design by
introducing another layer of inheritance in an already somewhat
complex hierarchy of classes.
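
For readers unfamiliar with the "stack of parsers" idea quoted above,
the following is a rough sketch of how such a design might look. All
names (Handler, Dispatcher, MeshHandler, DataHandler) are hypothetical
and chosen for illustration; the real DOLFIN classes may be organised
quite differently. The dispatcher forwards each event to whichever
handler is on top of the stack, and a handler pushes a nested handler
when it meets a tag it wants to delegate, which is what would allow
the same <data> handler to be reused in different contexts.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Dispatcher;

// Hypothetical handler interface: receives element events.
class Handler {
public:
  virtual ~Handler() = default;
  virtual void start_element(Dispatcher& d, const std::string& name) = 0;
  virtual void end_element(Dispatcher& d, const std::string& name) = 0;
};

// Forwards every event to the handler currently on top of the stack.
// The stack is non-owning; handlers must outlive the dispatcher's use.
class Dispatcher {
public:
  std::vector<std::string> log;

  void push(Handler* h) { stack_.push_back(h); }
  void pop() { assert(!stack_.empty()); stack_.pop_back(); }
  std::size_t depth() const { return stack_.size(); }

  void start_element(const std::string& name) {
    assert(!stack_.empty());
    stack_.back()->start_element(*this, name);
  }
  void end_element(const std::string& name) {
    assert(!stack_.empty());
    stack_.back()->end_element(*this, name);
  }

private:
  std::vector<Handler*> stack_;
};

// Handles everything nested inside <data>; pops itself on </data>.
class DataHandler : public Handler {
public:
  void start_element(Dispatcher& d, const std::string& name) override {
    d.log.push_back("data handler saw <" + name + ">");
  }
  void end_element(Dispatcher& d, const std::string& name) override {
    if (name == "data") d.pop();
  }
};

// Handles <mesh>; delegates the nested <data> block to DataHandler.
class MeshHandler : public Handler {
public:
  void start_element(Dispatcher& d, const std::string& name) override {
    if (name == "data") d.push(&data_);
    else d.log.push_back("mesh handler saw <" + name + ">");
  }
  void end_element(Dispatcher& d, const std::string& name) override {
    if (name == "mesh") d.pop();
  }

private:
  DataHandler data_;
};
```

In this sketch the callbacks registered with the XML library would
only ever talk to the Dispatcher, so the library-facing code stays
small regardless of how many handler types exist.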

--
Anders
