dolfin team mailing list archive

Re: binary file format

 

On 13/01/09 02:17 PM, Anders Logg wrote:
On Tue, Jan 13, 2009 at 12:31:58PM -0700, Bartosz Sawicki wrote:
On 13/01/09 11:53 AM, Martin Sandve Alnæs wrote:
On Tue, Jan 13, 2009 at 6:26 PM, Anders Logg <logg@xxxxxxxxx> wrote:
On Tue, Jan 13, 2009 at 05:24:41PM +0000, Garth N. Wells wrote:
Anders Logg wrote:
On Tue, Jan 13, 2009 at 08:45:30AM +0100, Martin Sandve Alnæs wrote:
We just discussed that in another thread, and the answer from
everybody is yes please, that would be nice. XML file with metadata +
raw files with arrays is one solution. And of course, there are many
standardized file formats we could support in time for communication
with external tools.
How about sticking everything into the XML file but with some suitable
tags:

I think that this is what Martin is suggesting. VTK has a file format
like this (although with binary content it is no longer strictly XML as
far as I understand).

Garth
ok, I thought he meant having the data in a separate file.
I did, but I don't care. The advantage of separate binary files
is that the XML file can be read as a text file, both by humans
and machines. Also, binary data will have fixed computable
positions in the pure binary file, which might be useful for ... something?
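As a rough illustration of the "XML metadata + raw file" idea, here is a minimal Python sketch. The tag and attribute names (`vector`, `size`, `type`, `itemsize`, `file`) and both function names are made up for this example and are not an existing DOLFIN schema. It assumes native byte order and 8-byte doubles. Because every entry has the same size, the byte offset of entry `i` is simply `i * itemsize`.

```python
import array
import os
import xml.etree.ElementTree as ET


def write_vector(basename, values):
    """Write values as raw float64 to <basename>.bin plus an XML sidecar."""
    data = array.array("d", values)  # native byte order (assumption)
    with open(basename + ".bin", "wb") as f:
        data.tofile(f)
    # Hypothetical metadata layout; human-readable and parseable as plain XML.
    root = ET.Element("vector", {
        "size": str(len(data)),
        "type": "float64",
        "itemsize": str(data.itemsize),
        "file": os.path.basename(basename + ".bin"),
    })
    ET.ElementTree(root).write(basename + ".xml")


def read_entry(basename, index):
    """Read a single entry by seeking to its computed byte offset."""
    meta = ET.parse(basename + ".xml").getroot()
    itemsize = int(meta.get("itemsize"))
    if not 0 <= index < int(meta.get("size")):
        raise IndexError(index)
    bin_path = os.path.join(os.path.dirname(basename), meta.get("file"))
    with open(bin_path, "rb") as f:
        f.seek(index * itemsize)  # fixed, computable position
        entry = array.array("d")
        entry.fromfile(f, 1)
    return entry[0]
```

With this layout the XML file stays small and text-only, while the `.bin` file can be read in bulk or accessed at random without any parsing.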
I also thought about a completely binary file, but an XML header can be useful for storing information such as element signatures, data types and so on.

The most important advantage of a binary file is that you can read a whole vector at once, directly into memory. I'm not sure whether XML parsers provide such functionality. Usually binary data is base64-encoded before being placed in an XML file (that's what VTK does).

Binary storage should be as fast as possible, so we shouldn't care about human readability. We have XML for that.
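To make the trade-off concrete, the following Python sketch compares a raw bulk read with a base64 round trip. It uses plain base64 from the standard library and does not reproduce VTK's exact encoding layout; the function names are invented for this example. Base64 emits 4 characters for every 3 input bytes, so the encoded form is about one third larger and needs an extra decode pass before the data can be used.

```python
import array
import base64


def roundtrip_raw(values, path):
    """Write float64 values raw, then read them back in one bulk read."""
    data = array.array("d", values)
    with open(path, "wb") as f:
        data.tofile(f)
    out = array.array("d")
    with open(path, "rb") as f:
        out.fromfile(f, len(data))  # straight into a contiguous buffer
    return out


def roundtrip_base64(values):
    """Encode float64 values as base64 text, then decode them again."""
    data = array.array("d", values)
    text = base64.b64encode(data.tobytes())  # 4 chars per 3 bytes, padded
    out = array.array("d")
    out.frombytes(base64.b64decode(text))  # extra decode step
    return out, len(text)
```

Both paths return the same numbers; the difference is the size on disk and the additional pass over the data in the base64 case.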


BArtek

It should be simple to check for the <binary> tag (or similar) in the
XML parsers and then read a certain number of bytes in binary (using
fread), then continue parsing the XML file.

In a combined XML/binary file, there should only be a small number of
XML tags to read, so it should be very efficient. Most of the file
(in bytes) will be binary.

The difficulty might be interacting with the SAX parser (libxml2) so
that we can manually read a number of bytes and then hand control back
to the parser.
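For illustration only, here is a Python sketch of the combined layout that avoids the parser problem by not using an XML parser at all: it scans the small text header by hand, reads the announced number of bytes, and then checks for the closing tag. The `<binary size="...">` tag, the `<vector>` wrapper and the function names are assumptions made for this example. It does not show how to hand control back to libxml2's SAX parser, and a file written this way is in general not well-formed XML.

```python
import array
import io
import re

# Hypothetical opening tag that announces the payload length in bytes.
BINARY_TAG = re.compile(rb'<binary size="(\d+)">')


def write_combined(stream, values):
    """Write a short text header followed by the raw float64 payload."""
    payload = array.array("d", values).tobytes()
    stream.write(b'<vector type="float64">\n')
    stream.write(b'<binary size="%d">' % len(payload))
    stream.write(payload)
    stream.write(b'</binary>\n</vector>\n')


def read_combined(stream):
    """Scan the text header, then read the payload in one call."""
    header = bytearray()
    match = None
    while match is None:
        byte = stream.read(1)
        if not byte:
            raise ValueError("no <binary> tag found")
        header += byte
        if byte == b">":  # a tag just closed; see if it was <binary ...>
            match = BINARY_TAG.search(header)
    size = int(match.group(1))
    payload = stream.read(size)  # raw bytes, never inspected as text
    if len(payload) != size:
        raise ValueError("truncated binary block")
    if not stream.read().startswith(b"</binary>"):
        raise ValueError("missing </binary> after payload")
    values = array.array("d")
    values.frombytes(payload)
    return values
```

Reading the payload by its declared size matters here: the raw bytes may contain `<` or `>` by chance, so they cannot be delimited by scanning for the closing tag.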

A quick search suggests that libxml2 has no API for binary data:
http://mail.gnome.org/archives/xml/2006-August/msg00050.html

Maybe some clever tricks would be possible, but why do you insist on using XML? Wouldn't it be better to design the binary format in a binary way, with fixed positions, as Martin wrote?
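A format designed "in a binary way" could look something like the Python sketch below. Everything in it is hypothetical: the magic string `DVEC`, the version field, the 16-byte header layout and the function names are chosen only for illustration. The metadata lives in a fixed-size header, so the data always starts at a known offset and any entry can be located by arithmetic.

```python
import io
import struct

# Hypothetical layout: 4-byte magic, uint32 version, uint64 entry count,
# followed by `count` little-endian float64 values.
HEADER = struct.Struct("<4sIQ")
MAGIC = b"DVEC"


def write_binary_vector(stream, values):
    """Write the fixed-size header followed by the packed values."""
    stream.write(HEADER.pack(MAGIC, 1, len(values)))
    stream.write(struct.pack("<%dd" % len(values), *values))


def read_binary_vector(stream):
    """Read the header, validate it, then read all values at once."""
    magic, version, count = HEADER.unpack(stream.read(HEADER.size))
    if magic != MAGIC or version != 1:
        raise ValueError("not a recognised vector file")
    return list(struct.unpack("<%dd" % count, stream.read(8 * count)))


def read_binary_entry(stream, index):
    """Read one value at its fixed offset without touching the rest."""
    stream.seek(HEADER.size + 8 * index)
    return struct.unpack("<d", stream.read(8))[0]
```

Fixing the byte order in the format string (`<` for little-endian) keeps files portable between machines, at the cost of a byte swap on big-endian hosts.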

BArtek

