dolfin team mailing list archive

Re: binary file format

 

On Tue, Jan 13, 2009 at 03:50:32PM -0700, Bartosz Sawicki wrote:
> On 13/01/09 02:17 PM, Anders Logg wrote:
> > On Tue, Jan 13, 2009 at 12:31:58PM -0700, Bartosz Sawicki wrote:
> >> On 13/01/09 11:53 AM, Martin Sandve Alnæs wrote:
> >>> On Tue, Jan 13, 2009 at 6:26 PM, Anders Logg <logg@xxxxxxxxx> wrote:
> >>>> On Tue, Jan 13, 2009 at 05:24:41PM +0000, Garth N. Wells wrote:
> >>>>> Anders Logg wrote:
> >>>>>> On Tue, Jan 13, 2009 at 08:45:30AM +0100, Martin Sandve Alnæs wrote:
> >>>>>>> We just discussed that in another thread, and the answer from
> >>>>>>> everybody is yes please, that would be nice. XML file with metadata +
> >>>>>>> raw files with arrays is one solution. And of course, there are many
> >>>>>>> standardized file formats we could support in time for communication
> >>>>>>> with external tools.
> >>>>>> How about sticking everything into the XML file but with some suitable
> >>>>>> tags:
> >>>>>>
> >>>>> I think that this is what Martin is suggesting. VTK has a file format
> >>>>> like this (although with binary content it is no longer strictly XML as
> >>>>> far as I understand).
> >>>>>
> >>>>> Garth
> >>>> ok, I thought he meant having the data in a separate file.
> >>> I did, but I don't care. The advantage of separate binary files
> >>> is that the XML file can be read as a text file, both by humans
> >>> and machines. Also, binary data will have fixed computable
> >>> positions in the pure binary file, which might be useful for ... something?
> >> I also thought about a completely binary file, but an XML header can be
> >> useful for storing information like element signatures, data types and
> >> so on.
> >>
> >> The most important advantage of a binary file is that you can read a whole
> >> vector at once directly into memory. I'm not sure if XML parsers
> >> provide such functionality. Usually binary data is encoded in base64
> >> before being placed in an XML file (that's what they do in VTK).
> >>
> >> Binary storage should be as fast as possible, so we shouldn't care about 
> >> human readability. We have XML for that.
> >>
> >>
> >> Bartek
> > 
> > It should be simple to check for the <binary> tag (or similar) in the
> > XML parsers and then read a certain number of bytes in binary (using
> > fread), then continue parsing the XML file.
> > 
> > In a combined XML/binary file, there should only be a small number of
> > XML tags to be read so it should be very efficient. The large part of
> > the file (in bytes) will be binary.
> > 
> > The problem might be to interact with the SAX parser (libxml2) to be
> > able to manually read a number of bytes and then hand the control back
> > to the parser.
> 
> Quick googling suggests that there's no API for binary data:
> http://mail.gnome.org/archives/xml/2006-August/msg00050.html
> 
> Maybe some clever tricks would be possible, but why do you insist on using
> XML? Wouldn't it be better to design the binary format in a binary way, with
> fixed positions, as Martin wrote?

It would be nice to have it in XML to make it at least partly human
readable. One may look at the file and see that it's a mesh, function,
vector, whatever and that there is a portion of binary data.

It's also convenient to have just one file instead of two.
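
To make the idea concrete, here is a rough sketch in Python of a single
file with a short readable header followed by raw binary data. This is only
an illustration under my own assumptions, not a proposal for the actual
format: the tag name "vector", the attributes "type", "size" and "nbytes",
and the helper functions write_vector/read_vector are all made up. The
header is read as plain text and parsed on its own, so the XML parser never
sees the binary part (the file as a whole is not well-formed XML).

```python
import io
import xml.etree.ElementTree as ET
from array import array


def write_vector(stream, values):
    """Write a one-line XML header followed by the raw float64 payload."""
    data = array("d", values).tobytes()  # native byte order (assumption)
    header = '<vector type="float64" size="%d" nbytes="%d"/>\n' % (
        len(values), len(data))
    stream.write(header.encode("ascii"))
    stream.write(data)


def read_vector(stream):
    """Parse the header line, then read exactly nbytes of binary data."""
    header = stream.readline()  # the hypothetical header is a single line
    elem = ET.fromstring(header.decode("ascii"))
    nbytes = int(elem.get("nbytes"))
    raw = stream.read(nbytes)  # one bulk read, no per-value parsing
    if len(raw) != nbytes:
        raise ValueError("truncated binary section")
    values = array("d")
    values.frombytes(raw)
    return elem.tag, list(values)


# Round trip through an in-memory stream opened in binary mode.
buf = io.BytesIO()
write_vector(buf, [1.0, 2.5, -3.0])
buf.seek(0)
tag, vals = read_vector(buf)
print(tag, vals)
```

Looking at the first line of such a file tells you that it holds a vector
and how many bytes of binary data follow, and the whole payload is still
read in one call.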

-- 
Anders
