dolfin team mailing list archive

Re: binary file format

 

Bartosz Sawicki wrote:
> On 13/01/09 02:17 PM, Anders Logg wrote:
>> On Tue, Jan 13, 2009 at 12:31:58PM -0700, Bartosz Sawicki wrote:
>>> On 13/01/09 11:53 AM, Martin Sandve Alnæs wrote:
>>>> On Tue, Jan 13, 2009 at 6:26 PM, Anders Logg <logg@xxxxxxxxx> wrote:
>>>>> On Tue, Jan 13, 2009 at 05:24:41PM +0000, Garth N. Wells wrote:
>>>>>> Anders Logg wrote:
>>>>>>> On Tue, Jan 13, 2009 at 08:45:30AM +0100, Martin Sandve Alnæs wrote:
>>>>>>>> We just discussed that in another thread, and the answer from
>>>>>>>> everybody is yes please, that would be nice. XML file with metadata +
>>>>>>>> raw files with arrays is one solution. And of course, there are many
>>>>>>>> standardized file formats we could support in time for communication
>>>>>>>> with external tools.
>>>>>>> How about sticking everything into the XML file but with some suitable
>>>>>>> tags:
>>>>>>>
>>>>>> I think that this is what Martin is suggesting. VTK has a file format
>>>>>> like this (although with binary content it is no longer strictly XML as
>>>>>> far as I understand).
>>>>>>
>>>>>> Garth
>>>>> ok, I thought he meant having the data in a separate file.
>>>> I did, but I don't care. The advantage of separate binary files
>>>> is that the XML file can be read as a text file, both by humans
>>>> and machines. Also, binary data will have fixed computable
>>>> positions in the pure binary file, which might be useful for ... something?
>>> I also thought about a completely binary file, but an XML header can be
>>> useful for storing information like element signatures, data types and
>>> so on.
>>>
>>> The most important advantage of a binary file is that you can read a whole
>>> vector at once directly into memory. I'm not sure if XML parsers
>>> provide such functionality. Usually binary data is encoded in base64
>>> before being placed in an XML file (that's what they do in VTK).
>>>
>>> Binary storage should be as fast as possible, so we shouldn't care about 
>>> human readability. We have XML for that.
>>>
>>>
>>> Bartek
>> It should be simple to check for the <binary> tag (or similar) in the
>> XML parsers and then read a certain number of bytes in binary (using
>> fread), then continue parsing the XML file.
>>
>> In a combined XML/binary file, there should only be a small amount of
>> XML tags to be read so it should be very efficient. The large part of
>> the file (in bytes) will be binary.
>>
>> The problem might be to interact with the SAX parser (libxml2) to be
>> able to manually read a number of bytes and then hand the control back
>> to the parser.
> 
> Quick googling suggests that there's no API for binary data:
> http://mail.gnome.org/archives/xml/2006-August/msg00050.html
> 
> Maybe some clever tricks would be possible, but why do you insist on using
> XML? Wouldn't it be better to design a binary format in a binary way, with
> fixed positions, as Martin wrote?

XML files containing raw binary data are not standards-conforming. However,
the newer VTK file formats, which are XML-based, use a trick: binary data
is encoded as text using base64. That doesn't make it human-readable,
though :) Compressing the data is also supported in this case.
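
As a rough illustration of the text-encoding trick, here is a minimal
Python sketch using base64 (the encoding mentioned earlier in the thread
for VTK) with optional zlib compression. The <vector> tag, its attribute
names and the helper functions are all hypothetical; this is neither the
VTK layout nor an existing DOLFIN format, only a sketch of the idea.

```python
import base64
import struct
import zlib
import xml.etree.ElementTree as ET


def vector_to_xml(values, compress=False):
    """Store a list of floats as base64 text inside a (hypothetical) <vector> tag."""
    # Pack as little-endian 64-bit floats so the byte layout is well defined.
    raw = struct.pack("<%dd" % len(values), *values)
    if compress:
        raw = zlib.compress(raw)
    attrib = {
        "size": str(len(values)),
        "type": "float64",
        "encoding": "base64",
        "compressed": "true" if compress else "false",
    }
    elem = ET.Element("vector", attrib)
    # base64 output is plain ASCII, so the file stays well-formed XML.
    elem.text = base64.b64encode(raw).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def vector_from_xml(xml_text):
    """Inverse of vector_to_xml: parse the tag and unpack the floats."""
    elem = ET.fromstring(xml_text)
    raw = base64.b64decode(elem.text)
    if elem.get("compressed") == "true":
        raw = zlib.decompress(raw)
    count = int(elem.get("size"))
    return list(struct.unpack("<%dd" % count, raw))


if __name__ == "__main__":
    xml_text = vector_to_xml([0.0, 1.5, -2.25], compress=True)
    print(xml_text)
    print(vector_from_xml(xml_text))
```

The whole array is still decoded in one step, but base64 inflates the
payload by about a third compared with raw bytes, which is the price of
keeping the file parseable by an ordinary XML parser.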

Sincerely,
Victor Prosolin.



