dolfin team mailing list archive

Re: Status of parallel I/O

 


On 25/08/11 07:54, Anders Logg wrote:
> On Thu, Aug 25, 2011 at 07:45:49AM -0400, Garth N. Wells wrote:
> 
>>> Sure, but I would claim SAX scales better.
>>
>> In terms of memory, yes.
>>
>> It is not sufficiently scalable to be a total solution. Plus it's too slow.
> 
> Agree.
> 
>>> Wouldn't it be better to
>>> just use one of DOM or SAX?
>>
>> Maybe. A SAX implementation is considerably more complex. The new
>> implementation reserves this complexity for a possibly critical case and
>> localises the complexity of the code. The old code was very complex and
>> less localised.
>>
>> The locality means that it's no big deal to have a simple DOM
>> implementation for the majority of cases next to a more complex SAX
>> implementation for special cases. There is no point in the size and
>> complexity of a SAX parser for simple cases, e.g. reading parameter files.
> 
> Agree, but see below.
> 
>>> Either we use SAX all the way if it gives
>>> better performance than DOM,
>>
>> It doesn't give better performance. We discussed this before. Without
>> checking the archive, I recall that the DOM implementation was about 50
>> times faster for large data sets than the old SAX implementation.
>>
>>> or we use DOM all the way as a solution
>>> for medium sized problems and complement with HDF5 for large scale
>>> problems. Having DOM + SAX + HDF5 seems messy.
>>>
>>
>> This may happen, but the fact is that we don't have HDF5 in place yet.
>>
>>>>>>> What is the difference between XMLLocalMeshData and
>>>>>>> XMLLocalMeshDataDistributed etc.
>>>>>>>
>>>>>>
>>>>>> Initially I planned to use DOM for all, but as outlined above decided
>>>>>> after some testing to retain SAX for meshes (but update to SAX2, since
>>>>>> the libxml2 SAX parser is deprecated and has memory leaks). Hence,
>>>>>> XMLLocalMeshData uses DOM and XMLLocalMeshDataDistributed uses SAX. So
>>>>>> far I've kept the DOM version since it's easy to code and could be
>>>>>> useful when reading non-distributed meshes on each process (which may
>>>>>> differ on different processes).
>>>>>
>>>>> I don't understand the difference between XMLLocalMeshData and
>>>>> XMLLocalMeshDataDistributed. Is XMLLocalMeshDataDistributed doing now
>>>>> what XMLLocalMeshData did before?
>>>>>
>>>>
>>>> Yes, but updated to SAX2 (which was very painful).
>>>>
>>>> The 'new' XMLLocalMeshData is a DOM version. It could be removed.
>>>
>>> Or kept if we will add HDF5 anyway as a more scalable solution.
>>>
>>
>> Again, it may be desirable to keep a SAX parser for reading meshes in
>> parallel since a mesh is the most likely large data structure to be
>> created externally, and the most complex. HDF5 would require a user to
>> supply a binary mesh file rather than an XML file. Most other large data
>> sets are created internally, and then read and written. In this case,
>> HDF5 will be fine.
> 
> How about using DOM everywhere and reserving the use of SAX for an
> XML->HDF5 converter?
> 

That could be OK, but if we have to implement a SAX parser it's
probably easiest to have it in DOLFIN anyway. I don't see the advantage
over having the SAX parser with the io code.

Garth

> --
> Anders
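
To make the DOM/SAX trade-off discussed in this thread concrete, here is a minimal sketch in Python using only the standard library (xml.etree.ElementTree for the tree-based read, xml.sax for the event-based read). It is not DOLFIN code: the mesh layout is a simplified, hypothetical one loosely modelled on DOLFIN's XML format, and the names read_vertices_dom, read_vertices_sax, VertexRangeHandler and local_range are illustrative assumptions. The tree-based reader holds the whole document in memory, while the SAX reader streams the file and keeps only the vertices in the index range owned by one process, which is roughly the case argued for above when reading large meshes in parallel.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical, simplified mesh file (not the exact DOLFIN schema).
MESH_XML = """<dolfin>
  <mesh celltype="triangle" dim="2">
    <vertices size="4">
      <vertex index="0" x="0.0" y="0.0"/>
      <vertex index="1" x="1.0" y="0.0"/>
      <vertex index="2" x="1.0" y="1.0"/>
      <vertex index="3" x="0.0" y="1.0"/>
    </vertices>
    <cells size="2">
      <triangle index="0" v0="0" v1="1" v2="2"/>
      <triangle index="1" v0="0" v1="2" v2="3"/>
    </cells>
  </mesh>
</dolfin>
"""


def read_vertices_dom(xml_text):
    """Tree-based read: short to write, but the full tree is built in memory."""
    root = ET.fromstring(xml_text)
    return {
        int(v.get("index")): (float(v.get("x")), float(v.get("y")))
        for v in root.iter("vertex")
    }


class VertexRangeHandler(xml.sax.ContentHandler):
    """Event-based read: stores only vertices with begin <= index < end."""

    def __init__(self, begin, end):
        super().__init__()
        self.begin = begin
        self.end = end
        self.vertices = {}

    def startElement(self, name, attrs):
        # Called once per opening tag; everything except <vertex> is skipped.
        if name != "vertex":
            return
        index = int(attrs["index"])
        if self.begin <= index < self.end:
            self.vertices[index] = (float(attrs["x"]), float(attrs["y"]))


def read_vertices_sax(xml_text, begin, end):
    """Stream the document and return only the vertices in [begin, end)."""
    handler = VertexRangeHandler(begin, end)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.vertices


def local_range(num_items, rank, size):
    """Contiguous block of item indices owned by process `rank` out of `size`.

    A simple block partition chosen for illustration; the first
    `num_items % size` processes get one extra item.
    """
    n, r = divmod(num_items, size)
    begin = rank * n + min(rank, r)
    end = begin + n + (1 if rank < r else 0)
    return begin, end
```

With two processes, the process with rank 1 would call read_vertices_sax(MESH_XML, *local_range(4, 1, 2)) and store only vertices 2 and 3, whereas read_vertices_dom(MESH_XML) returns all four on every process. The sketch says nothing about speed; the relative timings quoted in the thread refer to the C++ implementations.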


