
dolfin team mailing list archive

Re: Status of parallel I/O

 


On 25/08/11 09:02, Anders Logg wrote:
> On Thu, Aug 25, 2011 at 08:00:17AM -0400, Garth N. Wells wrote:
>>
>>
>> On 25/08/11 07:54, Anders Logg wrote:
>>> On Thu, Aug 25, 2011 at 07:45:49AM -0400, Garth N. Wells wrote:
>>>
>>>>> Sure, but I would claim SAX scales better.
>>>>
>>>> In terms of memory, yes.
>>>>
>>>> It is not sufficiently scalable to be a total solution. Plus it's too slow.
>>>
>>> Agree.
>>>
>>>>> Wouldn't it be better to
>>>>> just use one of DOM or SAX?
>>>>
>>>> Maybe. A SAX implementation is considerably more complex. The new
>>>> implementation reserves this complexity for a possibly critical case and
>>>> localises the complexity of the code. The old code was very complex and
>>>> less localised.
>>>>
>>>> The locality means that it's no big deal to have a simple DOM
>>>> implementation for the majority of cases next to a more complex SAX
>>>> implementation for special cases. There is no point in the size and
>>>> complexity of a SAX parser for simple cases, e.g. reading parameter files.
>>>
>>> Agree, but see below.
>>>
>>>>> Either we use SAX all the way if it gives
>>>>> better performance than DOM,
>>>>
>>>> It doesn't give better performance. We discussed this before. Without
>>>> checking the archive, I recall that the DOM implementation was about 50
>>>> times faster for large data sets than the old SAX implementation.
>>>>
>>>>> or we use DOM all the way as a solution
>>>>> for medium sized problems and complement with HDF5 for large scale
>>>>> problems. Having DOM + SAX + HDF5 seems messy.
>>>>>
>>>>
>>>> This may happen, but the fact is that we don't have HDF5 in place yet.
>>>>
>>>>>>>>> What is the difference between XMLLocalMeshData and
>>>>>>>>> XMLLocalMeshDataDistributed etc.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Initially I planned to use DOM for all, but as outlined above decided
>>>>>>>> after some testing to retain SAX for meshes (but update to SAX2, since
>>>>>>>> the libxml2 SAX parser is deprecated and has memory leaks). Hence,
>>>>>>>> XMLLocalMeshData uses DOM and XMLLocalMeshDataDistributed uses SAX. So
>>>>>>>> far I've kept the DOM version since it's easy to code and could be
>>>>>>>> useful when reading non-distributed meshes on each process (which may
>>>>>>>> differ on different processes).
>>>>>>>
>>>>>>> I don't understand the difference between XMLLocalMeshData and
>>>>>>> XMLLocalMeshDataDistributed. Is XMLLocalMeshDataDistributed doing now
>>>>>>> what XMLLocalMeshData did before?
>>>>>>>
>>>>>>
>>>>>> Yes, but updated to SAX2 (which was very painful).
>>>>>>
>>>>>> The 'new' XMLLocalMeshData is a DOM version. It could be removed.
>>>>>
>>>>> Or kept if we will add HDF5 anyway as a more scalable solution.
>>>>>
>>>>
>>>> Again, it may be desirable to keep a SAX parser for reading meshes in
>>>> parallel since a mesh is the most likely large data structure to be
>>>> created externally, and the most complex. HDF5 would require a user to
>>>> supply a binary mesh file rather than an XML file. Most other large data
>>>> sets are created internally, and then read and written. In this case,
>>>> HDF5 will be fine.
>>>
>>> How about using DOM everywhere and reserving the use of SAX for an
>>> XML->HDF5 converter?
>>>
>>
>> That could be OK, but if we have to implement a SAX parser it's
>> probably easiest to have it in DOLFIN anyway. I don't see the advantage
>> over having the SAX parser with the io code.
> 
> I agree we should keep it in DOLFIN, but if the only thing it needs to
> do is extract data and spit out HDF5, I imagine it can be simpler than
> the current parser since it doesn't need to be parallel. (?)
>

To make things clearer, I've just renamed the LocalMeshData parsers to

  XMLLocalMeshDOM (was XMLLocalMeshData)

and

  XMLLocalMeshSAX (was XMLLocalMeshDataDistributed)

When XMLLocalMeshSAX is complete, it may be desirable to remove
XMLLocalMeshDOM.

I don't know what you mean by parallel - the XMLLocalMeshSAX works in
the same way as the old parser (each process reading a chunk). I don't
see how it can be made simpler by reading an XML file and then converting
to HDF5. The steps that are there now will all still be required to read
the XML mesh before writing an HDF5 file.
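
As a rough illustration of "each process reading a chunk", here is a minimal
sketch in Python using the standard library's xml.sax module. It is not the
DOLFIN code. The names local_range, LocalVertexHandler and
read_local_vertices are hypothetical, the even split of vertex indices is an
assumed partitioning, and the sample mesh is made up for the example. Every
process streams through the whole file but stores only the vertices in its
own index range, so memory use stays proportional to the local chunk.

```python
import xml.sax

# Made-up 2D mesh in a DOLFIN-style XML layout, used only for this sketch.
SAMPLE_MESH = b"""<?xml version="1.0"?>
<dolfin>
  <mesh celltype="triangle" dim="2">
    <vertices size="5">
      <vertex index="0" x="0.0" y="0.0"/>
      <vertex index="1" x="1.0" y="0.0"/>
      <vertex index="2" x="1.0" y="1.0"/>
      <vertex index="3" x="0.0" y="1.0"/>
      <vertex index="4" x="0.5" y="0.5"/>
    </vertices>
  </mesh>
</dolfin>
"""


def local_range(rank, num_processes, n):
    """Return the half-open range [start, stop) of the n items owned by rank."""
    base, extra = divmod(n, num_processes)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


class LocalVertexHandler(xml.sax.ContentHandler):
    """SAX handler that keeps only the vertices in this process's chunk."""

    def __init__(self, rank, num_processes):
        super().__init__()
        self.rank = rank
        self.num_processes = num_processes
        self.vertex_range = (0, 0)
        self.vertices = {}

    def startElement(self, name, attrs):
        if name == "vertices":
            # The global vertex count fixes the range owned by this process.
            n = int(attrs["size"])
            self.vertex_range = local_range(self.rank, self.num_processes, n)
        elif name == "vertex":
            index = int(attrs["index"])
            start, stop = self.vertex_range
            # Vertices outside the local range are seen but never stored.
            if start <= index < stop:
                self.vertices[index] = (float(attrs["x"]), float(attrs["y"]))


def read_local_vertices(xml_bytes, rank, num_processes):
    """Parse the mesh and return {global index: (x, y)} for this rank."""
    handler = LocalVertexHandler(rank, num_processes)
    xml.sax.parseString(xml_bytes, handler)
    return handler.vertices
```

In a real run the rank and process count would come from MPI. Writing HDF5
from this would still need the same parsing steps first, which is the point
above.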

Garth


> --
> Anders


