
dolfin team mailing list archive

Re: Status of parallel I/O

 


On 24/08/11 08:55, Kent-Andre Mardal wrote:
> 
> 
> On 24 August 2011 14:43, Garth N. Wells <gnw20@xxxxxxxxx
> <mailto:gnw20@xxxxxxxxx>> wrote:
> 
> 
> 
>     On 24/08/11 08:07, Anders Logg wrote:
>     > On Wed, Aug 24, 2011 at 07:22:53AM -0400, Garth N. Wells wrote:
>     >>
>     >>
>     >> On 24/08/11 03:50, Anders Logg wrote:
>     >>> What is the plan for XMLLocalMeshData (using the DOM interface) vs
>     >>> XMLLocalMeshDataDistributed (using the SAX interface)?
>     >>>
>     >>
>     >> Both for the time being.
>     >>
>     >>> Reading boundary indicators is currently failing with
>     >>>
>     >>> RuntimeError: *** Error: Inconsistent state in XML reader: 6.
>     >>>
>     >>> Should this be fixed in XMLLocalMeshDataDistributed or is the
>     >>> plan to replace it with XMLLocalMeshData?
>     >>>
>     >>
>     >> In XMLLocalMeshDataDistributed.
>     >
>     > Could you elaborate? The functionality for reading and distributing
>     > boundary markers (in parallel) is currently broken and we want to fix
>     > it. But we need to know more about the design. I don't want to fix
>     > something if you decide to break it 5 min later.
>     >
> 
>     It never 'properly' worked in parallel. There were some messy ad hoc
>     changes made on top of functions that were planned for overhaul. I made
>     clear before this that parallel functionality was being sorted out (it's
>     not just in io, but also partitioning, etc), so the fact that it's not
>     working now should not be a surprise.
> 
>     There is a lot of missing functionality in parallel, so patience is
>     required to get things done properly.
> 
> 
> It worked! And there was a unit test there that tested that it worked. 
> However, you use the term 'properly' here, probably to indicate something. 
> 

I could have written that it 'ran' ;).

It also wasn't a unit test - it was just one of the regression tests
placed in the unit test directory.

> What is the way forward here other than patience? 

It won't be long - hopefully in the next month.

> It seems to me that 
> the only thing that is actually broken is the parallel io. 
> 

If it was just io, that would be easy. The more challenging part is to
address the global-local numbering in the Mesh class, and distributing
this data around. The old code that 'ran' had ad hoc lines added in the
partitioning code. The partitioning function was one huge function that
failed for some cases (like graphs with no vertices on a process). It's
been straightened out and broken up to a degree, and bug fixes made,
but this involved removing the ad hoc lines.
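
To make the global-local numbering issue concrete, here is a minimal sketch in plain Python. It is not DOLFIN code and uses no MPI; the names `build_numbering`, `localise_cells` and `partition` are hypothetical, and the three "processes" are just dictionary entries. It only illustrates the bookkeeping: each process maps the global vertex indices it holds to a contiguous local range, and the maps must still be well defined when a process receives nothing.

```python
# Illustrative only: plain Python, no MPI and no DOLFIN calls.
# build_numbering, localise_cells and partition are hypothetical names.

def build_numbering(held_global_indices):
    """Return (local_to_global, global_to_local) for one process."""
    local_to_global = sorted(held_global_indices)
    global_to_local = {g: l for l, g in enumerate(local_to_global)}
    return local_to_global, global_to_local


def localise_cells(cells, global_to_local):
    """Rewrite cell connectivity from global to local vertex indices."""
    return [tuple(global_to_local[v] for v in cell) for cell in cells]


# Two triangles sharing an edge, split over three pretend processes.
# Process 2 deliberately holds no vertices and no cells.
partition = {
    0: {"vertices": {0, 1, 2}, "cells": [(0, 1, 2)]},
    1: {"vertices": {1, 2, 3}, "cells": [(1, 3, 2)]},
    2: {"vertices": set(), "cells": []},
}

results = {}
for rank, part in partition.items():
    l2g, g2l = build_numbering(part["vertices"])
    results[rank] = {
        "local_to_global": l2g,
        "local_cells": localise_cells(part["cells"], g2l),
    }

for rank, data in results.items():
    print(rank, data)
```

Vertices 1 and 2 appear on both process 0 and process 1 with different local indices, which is the kind of data that has to be communicated consistently. The empty process simply gets empty maps rather than needing a special case.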

Garth

> 
>  
>  
> 
> 
>     > Should we continue to use libxml2? Why not use the DOM parsing all the
>     > way?
>     >
> 
>     Because I'm inclined to keep SAX parsing for meshes since meshes are the
>     most likely to be created externally, and need to be scalable for
>     reading. Other objects (e.g., vectors) are likely to be created and
>     written by DOLFIN, so will eventually use parallel HDF5 for scalable
>     parallel io.
> 
>     > What is the difference between XMLLocalMeshData and
>     > XMLLocalMeshDataDistributed etc.
>     >
> 
>     Initially I planned to use DOM for all, but as outlined above decided
>     after some testing to retain SAX for meshes (but update to SAX2, since
>     the libxml2 SAX parser is deprecated and has memory leaks). Hence,
>     XMLLocalMeshData uses DOM and XMLLocalMeshDataDistributed uses SAX. So
>     far I've kept the DOM version since it's easy to code and could be
>     useful when reading non-distributed meshes on each process (which may
>     differ on different processes).
> 
>     Garth
> 
>     > --
>     > Anders
>     >
> 
> 
> 
> 
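
The SAX versus DOM trade-off discussed in the quoted text can be sketched with the Python standard library. This is only an illustration, not the DOLFIN implementation (which is C++ on libxml2). The XML layout below is a simplified assumption about the mesh file format, and `VertexRangeHandler`, `read_vertex_range_sax` and `read_all_vertices_dom` are hypothetical names. The SAX reader streams through the file and keeps only the vertices in a requested index range, as one process might when reading its share, while the DOM reader builds the whole tree in memory first.

```python
import xml.sax
from xml.dom import minidom

# Assumed, simplified mesh layout for illustration only.
MESH_XML = """<?xml version="1.0"?>
<dolfin>
  <mesh celltype="triangle" dim="2">
    <vertices size="4">
      <vertex index="0" x="0.0" y="0.0"/>
      <vertex index="1" x="1.0" y="0.0"/>
      <vertex index="2" x="0.0" y="1.0"/>
      <vertex index="3" x="1.0" y="1.0"/>
    </vertices>
    <cells size="2">
      <triangle index="0" v0="0" v1="1" v2="2"/>
      <triangle index="1" v0="1" v1="3" v2="2"/>
    </cells>
  </mesh>
</dolfin>
"""


class VertexRangeHandler(xml.sax.ContentHandler):
    """Streaming handler that stores only vertices with begin <= index < end."""

    def __init__(self, begin, end):
        super().__init__()
        self.begin = begin
        self.end = end
        self.vertices = {}

    def startElement(self, name, attrs):
        if name != "vertex":
            return
        index = int(attrs["index"])
        if self.begin <= index < self.end:
            self.vertices[index] = (float(attrs["x"]), float(attrs["y"]))


def read_vertex_range_sax(xml_text, begin, end):
    """SAX: memory use grows with the requested range, not the file size."""
    handler = VertexRangeHandler(begin, end)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.vertices


def read_all_vertices_dom(xml_text):
    """DOM: the whole document is held in memory before anything is read."""
    doc = minidom.parseString(xml_text)
    return {
        int(v.getAttribute("index")): (
            float(v.getAttribute("x")),
            float(v.getAttribute("y")),
        )
        for v in doc.getElementsByTagName("vertex")
    }


sax_slice = read_vertex_range_sax(MESH_XML, 2, 4)
dom_all = read_all_vertices_dom(MESH_XML)
print(sax_slice)
print(len(dom_all))
```

The DOM version is shorter to write, which matches the remark that it is easy to code. The streaming version is the one that stays practical for large, externally generated meshes.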


