dolfin team mailing list archive

Re: Status of parallel I/O

 

On Fri, Aug 26, 2011 at 07:11:11AM -0400, Garth N. Wells wrote:
>
>
> On 26/08/11 02:39, Anders Logg wrote:
> > On Thu, Aug 25, 2011 at 05:57:51PM -0400, Garth N. Wells wrote:
> >>
> >>
> >> On 25/08/11 16:53, Anders Logg wrote:
> >>> On Thu, Aug 25, 2011 at 09:59:44AM -0400, Garth N. Wells wrote:
> >>>
> >>>>>>> How about using DOM everywhere and reserving the use of SAX for an
> >>>>>>> XML->HDF5 converter?
> >>>>>>>
> >>>>>>
> >>>>>> That could be OK, but if we have to implement a SAX parser it's
> >>>>>> probably easiest to have it in DOLFIN anyway. I don't see the advantage
> >>>>>> over having the SAX parser with the io code.
> >>>>>
> >>>>> I agree we should keep it in DOLFIN, but if the only thing it needs to
> >>>>> do is extract data and spit out HDF5, I imagine it can be simpler than
> >>>>> the current parser since it doesn't need to be parallel. (?)
> >>>>>
> >>>>
> >>>> To make things clearer, I've just renamed the LocalMeshData parsers to
> >>>>
> >>>>   XMLLocalMeshDOM (was XMLLocalMeshData)
> >>>>
> >>>> and
> >>>>
> >>>>   XMLLocalMeshSAX (was XMLLocalMeshDataDistributed)
> >>>
> >>> That's good.
> >>>
> >>>> When XMLLocalMeshSAX is complete, it may be desirable to remove
> >>>> XMLLocalMeshDOM.
> >>>
> >>> Either way is fine for me, as long as we decide which one to use. I
> >>> initially wanted to use SAX (as before) but the DOM looks easier and
> >>> may be enough if we plan to use HDF5 for large-scale problems anyway.
> >>> Or is it the case that DOM is a limitation even for medium sized
> >>> problems?
> >>>
> >>
> >> It works for 'medium' (very arbitrary) size problems.
> >>
> >>>> I don't know what you mean by parallel - the XMLLocalMeshSAX works in
> >>>> the same way as the old parser (each process reading a chunk). I don't
> >>>> see how it can be made simpler by reading an XML file and then converting
> >>>> to HDF5. The steps that are there now will all still be required to read
> >>>> the XML mesh before writing an HDF5 file.
> >>>
> >>> I don't know HDF, but I imagine one could write one single file and
> >>> HDF will handle parallel parsing of that file later. Then the
> >>> conversion script we write does not need to do anything parallel, just
> >>> read line by line and convert from one format to another.
> >>>
> >>
> >> It may not be possible to do it line-by-line (I don't know, but I
> >> wouldn't want to bank on it). Even if line-by-line is technically
> >> possible, it could turn out to be terribly slow. We should support
> >> reading a mesh into (distributed) memory and writing it to HDF5.
> >>
> >> Since we'll have support for writing HDF5 meshes, if we can read a large
> >> XML mesh then we can re-use the HDF5 output code to make the conversion.
> >>
> >> I've removed the DOM-based LocalMeshData parser - there is no point to
> >> it since we can just read the mesh on one process using XMLMesh and use
> >> it to construct a dolfin::XMLLocalMeshData object.
> >
> > ok, looks good.
> >
> > How should we store boundary indicators? I'm not sure whether it needs
> > to be stored as part of ParallelData. Is it really "parallel data"?
> > ParallelData will for sure need to be used to compute it (convert
> > somehow from the input) but it seems it can then be stored
> > locally.
>
> OK. It's not really parallel data (but perhaps ParallelData should be
> renamed).
>
> > Each facet just needs to know its indicator value.
> >
> > The input is a list of triples:
> >
> >   (indicator, facet_cell, facet_number)
> >
> > This indicates that local facet number `facet_number` of the cell
> > `facet_cell` should have the indicator value (sub domain number)
> > `indicator`.
> >
>
> Let's make it generic:
>
>   (parent_cell_index, entity_dim, local_entity_index, indicator/value)

Yes, that looks good. What about the XML format? It becomes unwieldy
to store it as 4 different MeshFunctions. Here's an initial sketch:

<mesh>
  <!-- cells and vertices here as before -->
  <data>
    <!-- user data here as before -->
  </data>
  <indicators dim="...">
    <indicator cell="..." local_entity_index="..." value="..."/>
    <indicator cell="..." local_entity_index="..." value="..."/>
    <indicator cell="..." local_entity_index="..." value="..."/>
    <indicator cell="..." local_entity_index="..." value="..."/>
  </indicators>
  <indicators dim="...">
    <indicator cell="..." local_entity_index="..." value="..."/>
    <indicator cell="..." local_entity_index="..." value="..."/>
    <indicator cell="..." local_entity_index="..." value="..."/>
    <indicator cell="..." local_entity_index="..." value="..."/>
  </indicators>
  ...
</mesh>
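
As a minimal sketch of how such a file could be read back into the
generic (parent_cell_index, entity_dim, local_entity_index, value)
tuples, the following uses Python's standard xml.etree.ElementTree.
It assumes each <indicator> element is written as a self-closing tag
so the file is well-formed XML, and that all attributes are integers.
The function name read_indicators and the sample values are
hypothetical and not part of DOLFIN.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the sketched format; the values are made up.
SAMPLE = """\
<mesh>
  <indicators dim="2">
    <indicator cell="0" local_entity_index="1" value="3"/>
    <indicator cell="4" local_entity_index="0" value="3"/>
  </indicators>
  <indicators dim="1">
    <indicator cell="2" local_entity_index="5" value="7"/>
  </indicators>
</mesh>
"""


def read_indicators(xml_text):
    """Return a list of (cell, dim, local_entity_index, value) tuples."""
    root = ET.fromstring(xml_text)
    result = []
    # One <indicators> block per topological dimension.
    for block in root.findall("indicators"):
        dim = int(block.get("dim"))
        for item in block.findall("indicator"):
            result.append((
                int(item.get("cell")),
                dim,
                int(item.get("local_entity_index")),
                int(item.get("value")),
            ))
    return result


if __name__ == "__main__":
    for entry in read_indicators(SAMPLE):
        print(entry)
```

This reads the whole document at once (DOM style), so it only
illustrates the layout; it says nothing about how a SAX or parallel
reader would handle the same data.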

> It could go into MeshData (possibly with what's in ParallelData), and
> the current MeshData could be renamed to something like 'UserMeshData'.

I think it's better to keep the name MeshData for user-defined data
(and internal DOLFIN data stored there while waiting for a proper place
to store it). mesh.data() is used in many places in user code.

Each time we decide to amend the Mesh class with new data, it would be
better to add a proper class to hold it, like ParallelData (possibly
renamed). How about a new class called "MeshIndicators" to hold mesh
indicators? It would need to handle initialization from various
sources of input data, in particular MeshFunctions, which are then
converted to some proper internal representation. The MeshIndicators
class should be "parallel aware" and not need any special extras in
ParallelData.
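
To make the idea concrete, here is a rough serial sketch in Python of
what such a container might look like. Everything in it is an
assumption for illustration: the method names, the per-dimension
dictionary storage, and the constructors are hypothetical and are not
the DOLFIN API. It does not address the "parallel aware" part at all.

```python
class MeshIndicators:
    """Hypothetical container for mesh indicators (illustration only)."""

    def __init__(self):
        # Maps entity dimension -> {(cell, local_entity_index): value}.
        self._values = {}

    def set_value(self, cell, dim, local_entity_index, value):
        self._values.setdefault(dim, {})[(cell, local_entity_index)] = value

    def value(self, cell, dim, local_entity_index, default=None):
        """Look up an indicator; return `default` if none is stored."""
        return self._values.get(dim, {}).get((cell, local_entity_index), default)

    @classmethod
    def from_tuples(cls, tuples):
        """Build from (cell, dim, local_entity_index, value) tuples."""
        indicators = cls()
        for cell, dim, local_entity_index, value in tuples:
            indicators.set_value(cell, dim, local_entity_index, value)
        return indicators

    @classmethod
    def from_facet_triples(cls, triples, facet_dim):
        """Build from (indicator, facet_cell, facet_number) triples."""
        return cls.from_tuples(
            (cell, facet_dim, number, indicator)
            for indicator, cell, number in triples
        )


if __name__ == "__main__":
    # Two boundary facets of a tetrahedral mesh (facet dimension 2).
    ind = MeshIndicators.from_facet_triples([(3, 0, 1), (5, 2, 0)], facet_dim=2)
    print(ind.value(0, 2, 1))
```

Keying on (cell, local_entity_index) mirrors the input format directly;
a real implementation would probably convert to global entity indices
once the mesh connectivity is available.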

--
Anders

