
dolfin team mailing list archive

Re: Status of parallel I/O

 

On Fri, Aug 26, 2011 at 06:54:31PM -0700, Garth N. Wells wrote:
>
>
> On 26/08/11 07:27, Anders Logg wrote:
> > On Fri, Aug 26, 2011 at 07:11:11AM -0400, Garth N. Wells wrote:
> >>
> >>
> >> On 26/08/11 02:39, Anders Logg wrote:
> >>> On Thu, Aug 25, 2011 at 05:57:51PM -0400, Garth N. Wells wrote:
> >>>>
> >>>>
> >>>> On 25/08/11 16:53, Anders Logg wrote:
> >>>>> On Thu, Aug 25, 2011 at 09:59:44AM -0400, Garth N. Wells wrote:
> >>>>>
> >>>>>>>>> How about using DOM everywhere and reserving the use of SAX for an
> >>>>>>>>> XML->HDF5 converter?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> That could be OK, but if we have to implement a SAX parser it's
> >>>>>>>> probably easiest to have it in DOLFIN anyway. I don't see the advantage
> >>>>>>>> over having the SAX parser with the io code.
> >>>>>>>
> >>>>>>> I agree we should keep it in DOLFIN, but if the only thing it needs to
> >>>>>>> do is extract data and spit out HDF5, I imagine it can be simpler than
> >>>>>>> the current parser since it doesn't need to be parallel. (?)
> >>>>>>>
> >>>>>>
> >>>>>> To make things clearer, I've just renamed the LocalMeshData parsers to
> >>>>>>
> >>>>>>   XMLLocalMeshDOM (was XMLLocalMeshData)
> >>>>>>
> >>>>>> and
> >>>>>>
> >>>>>>   XMLLocalMeshSAX (was XMLLocalMeshDataDistributed)
> >>>>>
> >>>>> That's good.
> >>>>>
> >>>>>> When XMLLocalMeshSAX is complete, it may be desirable to remove
> >>>>>> XMLLocalMeshDOM.
> >>>>>
> >>>>> Either way is fine for me, as long as we decide which one to use. I
> >>>>> initially wanted to use SAX (as before) but the DOM looks easier and
> >>>>> may be enough if we plan to use HDF5 for large-scale problems anyway.
> >>>>> Or is it the case that DOM is a limitation even for medium sized
> >>>>> problems?
> >>>>>
> >>>>
> >>>> It works for 'medium' (very arbitrary) size problems.
> >>>>
> >>>>>> I don't know what you mean by parallel - the XMLLocalMeshSAX works in
> >>>>>> the same way as the old parser (each process reading a chunk). I don't
> >>>>>> see how it can be made simpler by reading an XML file and then converting
> >>>>>> to HDF5. The steps that are there now will all still be required to read
> >>>>>> the XML mesh before writing an HDF file.
> >>>>>
> >>>>> I don't know HDF, but I imagine one could write one single file and
> >>>>> HDF will handle parallel parsing of that file later. Then the
> >>>>> conversion script we write does not need to do anything parallel, just
> >>>>> read line by line and convert from one format to another.
> >>>>>
> >>>>
> >>>> It may not be possible to do it line-by-line (I don't know, but I
> >>>> wouldn't want to bank on it). Even if line-by-line is technically
> >>>> possible, it could turn out to be terribly slow. We should support reading
> >>>> a mesh into memory (distributed) and writing it to HDF5.
> >>>>
> >>>> Since we'll have support for writing HDF5 meshes, if we can read a large
> >>>> XML mesh then we can re-use the HDF5 output code to make the conversion.
> >>>>
> >>>> I've removed the DOM-based LocalMeshData parser - there is no point to
> >>>> it since we can just read the mesh on one process using XMLMesh and use
> >>>> it to construct a dolfin::XMLLocalMeshData object.
> >>>
> >>> ok, looks good.
> >>>
> >>> How should we store boundary indicators? I'm not sure whether it needs
> >>> to be stored as part of ParallelData. Is it really "parallel data"?
> >>> ParallelData will for sure need to be used to compute it (convert
> >>> somehow from the input) but it seems it can then be stored
> >>> locally.
> >>
> >> OK. It's not really parallel data (but perhaps ParallelData should be
> >> renamed).
> >>
> >>> Each facet just needs to know its indicator value.
> >>>
> >>> The input is a list of triples:
> >>>
> >>>   (indicator, facet_cell, facet_number)
> >>>
> >>> This indicates that local facet number `facet_number` of the cell
> >>> `facet_cell` should have the indicator value (sub domain number)
> >>> `indicator`.
> >>>
> >>
> >> Let's make it generic:
> >>
> >>   (parent_cell_index, entity_dim, local_entity_index, indicator/value)
> >
> > Yes, that looks good. What about the XML format? It becomes unwieldy
> > to store it as 4 different MeshFunctions. Here's an initial sketch:
> >
> > <mesh>
> >   # cells and vertices here as before
> >   <data>
> >     # user data here as before
> >   </data>
> >   <indicators dim="...">
> >     <indicator cell="..." local_entity_index="..." value="..." />
> >     <indicator cell="..." local_entity_index="..." value="..." />
> >     <indicator cell="..." local_entity_index="..." value="..." />
> >     <indicator cell="..." local_entity_index="..." value="..." />
> >   </indicators>
> >   <indicators dim="...">
> >     <indicator cell="..." local_entity_index="..." value="..." />
> >     <indicator cell="..." local_entity_index="..." value="..." />
> >     <indicator cell="..." local_entity_index="..." value="..." />
> >     <indicator cell="..." local_entity_index="..." value="..." />
> >   </indicators>
> >   ...
> > </mesh>
> >
>
> That looks OK, except for the name 'indicator(s)'.
>
> Essentially, it is a MeshFunction that is defined only on a
> subset of entities of a given dimension. I think we should template it
> on the C++ side so that any data can be attached. We should then have
>
>    <indicators dim="..." type="...">
>
> (but with something other than 'indicators'). Internally, there may be
> no need to construct a MeshFunction.
>
>
> >> It could go into MeshData (possibly with what's in ParallelData), and
> >> the current MeshData could be renamed to something like 'UserMeshData'.
> >
> > I think it's better to keep the name MeshData for user-defined data
> > (and internal DOLFIN data stored there in waiting for a proper place
> > to store it). mesh.data() is used in many places in user code.
> >
> > Each time we decide to amend the Mesh class with new data, it would be
> > better to add a proper class to hold it, like ParallelData (possibly
> > renamed). How about a new class called "MeshIndicators" to hold mesh
> > indicators? It would need to handle initialization from various
> > sources of input data, in particular MeshFunctions, which is then
> > converted to some proper internal representation. The MeshIndicators
> > class should be "parallel aware" and not need any special extras in
> > ParallelData.
> >
>
> All fine if we can find a more appropriate name than 'Indicator'.

How about SubsetFunction and we template it the same way as we do
MeshFunctions?

Or SubsetIndicators.
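
For illustration only, here is a minimal sketch of what such a templated
container could look like. The class name SubsetFunction comes from the
suggestion above; everything else (the key layout, the method names
set_value, find and value, and the use of std::map) is hypothetical and
is not existing DOLFIN API. It stores the generic tuple
(parent_cell_index, entity_dim, local_entity_index, value) discussed
earlier, with entity_dim fixed per object.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical sketch (not DOLFIN API): values of type T attached to a
// subset of the mesh entities of one topological dimension.
template <typename T>
class SubsetFunction
{
public:
  // An entity is identified by (parent_cell_index, local_entity_index)
  typedef std::pair<std::size_t, std::size_t> Key;

  explicit SubsetFunction(std::size_t dim) : _dim(dim) {}

  // Topological dimension of the entities that carry values
  std::size_t dim() const { return _dim; }

  // Number of entities that have a value attached
  std::size_t size() const { return _values.size(); }

  // Attach a value to local entity 'local_entity_index' of cell 'cell'
  void set_value(std::size_t cell, std::size_t local_entity_index,
                 const T& value)
  { _values[Key(cell, local_entity_index)] = value; }

  // Return a pointer to the stored value, or a null pointer if the
  // entity is not part of the subset
  const T* find(std::size_t cell, std::size_t local_entity_index) const
  {
    typename std::map<Key, T>::const_iterator it
      = _values.find(Key(cell, local_entity_index));
    return it == _values.end() ? nullptr : &it->second;
  }

  // Return the stored value; the entity must be part of the subset
  const T& value(std::size_t cell, std::size_t local_entity_index) const
  {
    const T* v = find(cell, local_entity_index);
    assert(v && "entity has no value attached");
    return *v;
  }

private:
  std::size_t _dim;
  std::map<Key, T> _values;
};
```

With this sketch, boundary indicators on the facets of a tetrahedral
mesh would be a SubsetFunction<unsigned int> of dimension 2, filled by
calls such as f.set_value(facet_cell, facet_number, indicator). Keying
on the parent cell rather than on a global entity index means the
entities of that dimension would not need to be numbered first; how
this should interact with the parallel distribution is left open here.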

--
Anders

