dolfin team mailing list archive

Re: [Branch ~dolfin-core/dolfin/main] Rev 4635: Work on reading Vectors in parallel. Some issues to resolve still.

To: "Garth N. Wells" <gnw20@xxxxxxxxx>
From: Anders Logg <logg@xxxxxxxxx>
Date: Sun, 14 Mar 2010 21:33:04 +0100
Cc: DOLFIN Mailing List <dolfin@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <4B9D0972.8070906@cam.ac.uk>
User-agent: Mutt/1.5.20 (2009-06-14)

On Sun, Mar 14, 2010 at 04:06:10PM +0000, Garth N. Wells wrote:
>
>
> Anders Logg wrote:
> > On Sun, Mar 14, 2010 at 03:42:29PM +0000, Garth N. Wells wrote:
> >>
> >> Anders Logg wrote:
> >>> On Sun, Mar 14, 2010 at 08:45:39AM +0000, Garth N. Wells wrote:
> >>>> Anders Logg wrote:
> >>>>> On Sun, Mar 14, 2010 at 08:35:32AM +0000, Garth N. Wells wrote:
> >>>>>> Anders Logg wrote:
> >>>>>>> On Sun, Mar 14, 2010 at 07:39:45AM +0000, Garth N. Wells wrote:
> >>>>>>>> Anders Logg wrote:
> >>>>>>>>> On Fri, Mar 12, 2010 at 06:58:22PM -0000, noreply@xxxxxxxxxxxxx wrote:
> >>>>>>>>>> ------------------------------------------------------------
> >>>>>>>>>> revno: 4635
> >>>>>>>>>> committer: Garth N. Wells <gnw20@xxxxxxxxx>
> >>>>>>>>>> branch nick: dolfin-all
> >>>>>>>>>> timestamp: Fri 2010-03-12 18:53:05 +0000
> >>>>>>>>>> message:
> >>>>>>>>>>   Work on reading Vectors in parallel. Some issues to resolve still.
> >>>>>>>>>>
> >>>>>>>>>>   Some issues:
> >>>>>>>>>>   - How should files be named when in parallel?
> >>>>>>>>>>   - Should we have a 'master' xml file which points to the files
> >>>>>>>>>>     from different processes?
> >>>>>>>>> I think this should be done in the same way as for Meshes. We
> >>>>>>>>> discussed the following design:
> >>>>>>>>>
> >>>>>>>>> 1. Reading a single file "foo.xml" results in each process reading the
> >>>>>>>>> entire file but skipping data located on another process as determined
> >>>>>>>>> by local_range. This is what is implemented now for meshes (followed
> >>>>>>>>> by communication and mesh partitioning). The difference for vectors
> >>>>>>>>> would be that no extra communication is necessary.
> >>>>>>>>>
> >>>>>>>> OK.
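
A rough sketch of option 1, for illustration only. It assumes a contiguous block partition; the helper names `local_range` and `read_local_entries` are hypothetical here and are not the DOLFIN API. Every process walks all entries of the file and keeps only those whose global index falls inside its own range, so no communication is needed afterwards.

```python
def local_range(global_size, num_processes, rank):
    """Return the half-open range [begin, end) owned by `rank`.

    Assumption: a contiguous block partition where the first
    `global_size % num_processes` processes get one extra entry.
    """
    base, extra = divmod(global_size, num_processes)
    begin = rank * base + min(rank, extra)
    end = begin + base + (1 if rank < extra else 0)
    return begin, end


def read_local_entries(entries, global_size, num_processes, rank):
    """Scan all (global_index, value) pairs and keep only the local ones."""
    begin, end = local_range(global_size, num_processes, rank)
    return [value for index, value in entries if begin <= index < end]


# Stand-in for the contents of a single "foo.xml" holding 10 values.
entries = [(i, float(i) * 0.5) for i in range(10)]
local_values = read_local_entries(entries, 10, 3, rank=1)
```

With 10 entries on 3 processes, rank 1 owns indices 4 to 6 under this partition.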
> >>>>>>>>
> >>>>>>>>> 2. Reading a set of files "foo*.xml" results in each process reading
> >>>>>>>>> its portion stored in "foo%d.xml" % p. The File interface then needs
> >>>>>>>>> to check for the occurrence of '*' and figure out the correct file name
> >>>>>>>>> based on its process number.
> >>>>>>>>>
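
Option 2 could be sketched along these lines. The function name `resolve_filename` is made up for the example, and rejecting more than one '*' is an assumption of this sketch rather than anything agreed in the thread.

```python
def resolve_filename(pattern, rank):
    """Map a pattern such as "foo*.xml" to the file for process `rank`.

    A name without '*' is returned unchanged (the single-file case).
    """
    if "*" not in pattern:
        return pattern
    if pattern.count("*") > 1:
        # Assumption: only one wildcard is meaningful for a process number.
        raise ValueError("expected at most one '*' in %r" % pattern)
    return pattern.replace("*", str(rank))


name_for_rank_3 = resolve_filename("foo*.xml", 3)
single_file = resolve_filename("foo.xml", 3)
```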
> >>>>>>>> I think that there are a number of advantages to having a single .xml that
> >>>>>>>> points to the 'sub-files'. An obvious advantage is that we won't need to
> >>>>>>>> distinguish between cases 1 and 2 when reading in a vector.
> >>>>>>>>
> >>>>>>>> Garth
> >>>>>>> I don't feel strongly about either option, but if we go for the
> >>>>>>> master-file/sub-file design I think we should do the same for vectors
> >>>>>>> and meshes.
> >>>>>>>
> >>>>>>> The master file could look something like this for vectors:
> >>>>>>>
> >>>>>>>   <distributed_vector size="1024" num_partitions="16">
> >>>>>>>     <sub_vector partition="0" file="foo_0.xml" offset="0"/>
> >>>>>>>     <sub_vector partition="1" file="foo_1.xml" offset="64"/>
> >>>>>>>     <sub_vector partition="2" file="foo_2.xml" offset="128"/>
> >>>>>>>     ...
> >>>>>>>   </distributed_vector>
> >>>>>>>
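
To make the proposal concrete, here is a minimal sketch of reading such a master file with Python's standard `xml.etree.ElementTree`. The three-partition sample (size 192) is invented so that the example is complete, `parse_master` is a hypothetical helper, and the attribute names are those of the draft above, which is only a proposal at this point.

```python
import xml.etree.ElementTree as ET

# Invented sample following the proposed layout (not a real DOLFIN file).
MASTER = """\
<distributed_vector size="192" num_partitions="3">
  <sub_vector partition="0" file="foo_0.xml" offset="0"/>
  <sub_vector partition="1" file="foo_1.xml" offset="64"/>
  <sub_vector partition="2" file="foo_2.xml" offset="128"/>
</distributed_vector>
"""


def parse_master(text):
    """Return (global_size, [(partition, file, offset), ...])."""
    root = ET.fromstring(text)
    if root.tag != "distributed_vector":
        raise ValueError("not a distributed_vector master file")
    subs = sorted(
        (int(e.get("partition")), e.get("file"), int(e.get("offset")))
        for e in root.findall("sub_vector")
    )
    # Consistency check: the declared count must match the listed entries.
    if len(subs) != int(root.get("num_partitions")):
        raise ValueError("num_partitions does not match the sub_vector count")
    return int(root.get("size")), subs


size, subs = parse_master(MASTER)
```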
> >>>>>> Looks good, except 'offset' should be 'size', or 'local_size'.
> >>>>> Yes, but then maybe it's not needed since the local size will be
> >>>>> available in the local files (which can be standard XML vector data).
> >>>>>
> >>>>> But then won't the master files always be trivial? The only extra
> >>>>> information that is contained in the master file is the total size,
> >>>>> and the number of partitions (which will only be used to check that it
> >>>>> matches the actual number of processes).
> >>>>>
> >>>> The master file is the definitive file. Say a program is run with 4
> >>>> processes, and then with 2.  The files vector_0.xml, vector_1.xml,
> >>>> vector_2.xml and vector_3.xml will be floating around, but which files
> >>>> make up the vector? The master file will point to vector_0.xml and
> >>>> vector_1.xml.
> >>> I don't understand how that would work. Would it repartition the
> >>> entire vector or just use the first two?
> >>>
> >> It would read the first two. What the program does with them from that
> >> point onwards is a separate issue.
> >
> > That seems like a strange situation. Will that ever happen? (Storing
> > data from n processes and then reading back a subset on m < n
> > processes.)
> >
>
> It could very well happen, for example reading data in on one process to
> manipulate it, restart a computation with a different number of
> processes, etc.

It sounds strange. If one process should read in some specific data,
then it can just access foo_p.xml directly (without working through
some master file).

> >>>> Also, there should be no need to check that the number of 'partitions'
> >>>> matches the number of processes.
> >>> That seems to be the only real use of having a master file, at least
> >>> the only extra information contained in the master file and not
> >>> contained in the local files.
> >>>
> >> The master file *defines* which files are the sub files. For example, a
> >> collection of .xml files could be read by a single process program, just
> >> like ParaView does.
> >
> > Yes, but those files will most likely always have the same numbering
> > scheme (if stored from DOLFIN), something like foo_1.xml, foo_2.xml
> > etc. Then we might as well do "foo_*.xml".
> >
>
> That's not my point. If I have a directory full of foo_*.xml how can I
> know which ones make up the vector? It is precisely analogous to VTK. My
> directory can be full of .vtu files, but by opening .pvd I can always
> correctly visualise a result.

Yes, that's a good point.
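
A small sketch of the difference, using the 4-process run followed by a 2-process run from earlier in the thread. The directory listing is simulated with a plain list, and `files_from_glob` and `files_from_master` are hypothetical names used only for this illustration.

```python
import fnmatch

# Files left in the directory: four from the old run, of which the first
# two were overwritten by the later 2-process run.
directory = ["vector_0.xml", "vector_1.xml", "vector_2.xml", "vector_3.xml"]

# What a master file written by the 2-process run would list.
master_entries = ["vector_0.xml", "vector_1.xml"]


def files_from_glob(names, pattern):
    """Everything matching the pattern, stale files included."""
    return sorted(fnmatch.filter(names, pattern))


def files_from_master(names, entries):
    """Exactly the files the master lists; fail if one is missing."""
    missing = [n for n in entries if n not in names]
    if missing:
        raise FileNotFoundError("missing sub-files: %s" % ", ".join(missing))
    return list(entries)


by_glob = files_from_glob(directory, "vector_*.xml")
by_master = files_from_master(directory, master_entries)
```

The glob picks up all four files, while the master file restricts the vector to the two that actually belong to it.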

--
Anders
