fenics team mailing list archive

Re: Docstrings etc

 

On Thu, Aug 26, 2010 at 10:34:02PM +0200, Kristian Ølgaard wrote:
> On 26 August 2010 22:13, Anders Logg <logg@xxxxxxxxx> wrote:
> > On Thu, Aug 26, 2010 at 10:09:16PM +0200, Kristian Ølgaard wrote:
> >> On 26 August 2010 22:04, Anders Logg <logg@xxxxxxxxx> wrote:
> >> > On Thu, Aug 26, 2010 at 09:34:01PM +0200, Kristian Ølgaard wrote:
> >> >> On 26 August 2010 20:35, Anders Logg <logg@xxxxxxxxx> wrote:
> >> >> > On Thu, Aug 26, 2010 at 08:16:41PM +0200, Anders Logg wrote:
> >> >> >> On Thu, Aug 26, 2010 at 08:09:56PM +0200, Kristian Ølgaard wrote:
> >> >> >> > On 26 August 2010 19:51, Anders Logg <logg@xxxxxxxxx> wrote:
> >> >> >> > > On Thu, Aug 26, 2010 at 07:42:35PM +0200, Kristian Ølgaard wrote:
> >> >> >> > >> On 26 August 2010 18:22, Anders Logg <logg@xxxxxxxxx> wrote:
> >> >> >> > >> > I've thought some more on how to organize/synchronize the FEniCS
> >> >> >> > >> > documentation (in fenics-doc) with the documentation we have in the
> >> >> >> > >> > code.
> >> >> >> > >> >
> >> >> >> > >> > I think it is important that
> >> >> >> > >> >
> >> >> >> > >> > (1) the strings we have in the code are the same as those that appear
> >> >> >> > >> > in the HTML documentation (which we write in Sphinx).
> >> >> >> > >> >
> >> >> >> > >> > (2) the strings we have in the code are short (so they don't clutter
> >> >> >> > >> > up the code)
> >> >> >> > >>
> >> >> >> > >> I disagree. The whole idea of the documentation effort was to document
> >> >> >> > >> in one place
> >> >> >> > >> (using carefully handwritten and elaborate explanations including
> >> >> >> > >> examples and links to demos etc.) and code in another.
> >> >> >> > >> The comments in the code should be very short and precise such that
> >> >> >> > >> together with the class/function definition and type info the
> >> >> >> > >> developer can complete the task without looking elsewhere. These kinds
> >> >> >> > >> of comments, I expect, will look weird when put next to an elaborate
> >> >> >> > >> explanation on how the class/function works including all the bells
> >> >> >> > >> and whistles.
> >> >> >> > >>
> >> >> >> > >> > If we look at these two, it seems that (1) implies that we should
> >> >> >> > >> > write the documentation as part of the code and then extract it using
> >> >> >> > >> > some tool.
> >> >> >> > >> >
> >> >> >> > >> > But (2) prevents that since we don't want to constrain the
> >> >> >> > >> > documentation for all functions to be very short.
> >> >> >> > >> >
> >> >> >> > >> > How about the following solution.
> >> >> >> > >> >
> >> >> >> > >> > * Write short docstrings in the code
> >> >> >> > >> >
> >> >> >> > >> > * Auto-generate all the .rst input files for the Programmer's
> >> >> >> > >> >  Reference using a simple Python script that looks for '///'
> >> >> >> > >> >
> >> >> >> > >> > * The script looks at the code to generate the signature of the
> >> >> >> > >> >  function and the text that comes immediately after.
> >> >> >> > >>
> >> >> >> > >> This might be possible for a simple
> >> >> >> > >> 'change-order-of-comment-and-function' script where you manipulate the
> >> >> >> > >> output manually afterwards, but if you want to run this more than once
> >> >> >> > >> you will have to pick up nested class/struct definitions, templates and
> >> >> >> > >> all kinds of crap.
> >> >> >> > >> I tried to write a parser like this to check if all classes and
> >> >> >> > >> functions were documented, but gave up and let Doxygen do the dirty
> >> >> >> > >> work. (But do we want to do this just to generate 20 characters of
> >> >> >> > >> docstring automatically?)
> >> >> >> > >>
> >> >> >> > >> >  But it also looks in a hand-written .rst file that contains any
> >> >> >> > >> >  additional stuff we want to put below.
> >> >> >> > >> >
> >> >> >> > >> > So for the code example in the style manual, the things that get
> >> >> >> > >> > picked up from the code are
> >> >> >> > >> >
> >> >> >> > >> >  // Return the cell which is closest to the given point
> >> >> >> > >> >  uint closest_cell(const Point & point) const
> >> >> >> > >> >
> >> >> >> > >> > which gets converted to
> >> >> >> > >> >
> >> >> >> > >> > .. cpp:function:: uint closest_cell(const Point & point) const
> >> >> >> > >> >
> >> >> >> > >> >    Return the cell which is closest to the given point
> >> >> >> > >> >
> >> >> >> > >> > The script also looks in a file for "closest_cell" below which we have
> >> >> >> > >> > written all the *Arguments* stuff that will be thrown in below.
> >> >> >> > >> >
> >> >> >> > >> > Will that work?
> >> >> >> > >>
> >> >> >> > >> Yes, but the work flow is getting complex, and you'll need to know
> >> >> >> > >> what you get from the source code so you don't repeat yourself.
> >> >> >> > >> It is much easier to have the documentation in one place.
> >> >> >> > >>
> >> >> >> > >> > Another solution would be to just write everything as part of the
> >> >> >> > >> > code, and just add some settings to our editors that will fold the
> >> >> >> > >> > extra stuff away so we don't need to see it. Maybe that is the most
> >> >> >> > >> > robust solution?
> >> >> >> > >>
> >> >> >> > >> The general consensus the last time this issue came up was not to
> >> >> >> > >> clutter the code with documentation markup.
> >> >> >> > >>
> >> >> >> > >> Kristian
> >> >> >> > >
> >> >> >> > > I agree it's good to have the documentation in one place, but it would
> >> >> >> > > be good if we found a way to keep it in sync. Helper scripts can do
> >> >> >> > > some of that work, but we probably won't be able to pick up things
> >> >> >> > > like having
> >> >> >> > >
> >> >> >> > >  "Compute the number of neighbors"
> >> >> >> > >
> >> >> >> > > in one place and
> >> >> >> > >
> >> >> >> > >  "Return the number of neighbors"
> >> >> >> > >
> >> >> >> > > in other places. Things like this will creep in over time. It might
> >> >> >> > > not be a big issue but I find it a bit annoying.
> >> >> >> >
> >> >> >> > I see. A simpler approach, rather than generating docstrings, would be
> >> >> >> > to have a script that
> >> >> >> > simply looks for '///' comments in dolfin/mesh/Mesh.h and checks if the
> >> >> >> > EXACT same strings are present in
> >> >> >> > programmers-reference/cpp/mesh/Mesh.rst; if not, crash the test and let
> >> >> >> > the user figure out manually why it failed and which comment/docstring
> >> >> >> > should be changed.
> >> >> >> > This won't be completely bulletproof, but much much simpler than
> >> >> >> > parsing a C++ library.
> >> >> >>
> >> >> >> Yes, that might be a good solution.
> >> >> >>
> >> >> >> > I currently check if the docstrings of the documentation for the
> >> >> >> > Python interface are equal to the docstrings of the DOLFIN module after
> >> >> >> > import so that sort of works in the same way, only in this case I know
> >> >> >> > that the docstring I check belongs to function 'bar' of class 'foo'.
> >> >> >> >
> >> >> >> > Then we use the stub-generator that you have now to give us the first
> >> >> >> > set of *.rst files and then add the '///' comments check to the
> >> >> >> > verify_cpp_documentation.py script.
> >> >> >>
> >> >> >> It's almost there now, I just need to do some polishing.
> >> >> >>
> >> >> >> Sphinx is currently crashing when it generates the documentation from
> >> >> >> the .rst files I generate.
> >> >> >>
> >> >> >> Exception occurred:
> >> >> >>   File "/usr/lib/pymodules/python2.6/docutils/nodes.py", line 1898, in
> >> >> >>   dupname
> >> >> >>     node['names'].remove(name)
> >> >> >> ValueError: list.remove(x): x not in list
> >> >> >>
> >> >> >> Any ideas what this might be?
> >> >> >
> >> >> > Looks like this happens when there are multiple functions with the
> >> >> > same signature.
> >> >>
> >> >> Very likely, and that's probably because you need to extract 'const'
> >> >> information too, and that's just the tip of the iceberg if we proceed
> >> >> down this road....
> >> >
> >> > Try now.
> >> >
> >> > You need to set DOLFIN_DIR to the DOLFIN source tree.
> >> >
> >> > Then run
> >> >
> >> >  python utils/generate_cpp_doc.py
> >> >  make html
> >> >
> >> > The generated stuff is in {source/build}/programmers-reference/test/cpp
> >>
> >> OK, I'm just finishing a DOLFIN build to test the docstrings in the
> >> Python interface. Will test soon.
> >>
> >> > I'll be moving it to {source/build}/programmers-reference/cpp and make
> >> > sure not to overwrite the Mesh and Point class documentation that you
> >> > have written.
> >>
> >> There is no C++ documentation for Point, only for the Python interface
> >> and that was just to see how some of the autodoc functions worked.
> >> Anyway, we can always dig it up by reverting the repo to hack away.
> >
> > I noticed that. I just remember seeing something about the Point
> > class.
> >
> > Anyway, it seems to work now. What is missing is to generate the
> > index.rst files for each module.
>
> Looks pretty good to me. Do you need to generate the index.rst files?
> Can't you just add the output from 'ls *.h' in the modules to the
> index.rst files?
> Once we're finished editing the *.rst files you have generated we
> should be able to run the script verify_cpp_documentation.py which
> should tell us if we missed any.
> BTW, I'm done for today.

I think we should think hard on this one more time. Is it really that
bad to write the documentation as part of the code?

The stuff that you have written for the Mesh class could easily go
into Mesh.h without causing too much clutter (reST looks nice), and I
imagine it would be easy to add a folding mode to Emacs and other
editors that will hide all lines starting with /// except for the
first line.

The simple script I wrote seems to work pretty well to extract the
documentation. If it breaks somewhere, we could either improve the
script or learn to write the code so the script does not break.
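
A minimal sketch of this kind of extraction, for illustration only. It is
not the actual utils/generate_cpp_doc.py; the function names
extract_doc_entries and to_rst are made up here, and it assumes that each
run of '///' lines is followed by a declaration that fits on one line.
Nested classes, templates and multi-line signatures are not handled.

```python
def extract_doc_entries(header_text):
    """Pair each run of '///' comment lines with the declaration after it.

    Returns a list of (signature, doc_lines) tuples. Assumes one-line
    declarations (an assumption of this sketch, not of the real script).
    """
    entries = []
    doc = []
    for raw in header_text.splitlines():
        line = raw.strip()
        if line.startswith("///"):
            # Drop the marker and at most one space after it, so that any
            # deeper reST indentation in the comment is preserved.
            text = line[3:]
            if text.startswith(" "):
                text = text[1:]
            doc.append(text)
        elif doc and line:
            # First non-blank line after a comment block: treat it as the
            # signature, without a trailing ';' or '{'.
            signature = line.rstrip(";{").strip()
            entries.append((signature, doc))
            doc = []
    return entries


def to_rst(entries):
    """Render the entries as Sphinx cpp:function directives."""
    out = []
    for signature, doc in entries:
        out.append(".. cpp:function:: " + signature)
        out.append("")
        for text in doc:
            out.append(("   " + text).rstrip())
        out.append("")
    return "\n".join(out)


if __name__ == "__main__":
    header = """
    class Mesh
    {
    public:
      /// Return the cell which is closest to the given point
      uint closest_cell(const Point & point) const;
    };
    """
    print(to_rst(extract_doc_entries(header)))
```

With everything after the first '///' line kept in the header, the same
loop would carry the longer reST text into the .rst file unchanged.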

The point here is that now the generated .rst files are in sync with
the code, but in a day or two someone will edit one of the .h files in
DOLFIN and the documentation and code will start to diverge.
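
The check suggested earlier in the thread (look for '///' comments in the
header and require the exact same strings in the .rst file) could be
sketched roughly as below. This is an assumption about how such a check
might look, not code from verify_cpp_documentation.py, and
missing_docstrings is a hypothetical name.

```python
def missing_docstrings(header_text, rst_text):
    """Return the '///' comment strings that do not appear verbatim in rst_text."""
    missing = []
    for raw in header_text.splitlines():
        line = raw.strip()
        if not line.startswith("///"):
            continue
        comment = line[3:].strip()
        # Empty '///' lines carry no text to compare, so skip them.
        if comment and comment not in rst_text:
            missing.append(comment)
    return missing


if __name__ == "__main__":
    header = "/// Compute the number of neighbors\nuint num_neighbors() const;\n"
    rst = (".. cpp:function:: uint num_neighbors() const\n\n"
           "   Return the number of neighbors\n")
    for comment in missing_docstrings(header, rst):
        print("Not found in .rst file: " + comment)
```

A check like this only reports that a string has drifted; someone still
has to decide which of the two versions is the right one.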

--
Anders


