
Re: Generation of docstring module

 

On Tue, Sep 07, 2010 at 08:59:32AM -0700, Johan Hake wrote:
> [snip]
>
> > > But how do we extract the different arguments? I suppose this is
> > > collected by Doxygen, and we just need to parse these and output them in
> > > a correct way?
> >
> > I don't think we need to parse the arguments and output them. We just
> > get the function name, and if we have more than one set of arguments,
> > i.e., a different signature, we know that we have an overloaded method
> > and how to handle it.
>
> And I guess the almighty generate_cpp_documentation.py script is able to
> extract the argument information?

It is a very simple script that just reads the header files line by
line, interprets lines starting with "///" as comments and the
following line(s) as a function signature (or class) until it finds
"{" or ";".

Getting the arguments would just be

  [arg.strip() for arg in signature.split("(")[1].split(")")[0].split(",")]

My point is that writing the documentation extraction script is not
exactly rocket science.

(Someone more well-versed than me in regexp can probably make some
improvements but 'split' goes a long way.)
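
For illustration, the whole thing could look roughly like this (an
untested sketch of the idea, not the actual script; the header path is
just an example):

  def extract_docs(header_file):
      "Yield (comment, signature) pairs from a DOLFIN-style header."
      comment, signature = [], []
      for line in open(header_file):
          stripped = line.strip()
          if stripped.startswith("///"):
              comment.append(stripped[3:].strip())
          elif comment:
              # Everything after a '///' block belongs to the signature
              # until we hit '{' or ';'.
              signature.append(stripped)
              if "{" in stripped or ";" in stripped:
                  yield " ".join(comment), " ".join(signature)
                  comment, signature = [], []

Spotting overloaded methods is then just a matter of grouping the
signatures by name:

  from collections import defaultdict

  overloads = defaultdict(list)
  for comment, signature in extract_docs("dolfin/mesh/Mesh.h"):
      name = signature.split("(")[0].split()[-1]
      overloads[name].append(signature)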

> > The arguments should be described in the *Arguments* section of the
> > individual docstring with links to classes formatted like
> > _MeshEntity_, which we will substitute with :py:class:`MeshEntity` or
> > :cpp:class:`MeshEntity` depending on which interface we document.
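
Just to illustrate, that substitution could be a simple regexp
(hypothetical helper, not part of any script yet):

  import re

  def resolve_links(docstring, interface="python"):
      "Replace _ClassName_ with the proper Sphinx cross-reference role."
      role = ":py:class:" if interface == "python" else ":cpp:class:"
      return re.sub(r"_([A-Z]\w*)_", role + r"`\1`", docstring)

  # resolve_links("See _MeshEntity_ for details.")
  # -> "See :py:class:`MeshEntity` for details."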
>
> Ok, but we only want that once for each method in Python, even if it is
> overloaded?
>
> > Although I just realized that standard C++ stuff like double*, which
> > ends up as numpy.array etc., should probably be handled.
>
> Yes, this part I am a little worried about... But maybe a good handwritten
> lookup table will do the trick? At least for 99% of the cases ;)

I think so too.
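
Something along these lines, perhaps (the entries are made up, just to
show the idea):

  cpp_to_python = {
      "double*":     "numpy.array(float)",
      "int*":        "numpy.array(int)",
      "uint":        "int",
      "std::string": "str",
  }

  def map_argument_type(cpp_type):
      "Map a C++ argument type to what we show in the Python docstring."
      cleaned = cpp_type.replace("const", "").replace("&", "").strip()
      return cpp_to_python.get(cleaned, cleaned)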

--
Anders


> > On a related note:
> > int some_func()
> > and
> > const int some_func() const
> > are different in C++, but in Python we don't have const, right?
> > This will simplify the documentation a lot.
>
> Yes, we tend to %ignore all const versions of different methods.
>
> [snap]
>
> > >> >  * Extended methods need to be handled in one of three ways:
> > >> >    1) Write the docstring directly into the foo_post.i file
> >
> > I like this option: if this is where we have the code for a function,
> > then this is where the docstring should be, as it increases the
> > probability of the docstring being up to date.
>
> Ok, let's settle on this one. We also need to make sure that all %extended
> methods in the C++ layer get a proper docstring. However, I am not really sure
> how this can be done :P
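
One possibility (just a sketch, nothing implemented) would be a small
check that lists everything in the generated module which ends up
without a docstring, so missing %extend docs at least show up:

  import inspect
  import dolfin.cpp as cpp

  undocumented = []
  for cls_name, cls in inspect.getmembers(cpp, inspect.isclass):
      for name, member in inspect.getmembers(cls, inspect.isroutine):
          if not (member.__doc__ or "").strip():
              undocumented.append("%s.%s" % (cls_name, name))
  print("\n".join(sorted(undocumented)))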
>
> [snup]
>
> > > Why do we need to assign to these methods? They already get their
> > > docstrings from the docstrings.i file. However, if we want to get rid of
> > > the new_instancemethod assignment above, we can just remove the
> >
> > Some history.
> > Initially, we wanted to have all docstrings separated from the DOLFIN
> > code and collected in the fenics-doc module. The easiest way to get
> > the >>> help(dolfin) docstring correct is to assign to __doc__
> > dynamically.
> > If we could do this we wouldn't even need the docstrings.i file and
> > things would be simple.
> > However, we discovered that this was not possible, and because of that
> > we still need to generate the docstrings.i file.
> > Then, still assuming we wanted to separate docs from code and keeping
> > docstrings in fenics-doc, I thought it would be easier to generate the
> > docstrings.i file from the handwritten docstrings module in
> > fenics-doc.
> > Some methods don't get their docstrings from the docstrings.i file
> > though, so we still need to assign to __doc__, which is the easiest
> > thing to do.
> > Just recently we decided to extract the docstrings from the C++
> > implementation, thus moving the docs back into DOLFIN. This makes the
> > docstrings module almost superfluous, with the only practical use
> > being to have documentation for the extended methods defined in the
> > _post.i files, but if we put the docstrings directly in the _post.i
> > files we no longer need it.
>
> Ok, then I do not see any reason for a separate docstring module, which makes
> life a little bit easier...
>
> [snep]
>
> > > I am confused. Do you suggest that we just document the extended Python
> > > layer directly in the Python module as it is today? Why should we then
> > > dump the docstrings in a separate docstring module? So autodoc can have
> > > something to chew on? Couldn't autodoc just chew on the dolfin module
> > > directly?
> >
> > I'm confused too. :) I guess my head has not been properly reset
> > between the changes in documentation strategies.
> > The Sphinx autodoc can only handle one dolfin module, so we need to
> > either import the 'real' one or the docstrings dolfin module.
> > If we can completely remove the need for the docstrings module, then
> > we should of course include the 'real' one.
>
> Ok!
>
> > >> Then programmers writing the Python
> > >> layer just need to document while they're coding, where they are
> > >> coding, just like they do (or should anyway) for the C++ part.
> > >
> > > Still confused why we need a certain docstring module.
> >
> > Maybe we don't.
> >
> > >> >  2) for the extended Python layer in the cpp.py
> > >> >
> > >> > For the rest, and this will be the main part, we rely on parsed
> > >> > docstrings from the headers.
> > >> >
> > >> > The Python programmer's reference will then be generated based on the
> > >> > actual dolfin module using Sphinx and autodoc.
> > >>
> > >> We could/should probably use either the dolfin module or the generated
> > >> docstring module to generate the relevant reST files. Although we
> > >> might need to run some cross-checks with the Doxygen XML to get the
> > >> correct file names where the classes are defined in DOLFIN, such that
> > >> we retain the original DOLFIN source tree structure. Otherwise all our
> > >> documentation will end up in cpp.rst, which I would hate to navigate
> > >> through as a user.
> > >
> > > This one got too technical for me. Are you saying that there is no way to
> > > split the documentation into smaller parts without relying on the C++
> > > module/file structure?
> >
> > But how would you split it?
>
> I do not know. But then I do not know what the generation step can take as
> different inputs.
>
> > It makes sense to keep the classes Mesh
> > and MeshEntity in the mesh/ part of the documentation. Unfortunately,
> > SWIG doesn't add info to the classes in the cpp.py module about where
> > they were originally defined. This is why we need to pair it with info
> > from the XML output.
>
> Ok, but say we keep all documentation in one module. If you are able to pair
> the different classes or functions with a module name or file name, you are
> able to create documentation which is structured after this hierarchy?
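
For illustration, getting that pairing out of the Doxygen XML could be
as simple as something like this (untested sketch; it assumes the XML
output sits in doc/xml/):

  import os
  from xml.etree import ElementTree

  def class_locations(xml_dir="doc/xml"):
      "Map class name -> header file as recorded by Doxygen."
      locations = {}
      index = ElementTree.parse(os.path.join(xml_dir, "index.xml"))
      for compound in index.findall("compound"):
          if compound.get("kind") != "class":
              continue
          name = compound.findtext("name")            # e.g. "dolfin::Mesh"
          refid = compound.get("refid")
          tree = ElementTree.parse(os.path.join(xml_dir, refid + ".xml"))
          location = tree.find(".//compounddef/location")
          if location is not None:
              locations[name.split("::")[-1]] = location.get("file")
      return locations

  # {"Mesh": "dolfin/mesh/Mesh.h", ...} -> generate one reST file per
  # subdirectory (mesh/, la/, ...) instead of one big cpp.rst.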
>
> > >> I vote for using the generated docstrings module for the documentation
> > >> since it should contain all classes even if some HAS_* was not
> > >> switched on, which brings me to the last question: how do we handle
> > >> the case where some ifdefs result in classes not being generated in
> > >> cpp.py? They should still be documented, of course.
> > >
> > > I think we are fine if the server that generates the documentation has all
> > > optional packages, so the online documentation is fully up to date.
> >
> > Maybe, but I think I saw somewhere that depending on the ifdefs some
> > names would be different, and we need the documentation to be complete
> > regardless of the user's installation.
>
> Yes, I think this is most relevant for the different la backends.
>
> > >> Another issue we need to handle is any example code in the C++ docs
> > >> which must be translated into Python syntax. Either automatically, or
> > >> by some lookup in a dictionary, but that brings us right back to
> > >> something < 100% automatic.
> > >
> > > Would it be possible to have just pointers to demos instead of example
> > > code? I know it is common to have example code in Python docstrings, but
> > > I do not think it is equally common to have this in C++ header files.
> >
> > Since when did we care about what is common in FEniCS? :) I think small
> > input/output examples are good even for C++; look at the Mesh class,
> > for instance.
>
> Ok, but to put it another way: it does look quite funny with a Python example
> in the C++ header. But if we start putting these in their own lookup file we
> are back again to the separate file we just abandoned. We could maybe add some
> more markup for C++ and Python examples directly in the header?
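
For instance something like this; the markup is completely hypothetical,
and the helper is an untested sketch of how the extraction script could
pick the right block for each interface:

  # Hypothetical markup, as the extraction script would see it after
  # stripping the leading '///':
  comment = [
      '*Example (C++)*',
      '    Mesh mesh("unitsquare.xml");',
      '    info("%d cells", mesh.num_cells());',
      '',
      '*Example (Python)*',
      '    mesh = Mesh("unitsquare.xml")',
      '    print mesh.num_cells()',
  ]

  def filter_examples(lines, interface):
      "Drop example blocks that are not tagged with the requested interface."
      kept, keep_block = [], True
      for line in lines:
          if line.startswith("*Example"):
              keep_block = ("(%s)" % interface) in line
              continue
          if keep_block:
              kept.append(line)
      return kept

  # filter_examples(comment, "Python") keeps only the Python snippet.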
>
> [snop] sorry ran out of vowels...
>
> > > Which should serve us well.
> >
> > OK, looks reasonable. So it might be possible to do this after all.
>
> Well, I think we have to let the almighty script have a go first ;)
>
> Johan
>
