
fenics team mailing list archive

Re: Generation of docstring module

 

On 7 September 2010 12:37, Anders Logg <logg@xxxxxxxxx> wrote:
> On Tue, Sep 07, 2010 at 12:20:09PM +0200, Kristian Ølgaard wrote:
>> On 7 September 2010 11:04, Anders Logg <logg@xxxxxxxxx> wrote:
>> > On Mon, Sep 06, 2010 at 05:56:13PM +0200, Kristian Ølgaard wrote:
>> >> On 6 September 2010 17:24, Johan Hake <johan.hake@xxxxxxxxx> wrote:
>> >> > On Monday September 6 2010 08:13:44 Anders Logg wrote:
>> >> >> On Mon, Sep 06, 2010 at 08:08:10AM -0700, Johan Hake wrote:
>> >> >> > On Monday September 6 2010 05:47:27 Anders Logg wrote:
>> >> >> > > On Mon, Sep 06, 2010 at 12:19:03PM +0200, Kristian Ølgaard wrote:
>> >> >> > > > > Do we have any functionality in place for handling documentation
>> >> >> > > > > that should be automatically generated from the C++ interface and
>> >> >> > > > > documentation that needs to be added later?
>> >> >> > > >
>> >> >> > > > No, not really.
>> >> >> > >
>> >> >> > > ok.
>> >> >> > >
>> >> >> > > > > I assume that the documentation we write in the C++ header files
>> >> >> > > > > (like Mesh.h) will be the same that appears in Python using
>> >> >> > > > > help(Mesh)?
>> >> >> > > >
>> >> >> > > > Yes and no; the problem is that, for instance, overloaded methods will
>> >> >> > > > only show the last docstring.
>> >> >> > > > So, the Mesh.__init__.__doc__ will just contain the Mesh(std::str
>> >> >> > > > file_name) docstring.
>> >> >> > >
>> >> >> > > It would not be difficult to make the documentation extraction script
>> >> >> > > we have (in fenics-doc) generate the docstrings module and just
>> >> >> > > concatenate all constructor documentation. We are already doing the
>> >> >> > > parsing, so spitting out class Foo: """ etc. would be easy. Perhaps that
>> >> >> > > is an option.
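For illustration, a hypothetical snippet of what such a generated docstrings
module could look like, with the overloaded constructor documentation
concatenated into a single __init__ docstring (the class and signatures below
are made-up examples, not the actual generated output):

    # Hypothetical excerpt of an auto-generated docstrings module.
    # Overloaded constructor docs are concatenated into one docstring.
    class Mesh:
        """A mesh consisting of cells, facets, edges and vertices."""

        def __init__(self):
            """Create a mesh.

            Mesh()
                Create an empty mesh.

            Mesh(filename)
                Create a mesh from a data file, e.g. Mesh("mesh.xml").
            """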
>> >> >> >
>> >> >> > There might be other overloaded methods too. We might try to settle on a
>> >> >> > format for these methods, or make this part of the 1% we need to handle
>> >> >> > ourselves.
>> >> >>
>> >> >> ok. Should also be fairly easy to handle.
>> >> >
>> >> > Ok.
>> >> >
>> >> >> > > > > But in some special cases, we may want to go in and handle
>> >> >> > > > > documentation for special cases where the Python documentation
>> >> >> > > > > needs to be different from the C++ documentation. So there should
>> >> >> > > > > be two different sources for the documentation: one that is
>> >> >> > > > > generated automatically from the C++ header files, and one that
>> >> >> > > > > overwrites or adds documentation for special cases. Is that the
>> >> >> > > > > plan?
>> >> >> > > >
>> >> >> > > > The plan is currently to write the docstrings by hand for the entire
>> >> >> > > > dolfin module. One of the reasons is that we rename/ignore
>> >> >> > > > functions/classes in the *.i files, and if we try to automate the
>> >> >> > > > docstring generation I think we should make it fully automatic, not
>> >> >> > > > just part of it.
>> >> >> > >
>> >> >> > > If we can make it 99% automatic and have an extra file with special
>> >> >> > > cases I think that would be ok.
>> >> >> >
>> >> >> > Agree.
>> >>
>> >> Yes, but we'll need some automated testing to make sure that the 1%
>> >> does not go out of sync with the code.
>> >> Most likely the 1% can't be handled automatically because it is
>> >> relatively important (definitions in *.i files etc.).
>> >
>> > I imagine that "1%" will be the same as the "1%" that we have special
>> > treatment for in the SWIG files anyway, so it makes sense that those need
>> > special treatment.
>>
>> I think that we can automate that last 1% too.
>>
>> > So the idea would be:
>> >
>> >  1. Document the C++ code in the C++ header files
>> >  2. Document the extra Python code in the Python files (?)
>> >  3. Document the extra SWIG stuff in a special file
>>
>> All Python docstrings should be located where the code is.
>> In the Python layer (like dolfin/fem.py), or in the extended methods
>> in the *.i files for the dolfin/cpp.py module.
>>
>> We then need to figure out how to change the syntax/name correctly
>> such that std::vector, double* etc. are mapped to the correct Python
>> arguments/return values, and how to handle the *example* code.
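As a rough illustration of the kind of syntax/name mapping meant here,
something like the following could translate C++ types in the extracted
docstrings to what a Python user sees (the table entries and helper are
hypothetical, not existing dolfin code):

    # Hypothetical mapping from C++ type names in extracted docstrings to the
    # names a Python user actually sees; entries are examples only.
    cpp_to_python = {
        "std::string": "str",
        "std::vector<double>": "numpy.array(float)",
        "double*": "numpy.array(float)",
        "bool": "bool",
    }

    def map_types(docstring):
        """Replace C++ type names in a docstring with Python equivalents."""
        for cpp_name, py_name in cpp_to_python.items():
            docstring = docstring.replace(cpp_name, py_name)
        return docstring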
>>
>> >> >> > > > Also, we will need to change the syntax in all *example* code of the
>> >> >> > > > docstrings. Maybe it can be done, but I'll need to give it some more
>> >> >> > > > careful thought. We've already changed the approach a few times now,
>> >> >> > > > so I would really like the next try to be close to our final
>> >> >> > > > implementation.
>> >> >> > >
>> >> >> > > I agree. :-)
>> >> >> > >
>> >> >> > > > > Another thing to discuss is the possibility of using Doxygen to
>> >> >> > > > > extract the documentation. We currently have our own script since
>> >> >> > > > > (I assume) Doxygen does not have a C++ --> reST converter. Is that
>> >> >> > > > > correct?
>> >> >> > > >
>> >> >> > > > I don't think Doxygen has any such converter, but there exists a
>> >> >> > > > project, http://github.com/michaeljones/breathe,
>> >> >> > > > which makes it possible to use xml output from Doxygen in much the
>> >> >> > > > same way as we use autodoc for the Python module. I had a quick go at
>> >> >> > > > it but didn't like the result. No links on the index pages to
>> >> >> > > > functions etc. So what we do now is better, but perhaps it would be a
>> >> >> > > > good idea to use Doxygen to extract the docstrings for all classes
>> >> >> > > > and functions. I tried parsing the xml output in the
>> >> >> > > > test/verify_cpp_documentation.py script, and it should be relatively
>> >> >> > > > simple to get the docstrings since these are stored as attributes of
>> >> >> > > > classes/functions.
>> >> >> > >
>> >> >> > > Perhaps an idea would be to use Doxygen for parsing and then have our
>> >> >> > > own script that works with the XML output from Doxygen?
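A minimal sketch of that idea, assuming Doxygen's usual compounddef/memberdef
XML layout (the file name and element handling below are illustrative, not a
worked-out extractor):

    # Minimal sketch: pull member names and brief descriptions out of a
    # Doxygen-generated XML file (e.g. xml/classdolfin_1_1Mesh.xml).
    import xml.etree.ElementTree as ET

    def extract_docstrings(xml_file):
        tree = ET.parse(xml_file)
        docs = {}
        for member in tree.iter("memberdef"):
            name = member.findtext("name")
            brief = member.find("briefdescription")
            text = "".join(brief.itertext()).strip() if brief is not None else ""
            docs[name] = text
        return docs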
>> >> >> >
>> >> >> > I did not know we already used Doxygen to extract information about
>> >> >> > class structure from the headers.
>> >> >>
>> >> >> I thought it was you who implemented the Doxygen documentation extraction?
>> >> >
>> >> > Duh... I mean that I did not know we used it in fenics_doc, in
>> >> > verify_cpp_documentation.py.
>> >>
>> >> We don't. I wrote this script to be able to test the documentation in
>> >> *.rst files against dolfin.
>> >> Basically, I parse all files and keep track of the classes/functions
>> >> which are defined in dolfin and try to match those up against the
>> >> definitions in the documentation (and vice versa) to catch
>> >> missing/obsolete documentation.
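Roughly, that cross-check amounts to comparing two sets of names; a simplified
sketch (the function below is a placeholder for the actual logic in
verify_cpp_documentation.py):

    # Simplified sketch of the consistency check: compare names defined in the
    # dolfin headers against names found in the *.rst documentation.
    def check_documentation(defined_names, documented_names):
        missing = defined_names - documented_names    # in dolfin, not documented
        obsolete = documented_names - defined_names   # documented, not in dolfin
        for name in sorted(missing):
            print("Missing documentation: %s" % name)
        for name in sorted(obsolete):
            print("Obsolete documentation: %s" % name)
        return not missing and not obsolete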
>> >>
>> >> >> > What are the differences between using the XML from Doxygen to also
>> >> >> > extract the documentation, and the approach we use today?
>> >> >>
>> >> >> Pros (of using Doxygen):
>> >> >>
>> >> >>   - Doxygen is developed by people that presumably are very good at
>> >> >>     extracting docs from C++ code
>> >> >>
>> >> >>   - Doxygen might handle some corner cases we can't handle?
>> >>
>> >> Definitely, and we don't have to maintain it.
>> >
>> > We would need to maintain the script that extracts data from the
>> > Doxygen-generated XML files.
>> >
>> >> >> Cons (of using Doxygen):
>> >> >>
>> >> >>   - Another dependency
>> >> >
>> >> > Which we already have.
>> >> >
>> >> >>   - We still need to write a script to parse the XML
>> >> >
>> >> > We should be able to use the xml parser in docstringgenerator.py.
>> >> >
>> >> >>   - The parsing of /// stuff from C++ code is very simple
>> >> >
>> >> > Yes, and this might be just fine. But if it grows we might consider using
>> >> > Doxygen.
>> >>
>> >> But some cases are already not handled correctly (nested classes etc.),
>> >> so I vote for Doxygen.
>> >
>> > Not that I'm insisting on not using Doxygen, but isn't it quite rare
>> > that we use nested classes? I think we decided at some point that we
>> > wanted to avoid them for some other reason. I don't remember which, but
>> > it might have been a SWIG problem.
>>
>> Look at http://www.fenics.org/newdoc/programmers-reference/cpp/function/Function.html
>> As a user, I would be confused by LocalScratch and GatherScratch.
>
> Those can be easily fixed by letting the script stop parsing when it
> finds "private:".

OK, and if we are sure that no other nested classes are present in
DOLFIN I guess things should be fine.
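For illustration, the kind of /// extraction being discussed, with the
suggested stop at "private:", might look roughly like this (a hypothetical
sketch, not the actual generate_cpp_documentation.py code):

    # Hypothetical sketch: collect /// comments from a C++ header, stopping
    # at the private section so nested helper classes are not picked up.
    def extract_comments(header_file):
        comments = []
        for line in open(header_file):
            stripped = line.strip()
            if stripped.startswith("private:"):
                break
            if stripped.startswith("///"):
                comments.append(stripped[3:].strip())
        return comments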

>> The documentation here is also rather confusing; yes, we can fix it,
>> but similar cases will arise in the future.
>>
>> http://www.fenics.org/newdoc/programmers-reference/cpp/mesh/MeshPrimitive.html
>
> That looks strange because Andre has used an arbitrary mix of "//" and
> "///" in his comments. Don't blame my script for that. :-)

Alright alright, I'll never question the almighty
generate_cpp_documentation.py script again. :)

In light of the above and the Doxygen line break issue, maybe it's
best to use your script as a first try?
We just need to break it up into parsing (intermediate representation),
modifying (C++ and Python syntax) and writing (dump into the respective
folders in the documentation) stages, and settle on the intermediate
representation so that we can easily switch to a Doxygen parser in case
we decide to.
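A rough skeleton of how such a staged layout and intermediate representation
could look (all names below are made up for illustration, not an agreed-on
design):

    # Rough skeleton of the three stages around a simple intermediate
    # representation; all names are illustrative.
    class FunctionDoc:
        def __init__(self, name, signature, docstring):
            self.name = name
            self.signature = signature
            self.docstring = docstring

    class ClassDoc:
        def __init__(self, name, docstring, members):
            self.name = name
            self.docstring = docstring
            self.members = members  # list of FunctionDoc

    def parse(header_files):
        """Stage 1: build the intermediate representation from the C++
        headers (could later be swapped for a Doxygen XML based parser)."""
        return []  # list of ClassDoc

    def modify(classes, language):
        """Stage 2: adapt syntax and names for 'cpp' or 'python' output."""
        return classes

    def write(classes, output_dir):
        """Stage 3: dump the documentation into the respective folders."""
        pass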

Kristian

> --
> Anders
>


