fenics team mailing list archive

Re: Docstrings etc

 

On 27 August 2010 09:39, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
>
>
> On 27/08/10 08:06, Kristian Ølgaard wrote:
>>
>> On 27 August 2010 08:54, Garth N. Wells<gnw20@xxxxxxxxx>  wrote:
>>>
>>>
>>> On 27/08/10 07:43, Anders Logg wrote:
>>>>
>>>> On Thu, Aug 26, 2010 at 10:34:02PM +0200, Kristian Ølgaard wrote:
>>>>>
>>>>> On 26 August 2010 22:13, Anders Logg<logg@xxxxxxxxx>    wrote:
>>>>>>
>>>>>> On Thu, Aug 26, 2010 at 10:09:16PM +0200, Kristian Ølgaard wrote:
>>>>>>>
>>>>>>> On 26 August 2010 22:04, Anders Logg<logg@xxxxxxxxx>    wrote:
>>>>>>>>
>>>>>>>> On Thu, Aug 26, 2010 at 09:34:01PM +0200, Kristian Ølgaard wrote:
>>>>>>>>>
>>>>>>>>> On 26 August 2010 20:35, Anders Logg<logg@xxxxxxxxx>    wrote:
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 26, 2010 at 08:16:41PM +0200, Anders Logg wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Aug 26, 2010 at 08:09:56PM +0200, Kristian Ølgaard wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 26 August 2010 19:51, Anders Logg<logg@xxxxxxxxx>    wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Aug 26, 2010 at 07:42:35PM +0200, Kristian Ølgaard
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 26 August 2010 18:22, Anders Logg<logg@xxxxxxxxx>    wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've thought some more on how to organize/synchronize the FEniCS
>>>>>>>>>>>>>>> documentation (in fenics-doc) with the documentation we have in
>>>>>>>>>>>>>>> the code.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think it is important that
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (1) the strings we have in the code are the same as those that
>>>>>>>>>>>>>>> appear in the HTML documentation (which we write in Sphinx).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (2) the strings we have in the code are short (so they don't
>>>>>>>>>>>>>>> clutter
>>>>>>>>>>>>>>> up the code)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I disagree. The whole idea of the documentation effort was to
>>>>>>>>>>>>>> document in one place (using carefully handwritten and
>>>>>>>>>>>>>> elaborate explanations including examples and links to demos
>>>>>>>>>>>>>> etc.) and code in another.
>>>>>>>>>>>>>> The comments in the code should be very short and precise such
>>>>>>>>>>>>>> that, together with the class/function definition and type
>>>>>>>>>>>>>> info, the developer can complete the task without looking
>>>>>>>>>>>>>> elsewhere. These kinds of comments, I expect, will look weird
>>>>>>>>>>>>>> when put next to an elaborate explanation of how the
>>>>>>>>>>>>>> class/function works, including all the bells and whistles.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we look at these two, it seems that (1) implies that we
>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>> write the documentation as part of the code and then extract
>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>> using
>>>>>>>>>>>>>>> some tool.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But (2) prevents that since we don't want to constrain the
>>>>>>>>>>>>>>> documentation for all functions to be very short.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> How about the following solution.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * Write short docstrings in the code
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * Auto-generate all the .rst input files for the Programmer's
>>>>>>>>>>>>>>>  Reference using a simple Python script that looks for '///'
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * The script looks at the code to generate the signature of
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>  function and the text that comes immediately after.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This might be possible for a simple
>>>>>>>>>>>>>> 'change-order-of-comment-and-function' script where you
>>>>>>>>>>>>>> manipulate the output manually afterwards, but if you want to
>>>>>>>>>>>>>> run this more than once you will have to pick up nested
>>>>>>>>>>>>>> class/struct definitions, templates and all kinds of crap.
>>>>>>>>>>>>>> I tried to write a parser like this to check if all classes and
>>>>>>>>>>>>>> functions were documented, but gave up and let Doxygen do the
>>>>>>>>>>>>>> dirty work. (But do we want to do this just to generate 20
>>>>>>>>>>>>>> characters of docstring automatically?)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  But it also looks in a hand-written .rst file that contains
>>>>>>>>>>>>>>> any
>>>>>>>>>>>>>>>  additional stuff we want to put below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So for the code example in the style manual, the things that
>>>>>>>>>>>>>>> get
>>>>>>>>>>>>>>> picked up from the code are
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  // Return the cell which is closest to the given point
>>>>>>>>>>>>>>>  uint closest_cell(const Point&    point) const
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> which gets converted to
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> .. cpp:function:: uint closest_cell(const Point& point) const
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    Return the cell which is closest to the given point
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The script also looks in a file for "closest_cell" below
>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>> we have
>>>>>>>>>>>>>>> written all the *Arguments* stuff that will be thrown in
>>>>>>>>>>>>>>> below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Will that work?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, but the work flow is getting complex, and you'll need to
>>>>>>>>>>>>>> know
>>>>>>>>>>>>>> what you get from the source code so you don't repeat
>>>>>>>>>>>>>> yourself.
>>>>>>>>>>>>>> It is much easier to have the documentation in one place.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Another solution would be to just write everything as part of
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> code, and just add some settings to our editors that will
>>>>>>>>>>>>>>> fold
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> extra stuff away so we don't need to see it. Maybe that is
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> most
>>>>>>>>>>>>>>> robust solution?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The general consensus the last time this issue came up was not
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>> clutter the code with documentation markup.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kristian
>>>>>>>>>>>>>
>>>>>>>>>>>>> I agree it's good to have the documentation in one place, but
>>>>>>>>>>>>> it
>>>>>>>>>>>>> would
>>>>>>>>>>>>> be good if we found a way to keep it in sync. Helper scripts
>>>>>>>>>>>>> can
>>>>>>>>>>>>> do
>>>>>>>>>>>>> some of that work, but we probably won't be able to pick up
>>>>>>>>>>>>> things
>>>>>>>>>>>>> like having
>>>>>>>>>>>>>
>>>>>>>>>>>>>  "Compute the number of neighbors"
>>>>>>>>>>>>>
>>>>>>>>>>>>> in one place and
>>>>>>>>>>>>>
>>>>>>>>>>>>>  "Return the number of neighbors"
>>>>>>>>>>>>>
>>>>>>>>>>>>> in other places. Things like this will creep in over time. It
>>>>>>>>>>>>> might
>>>>>>>>>>>>> not be a big issue but I find it a bit annoying.
>>>>>>>>>>>>
>>>>>>>>>>>> I see. A simpler approach, rather than generating docstrings,
>>>>>>>>>>>> would be to have a script that simply looks for '///' comments in
>>>>>>>>>>>> dolfin/mesh/Mesh.h and checks whether the EXACT same strings are
>>>>>>>>>>>> present in programmers-reference/cpp/mesh/Mesh.rst. If not, the
>>>>>>>>>>>> test fails and the user has to figure out manually why it failed
>>>>>>>>>>>> and which comment/docstring should be changed.
>>>>>>>>>>>> This won't be completely bulletproof, but much, much simpler than
>>>>>>>>>>>> parsing a C++ library.
>>>>>>>>>>>
>>>>>>>>>>> Yes, that might be a good solution.
>>>>>>>>>>>
>>>>>>>>>>>> I currently check if the docstrings of the documentation for the
>>>>>>>>>>>> Python interface are equal to the docstrings of the DOLFIN module
>>>>>>>>>>>> after import, so that sort of works in the same way, only in this
>>>>>>>>>>>> case I know that the docstring I check belongs to function 'bar'
>>>>>>>>>>>> of class 'foo'.
>>>>>>>>>>>>
>>>>>>>>>>>> Then we use the stub-generator that you have now to give us the
>>>>>>>>>>>> first set of *.rst files and then add the '///' comments check to
>>>>>>>>>>>> the verify_cpp_documentation.py script.
>>>>>>>>>>>
>>>>>>>>>>> It's almost there now, I just need to do some polishing.
>>>>>>>>>>>
>>>>>>>>>>> Sphinx is currently crashing when it generates the documentation
>>>>>>>>>>> from
>>>>>>>>>>> the .rst files I generate.
>>>>>>>>>>>
>>>>>>>>>>> Exception occurred:
>>>>>>>>>>>   File "/usr/lib/pymodules/python2.6/docutils/nodes.py", line 1898, in dupname
>>>>>>>>>>>     node['names'].remove(name)
>>>>>>>>>>> ValueError: list.remove(x): x not in list
>>>>>>>>>>>
>>>>>>>>>>> Any ideas what this might be?
>>>>>>>>>>
>>>>>>>>>> Looks like this happens when there are multiple functions with the
>>>>>>>>>> same signature.
>>>>>>>>>
>>>>>>>>> Very likely,  and that's probably because you need to extract
>>>>>>>>> 'const'
>>>>>>>>> information too, and that's just the tip of the iceberg if we
>>>>>>>>> proceed
>>>>>>>>> down this road....
>>>>>>>>
>>>>>>>> Try now.
>>>>>>>>
>>>>>>>> You need to set DOLFIN_DIR to the DOLFIN source tree.
>>>>>>>>
>>>>>>>> Then run
>>>>>>>>
>>>>>>>>  python utils/generate_cpp_doc.py
>>>>>>>>  make html
>>>>>>>>
>>>>>>>> The generated stuff is in
>>>>>>>> {source/build}/programmers-reference/test/cpp
>>>>>>>
>>>>>>> OK, I'm just finishing a DOLFIN build to test the docstrings in the
>>>>>>> Python interface. Will test soon.
>>>>>>>
>>>>>>>> I'll be moving it to {source/build}/programmers-reference/cpp and
>>>>>>>> make
>>>>>>>> sure not to overwrite the Mesh and Point class documentation that
>>>>>>>> you
>>>>>>>> have written.
>>>>>>>
>>>>>>> There is no C++ documentation for Point, only for the Python
>>>>>>> interface
>>>>>>> and that was just to see how some of the autodoc functions worked.
>>>>>>> Anyway, we can always dig it up by reverting the repo to hack away.
>>>>>>
>>>>>> I noticed that. I just remember seeing something about the Point
>>>>>> class.
>>>>>>
>>>>>> Anyway, it seems to work now. What is missing is to generate the
>>>>>> index.rst files for each module.
>>>>>
>>>>> Looks pretty good to me. Do you need to generate the index.rst files?
>>>>> Can't you just add the output from 'ls *.h' in the modules to the
>>>>> index.rst files?
>>>>> Once we're finished editing the *.rst files you have generated we
>>>>> should be able to run the script verify_cpp_documentation.py which
>>>>> should tell us if we missed any.
>>>>> BTW, I'm done for today.
>>>>
>>>> I think we should think hard on this one more time. Is it really that
>>>> bad to write the documentation as part of the code?
>>>>
>>>
>>> It's good to have it in the code as long as it's not too long and not
>>> full of markup. The thing I look for most are function declarations, so
>>> I find it annoying when I can't find a declaration for all the markup.
>>> It's also hard to get an overview of a class when only a few
>>> declarations fit on the screen amongst the markup with funny symbols.
>>
>> The markup that we plan to use will be pretty simple (see the source
>> for the C++ Mesh.rst), but it will add a lot of extra lines to the
>> source code.
>> I too would find this annoying.
>>
>>> I do like long docstrings in Python. Because the argument list is not
>>> statically typed and there's more magic in Python, a good docstring is
>>> essential.
>>>
>>> Garth
>>>
>>>> The stuff that you have written for the Mesh class could easily go in
>>>> to Mesh.h without causing too much clutter (reST looks nice), and I
>>>> imagine it would be easy to add a folding mode to Emacs and other
>>>> editors that will hide all lines starting with /// except for the
>>>> first line.
>>>>
>>>> The simple script I wrote seems to work pretty well to extract the
>>>> documentation. If it breaks somewhere, we could either improve the
>>>> script or learn to write the code so the script does not break.
>>>>
>>>> The point here is that now the generated .rst files are in sync with
>>>> the code, but in a day or two someone will edit one of the .h files in
>>>> DOLFIN and the documentation and code will start to diverge.

On second thought, what do you mean by diverge?
I have test scripts in place that check whether a function in *.h is
documented in *.rst, and whether a function in *.rst is still present
in *.h.

If you mean the docstrings might change, we can perform an additional
check that tests whether the one-line docstring in *.h is present in
the documentation in *.rst. Then there can be no divergence, and we can
have short comments in the DOLFIN source code.
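
A minimal sketch of that additional check, not the actual
verify_cpp_documentation.py. It assumes that the one-liner is the first
line of each '///' comment block and that it must appear verbatim
(ignoring line wrapping) in the .rst text. The function names and the
num_neighbors declaration are hypothetical and only there for
illustration.

```python
def extract_brief_comments(header_text):
    """Return the first line of each '///' comment block in a C++ header."""
    briefs = []
    in_block = False
    for line in header_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("///"):
            if not in_block:
                text = stripped[3:].strip()
                if text:
                    briefs.append(text)
            in_block = True
        else:
            in_block = False
    return briefs


def missing_briefs(header_text, rst_text):
    """Return the one-liners from the header that are absent from the .rst."""
    # Collapse whitespace so that line wrapping in the .rst file does not
    # produce false alarms.
    normalized_rst = " ".join(rst_text.split())
    return [brief for brief in extract_brief_comments(header_text)
            if " ".join(brief.split()) not in normalized_rst]


# Hypothetical header and .rst contents (normally read from files).
HEADER = """
class Mesh
{
public:
  /// Return the cell which is closest to the given point
  uint closest_cell(const Point& point) const;

  /// Compute the number of neighbors
  uint num_neighbors() const;
};
"""

RST = """
.. cpp:function:: uint closest_cell(const Point& point) const

   Return the cell which is closest to the given point

.. cpp:function:: uint num_neighbors() const

   Return the number of neighbors
"""

for brief in missing_briefs(HEADER, RST):
    print("Not found in .rst: '%s'" % brief)
```

The "Compute" vs. "Return" mismatch is reported, while the matching
closest_cell comment passes. The check only compares strings, so no
parsing of the C++ code is needed.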

>> Yes, but this problem is already there for the Python interface and it
>> won't go away.
>> I guess the key thing to this is that a new feature or a change in
>> DOLFIN source code is not complete until the documentation has been
>> updated.
>>
>
> To save ourselves work for now, we could just let doxygen create the C++
> programmers reference and provide a link to it. It doesn't seem very
> sensible that we write our own parser to document the C++ code. With
> doxygen, we also get class diagrams. We can then scan the doxygen
> documentation for each class and improve it iteratively.

Do you mean improve the Doxygen output, or the source code (*.h
files)? If we improve the output, the docs and the code can diverge.

I had a look at the docs that Breathe generates from Doxygen. They
don't look that great, and we won't have all the links from the index
page to the classes.

Kristian

> Garth
>
>> Kristian
>>
>>>> --
>>>> Anders
>>>>
>>>> _______________________________________________
>>>> Mailing list: https://launchpad.net/~fenics
>>>> Post to     : fenics@xxxxxxxxxxxxxxxxxxx
>>>> Unsubscribe : https://launchpad.net/~fenics
>>>> More help   : https://help.launchpad.net/ListHelp
>>>
>


