-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make all docstrings Sphinx readable #84
Comments
@appukuttan-shailesh This is the issue I was referring to in the call. We can coordinate here unless you have an open issue that you would prefer to use. I guess the first thing to do is coordinate on a docstring format? What does HBP use internally? |
@rgerkin : Sure, we can use ticket this to coordinate our ideas. sphinx.ext.napoleon is a handy sphinx extension for use with this format. Like I mentioned on the call, I have been thinking of making a custom Sphinx extension to help streamline the documentation of sciunit based validation test units. From what I have explored recently, this might not be very trivial a task (mainly because I lack any experience with this). But I still intend to give it a go, and see if it seems worth pursuing. Shall keep you posted here (though progress might be a bit slow). |
What do you mean by "streamline the documentation of sciunit based validation test units." I prefer Google docstring style -- do you think it will be important that we all use the same format? Would it be helpful to use type hints? I could imagine that these could be exploited to avoid specifying expected types in the docstrings themselves, as well as for unit testing and other purposes. |
My intention is to have a Sphinx extension managing SciUnit entities such as Capabilities, Test, Test Suites, Models, Scores etc by adding SciUnit specific 'directives'. For lack of a better example at the moment... something similar to sphinxcontrib.httpdomain that make it simpler and more efficient to document REST APIs. I don't think it is necessary to enforce a single docstring style, and it might actually be wise to allow developers that freedom of choice. AFAIK, the compiled output would appear the same, and that's what matters. I was actually not aware of 'type hints', and so started reading up on it following your post. I do agree that it might be useful. Sphinx certainly can exploit this via extensions such as https://pypi.org/project/sphinx-autodoc-typehints/. Let's go ahead with this. |
On Thursday we will have a lab hackathon and one or more of us will work on updating our docstrings. There are also some attributes that could be used with some of the directives you propose, for example I have attributes for |
@appukuttan-shailesh If you aren't doing anything late Thursday (in your time zone; we will start Thursday morning in ours) you should join us. |
Sounds interesting! Can you tell me about the schedule for Thursday? We are in the middle of pretty disruptive transportation protests here in Paris, and so very few options for commuting to work. It gets particularly bad on the way back. So I will try to join in, maybe atleast for a bit, based on the timing. |
We will be working from 9am-4pm in our time zone, which is 5pm-midnight for you. No set schedule yet, as we usually focus on a few issues (like this one) and discuss them and then start working. If you are around near the beginning maybe we could have a Skype or Hangouts call to discuss. |
Sure, sounds good! I think I can be around for the first 1-1.5 hours at the least. We could use hangouts like for our Monday conference calls. |
@appukuttan-shailesh @ChihweiLHBird So my current understanding, based on the last HBP call, is that you intend to build a sphinx extension which will create new docstrings from a combination of A) original docstrings in the .py files and B) automatic analysis of attributes, methods, etc. of the corresponding classes. These new docstrings would not be inserted into code anywhere, but they would be the basis of the html documentation. So someone going to e.g. readthedocs would see the new, rich docstring, while a developer working with the code would still see the simpler docstring in (A). If that is so, then we should try to define what belongs in (A), and assume that everything else can either be computed by (B), or is unimportant. In some cases, something might in principle be computable by (B), but we would still want a developer browsing the code to see it in (A) anyway, because it so essential. So we could further break this down into:
If we populate these 4 categories with various kinds of information, that should tell us what the scope of the extension should be, provides guidelines for writing SciUnit classes, and provides guidelines for filling in manual docstrings. |
That's a good idea. I agree that the above setup will help in identifying what needs to be documented, and also how and where. You are correct about:
But after exploring a bit, I think we might infact not have to undertake the more arduous task of developing a Sphinx extension, but rather can use the concept of |
Creating a separate post to classify the documentation bits into categories you proposed above. Type A: Things that must go into the original docstring, because only humans could write them (e.g. plain language descriptions of things). Type B: Things that should go into the original docstring (even though they can be computed from code), because developers deserve to see them while browsing code in an IDE. Type C: Things that should not go into original docstring, because the developer does not need to see it in their IDE (and can rely on the html documentation instead), but should be inserted into the new docstring programmatically by the extension. Some of these may require new conventions (like type hints). Type D: Things that should not go into the original docstring nor be inserted into the new docstring. Unclassified: Temporary list to hold parameters that we are undecided about the above category (just to ensure we don't omit any parameter that we come across). Also, we probably should identify the main entities that require to be documented. I can list: TestType A:
Type B:
Type C:
Type D:
Unclassified:
CapabilityType A:
Type B:
Type C:
Type D:
Unclassified:
ScoreType A:
Type B:
Type C:
Type D:
Unclassified:
I believe you would be able to edit this post to add info. |
I haven't been able to make complete this task before going on holidays, and so might have this delayed by a couple of weeks. But I have been able to implement a dummy proof of concept of what is required (using metaclass to auto-generate docs from specified parameters), so I am confident of having it implemented for our particular use case quite soon. Alongside, I have developed a very early stage documentation for our 'hippounit' package: I would be back after mid-January. In the meantime, since @ChihweiLHBird is currently updating docs for sciunit (and neuronunit?), I would be happy if he wishes to undertake the classification task discussed earlier. I shall be keeping an eye on this ticket if we need to have any discussions. |
Of course, I will try to undertake the classification task discussed earlier, but I am still trying to understand many parts of this project. |
My understanding is that entities are including classes, methods, and variables? Am I correct? |
@appukuttan-shailesh Could you point me to an example of the metaclass as used to generate documentation? Maybe in hippounit? @ChihweiLHBird The entities here are all SciUnit classes: Shailesh mentioned To reduce the scope for now, we can stick to only documenting all the kinds of these classes, for example all of the |
What about the classes that inherit from |
@ChihweiLHBird Yes, all classes that inherit from those are the ones that we want to document. |
@appukuttan-shailesh I agree with your categories (Type A-D and unclassified). I'm not certain which things below where yet, but I believe we can build the infrastructure around the need to support those types. Where does that stand? |
@rgerkin, @ChihweiLHBird : Just got back to work today. Wasn't able to devote any time to this while on break. Shall provide an example of using the metaclass to generate documentation early next week. |
Apologies for the long delay! I accidentally upgraded my |
I have made some headway on this task. But it is ending up requiring way more time and effort than anticipated. I am unlikely to be able to set aside much dedicated time for this task before the end of March/April, and so will summarize what's been done along with other findings, so that someone else can run with it in the meantime (I also hope to chip away at it whenever possible). I started testing on a sample test unit (named DemoUnit), rather than on HippoUnit - just to keep things as simple as possible. The test unit consists of four tests and two capabilities. The documentation generated for it (process described below) can be accessed at: Below I will summarize what my objectives were, and the work in that direction: 1) Identifying modules to be documented [complete]I felt each test unit should have dedicated pages for: For this I decided to use Jinja2 and created templates for these pages (DemoUnit doesn't create any new score type, hence no page for it): I didn't want us to have to manually specify the list of tests, capabilities and score to be documented, as this could evolve over the course of development of the test unit. We want the docs to auto-update without requiring manual intervention, whenever there is a change in the repository. The intention was to be able to submit to the above templates a list of the classes that needed to be documented, and have the documentation for them auto-generated from their existing docstrings (how to extend these docstrings is discussed in the last section). This should ideally have been possible, and relatively straightforward to do with the Sphinx extension named sphinx.ext.autosummary. But sadly that didn't work out quite as expected (more details here). I then found a workaround using another extension named sphinx_automodapi.automodapi. The difference was that this package had an option of specifying what entries needed to be skipped (as opposed to what was to be included; I have created a ticket asking for the conf.py was therefore created such that, apart from the regular Sphinx docs necessary code, it identifies all unnecessary entries (see lines 389 to 401), and supplies this info to their respective templates as items to be skipped. The end result being that all the "tests" and "capabilities" defined in the package were documented. Note: the actual code that does this identification would need to be tweaked depending on the structure of the test package. This worked well and was implemented for documenting both DemoUnit and HippoUnit. Each relevant class (test / capability / score) now has a dedicated stub page documenting the class methods and attributes, along with a block diagram showing the inheritance diagram involved with the sub-modules. Some known problems here are:
Notice, for example, that "Attributes Summary" section here, does not list Other extensions worth exploring are: 2) Use of type hints [complete?]As discussed previously, I explored the possibility of incorporating type hints in the source code and using this to contribute to the docs. I found a package named MonkeyType which does a pretty decent job of adding type hints to methods/functions in your existing source code (simply as an alternative to doing this manually). I employed the same and generated type hints for DemoUnit's tests and capabilities source code. The limitation of using MonkeyType is that it doesn't handle class/instance attributes, and this had to be done manually. Once the source code was updated, the next step was to see how Sphinx could be made to use this info to generate documentation info. From online forums, I found that sphinx-autodoc-typehints is a popular option towards this, and I employed this for DemoUnit documentation. As an example, consider the source code:
Docs generated without sphinx-autodoc-typehints: Docs generated with sphinx-autodoc-typehints: In my opinion, this does a pretty good job of documenting the parameter types from the type hints. Adding parameter descriptions to the method's docstrings (such as for 3) Extend docstrings with info from source code [incomplete]One other objective we had was to be able to extract as much info as possible from the source code towards generating the documentation. After speaking with Andrew, I came to know that the concept of metaclasses could be used to achieve this. To explore how this could be used, I created a metaclass inside SciUnit's base.py. This metaclass is inherited, for now (based on DemoUnit's requirements), by SciUnit's Capability class and Test class. In future, this could also be inherited by the Score class, and any others as deemed appropriate. The implementation and functionality of the metaclass can be modified as appropriate in future. As of now, this metaclass does the following: a) It tries to identify "unimplemented" capability methods, so that the documentation can highlight the same to the user. This currently produces output such as (notice 'Note'): b) Have class methods and attributes inherit doc info from parent classes. I had incorporated an init() method in the metaclass to handle this for the class methods. But I later realized that Sphinx>=1.7 automatically inherits docstrings for the same. This method is therefore unnecessary in its current form. Unfortunately, from preliminary exploration, it appears that class attributes don't inherit docstrings similarly. For example, SciUnit's Test class defines docstring for attribute
But if the source code for DemoUnit.test.RestingPotential only states:
then the output appears as:
Therefore we need to explore how we can have attribute docstrings to inherit from base classes. c) Since Sphinx or its extensions (that I tested) were not able to document all the class attributes (including those inherited from parent classes), I attempted to have them manually fed into the class docstrings. This is implemented in the Example output without this docstring extension: Example output with this docstring extension: This shows that it certainly holds potential to further customize the class docstrings using info from the source code. Currently, one downside is the duplication of certain info between this extended doc info and the auto-generated "Attributes Summary" (see Let me know if you have any questions. |
@appukuttan-shailesh Thank you for the very detailed explanation. I think that @ChihweiLHBird and I can work on this for a while. |
This will allow the auto-generated API documentation to contain more information.
The text was updated successfully, but these errors were encountered: