-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework function signatures in docstrings of overloaded functions #2621
base: master
Are you sure you want to change the base?
Conversation
Messing around with this myself, it seems Sphinx is expecting a backslash at the end of the line, if the following is still supposed to be part of the overloaded signature (which is quite ugly, in the raw docstring, but...). Apart from that, things seem to work. |
Nice find! I'm not sure how I missed that before. But a small end-to-end test is working as it should for me now.
You're allowed to have a space before the backslash. I haven't done that here but if you think it looks better with the space there then let me know and I can change it. |
@AWhetter, ah, it does allow that space? Didn't know, but that might be slightly less ugly, yes. I think it still ugly, though (and I don't fully understand why it's necessary; Sphinx could just try parsing the next line and check if it's a signature, just like it does for the first line?), but yes, I've also made a similar ad-hoc change to the pybind11 subtree in my project, now, so thanks for pointing me to this! :-) Is this ready for review? Guess this is only scheduled for 2.7, anyway, but I could go over this after having played around with it myself :-) |
I've just squashed the commits back up, so this is now ready for review. |
We might want @wjakob to weight in on this as well, actually. EDIT: Sorry, scratch all that. This is the case, already, if I look more closely at the implementation. Except that the multiple signatures aren't optional :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I'm concerned, this is an improvement :-) I'd really like to use it, so, yeah, let's add it to the PRs to be merged for 2.7? (I really need to sleep it seems; it's already part of 2.7)
include/pybind11/pybind11.h
Outdated
else signatures += "\n"; | ||
if (!show_signature_headings && first_user_def) { | ||
first_user_def = false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The braces really blow things up here. What about just this?
if (!show_signature_headings && first_user_def)
first_user_def = false;
else
signatures += "\n"
signatures += it->doc;
if (show_signature_headings)
signatures += "\n";
I know braces bring a bit of safety, but in this case, they really make things heavier to read? Just a suggestion; if you insist on braces, that's also fine. At least the if
is not on one line anymore ;-)
include/pybind11/pybind11.h
Outdated
signatures += it->doc; | ||
if (options::show_function_signatures()) signatures += "\n"; | ||
if (show_signature_headings) { | ||
signatures += "\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're really feeling adventurous: it seems there's always a newline attached to the end of the docstring if show_signature_headings
? Not worth making the code more complicate for, but maybe this helps restructuring the logic?
If you want, I can have a look, as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original code always added a new line. But if we wanted to get rid of it then I think it would make the code more complicated. I'm thinking we would have to do something like
for (auto it = chain_start; it != nullptr; it = it->next) {
bool appending_docstring = (it->doc && strlen(it->doc) > 0 && options::show_user_defined_docstrings());
bool last_section = (it->next == nullptr);
if (show_signature_headings) {
if (index > 0) signatures += "\n";
if (chain)
signatures += std::to_string(++index) + ". ";
signatures += rec->name;
signatures += it->signature;
if (appending_docstring || !last_section)) signatures += "\n";
}
if (appending_docstring) {
// If we're appending another docstring, and aren't printing function signatures, we
// need to append a newline first:
if (!show_signature_headings && first_user_def)
first_user_def = false;
else
signatures += "\n";
signatures += it->doc;
if (show_signature_headings && !last_section) signatures += "\n";
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your call on whether it's worth it. I know the original code did it, but it seems rather silly. I had hoped it would make the code easier/more intuitive. :-(
Related to your comment in #2619: it's a bit of a pity that CPython's docstrings don't have the trailing |
By chance, stubgen works, but only when the signature headings are included. stubgen isn't capable of parsing the backslash separated signatures at the start of the docstring, so it ignores them, but it can parse the section headings. So the original issue I described of stubgen outputting I still think there's value in making the section headings in a docstring optional, but I'm more worried about resolving the issue of stubgen outputting the |
Ugh. I had just thrown out the headings :-(
Agreed. Does it make sense to submit an issue to stubgen or Sphinx (or both, such that they communicate!) to make sure that they act the same? :-/ (i.e., with |
Oh. python/mypy#10019. I should learn how to read. Sorry! (Though asking Sphinx to not require the |
@AWhetter Would it be less intrusive to make sphinx and stubgen support pybind11 overloading syntax? |
It would be less intrusive for pybind, yes. However I'm trying to make everything agree on a convention that is also used outside of pybind, rather than have everything support pybind's way of doing it. You're right that we should be looking for a solution that minimises intrusion though, and I think that @YannickJadoul has identified that solution; get Sphinx to support finding signatures without a backslash, stubgen would require no changes, and get pybind to put function signatures at the beginning of the docstring. |
Ok, that seems reasonable. What about signatures that span multiple lines? Presumably pybind11 would always produce exactly one signature per line, but we are talking about 3rd party tools too. How should it be handled? One can argue for a like-python approach: keep parsing new lines until signature is syntactically complete. Example: foo(x: int) -> Callable[
Optional[Tuple[int, int]]
]
foo(x: str) -> Callable \
[
Optional[Tuple[str, str]]
]
foo() -> None Alternatively, multiline signatures can be disallowed/unsupported, but this seems to be unnecessary restriction. What do you think? |
For that to work I think we would have to change the signature parsing of Sphinx so that it matches what stubgen does where it tokenises that string and properly parses the signatures. At the moment Sphinx is doing a basic regex search on each line of the docstring. It's definitely possible. |
The sphinx choice of trailing backslash as indication for multiple signatures become fairly unfortunate one. It might result into a cumbersome signature parsing implementation which supports both "legacy" and "new" versions... but is doable for sure. I may have a too few clock cycles for that in next months, can't promise, sorry 😞 Probably it's not the issue that requires immediate solution, but still it's worth to think about ahead of time. |
Good news! Sphinx v4 will support the function signatures separated without backslashes. |
Hello, just dropping a little message to see if there is still some interest from maintainers into merging this? I'd personally be interested in getting that to use it on a project I contribute to (https://github.com/PixarAnimationStudios/OpenTimelineIO). Thanks! |
The force push was for a fresh git rebase master. Trying to get the attention of at least one other maintainer. |
The one CI failure is an infrastructure issue (job not even starting up). |
Function signatures are processed by mypy; does it handle these (yet)? I'm pretty sure it handles the list at the top, but not so sure about the ones with mixed signatures and descriptions. |
The signature stuff has mostly been dormant, since we want to try to keep the size down and avoid a massive list of options or making arbitrary “it looks better” changes. Ideally, in my opinion, there’s a clear forward direction - signatures should match what mypy expects. Type checking is well defined by now, and very useful. And mypy now tests pybind11, we should probably do the same, and run mypy’s stubgen on our development branch. |
Henry pointed this out elsewhere: https://github.com/python/mypy/tree/master/test-data/stubgen/pybind11_mypy_demo Pasting here thinking it may be useful for us exploring how to evolve the docstring generation. |
mypy itself does not parse the docstrings. They have to be turned into a stub file first by stubgen. These stub files then get read by mypy. The current behaviour of pybind and stubgen is as follows. py::class_<Pet>(m, "Pet")
.def(py::init<const std::string &, int>())
.def("set", (void (Pet::*)(int)) &Pet::set, "Set the pet's age")
.def("set", (void (Pet::*)(const std::string &)) &Pet::set, "Set the pet's name"); The above pybind class would have the following docstring: >>> help(example.Pet)
class Pet(__builtin__.object)
| Methods defined here:
|
| __init__(...)
| __init__(self: Pet, arg0: str, arg1: int) -> None
|
| set(...)
| set(*args, **kwargs) -> Any
| Overloaded function.
|
| 1. set(self: Pet, arg0: int) -> None
|
| Set the pet's age
|
| 2. set(self: Pet, arg0: str) -> None
|
| Set the pet's name Stubgen would produce a stub file that looks like the following: from typing import overload
class Pet:
def __init__(self: Pet, arg0: str, arg1: int) -> None: ...
@overload
def set(self: Pet, arg0: int) -> None: ...
@overload
def set(self: Pet, arg0: str) -> None: ...
@overload
def set(*args, **kwargs) -> Any: ... As the examples show, the signature at the top of the docstring and the sginatures that are section headings are both picked up. With this change, pybind will instead produce the following docstring (when configured to do so): >>> help(example.Pet)
class Pet(__builtin__.object)
| Methods defined here:
|
| __init__(...)
| __init__(self: Pet, arg0: str, arg1: int) -> None
|
| set(...)
| set(self: Pet, arg0: int) -> None
| set(self: Pet, arg0: str) -> None
| Overloaded function.
|
| 1. set(self: Pet, arg0: int) -> None
|
| Set the pet's age
|
| 2. set(self: Pet, arg0: str) -> None
|
| Set the pet's name Which stubgen would turn into: from typing import overload
class Pet:
def __init__(self: Pet, arg0: str, arg1: int) -> None: ...
@overload
def set(self: Pet, arg0: int) -> None: ...
@overload
def set(self: Pet, arg0: str) -> None: ...
@overload
def set(self: Pet, arg0: int) -> None: ...
@overload
def set(self: Pet, arg0: str) -> None: ... So unfortunately it produces duplicate overload signatures and a change would need to be made to stubgen to either have it remove duplicates or have a way for it to read the docstrings from the head of the docstring only. Having an option to stubgen to have it read only header docstrings would probably be preferred because that's then inline with Sphinx's behaviour as well, but we will need to discuss with the mypy maintainers to reach a conclusion. |
9fa1e69
to
0cc91ca
Compare
I've resolved all conflicts and removed additional options. |
Is there anything that I can do to help progress this change? |
Should I address the conflicts? Is there anything else that I should do with this change to help move it forwards? |
Yes please.
I'll look again later tonight (assuming the conflicts are resolved). |
Thanks for continuing to spend time on this. I've resolved the conflicts. Given that it's been so long since this was submitted I've rebased the commit as well. |
@henryiii is there a chance you could take a look? I believe we should merge this. mypy is only one consumer, and there are issues they haven't responded to for a long time (#4888). I don't think this PR makes anything worse in that respect. The other consumer are humans. Clearly it's useful to see the signature for all overloads. |
I ran this PR through Google's global testing. It passes after rerunning stubgen for ~20 .pyi files (the generated files are checked in, unfortunately). A typical .pyi diff is: @overload
def __setitem__(self, arg0: int, arg1: Program) -> None: ...
@overload
def __setitem__(self, arg0: slice, arg1: ProgramList) -> None: ...
+ @overload
+ def __setitem__(self, arg0: int, arg1: Program) -> None: ...
+ @overload
+ def __setitem__(self, arg0: slice, arg1: ProgramList) -> None: ... I.e. stubgen generates the same signature twice; not surprisingly. I'll have to rerun the global testing with the updated .pyi files. Might take a day to get the results. |
Good news, nothing breaks, but seeing that duplication in the stubgen-generated .pyi files got me to change my mind. I no longer think the duplication is useful. The "generic" signature that is replaced here is far more confusing than helpful. So I did this, based on this PR: #5329 The core diff there is simply: diff --git a/include/pybind11/pybind11.h b/include/pybind11/pybind11.h
index 9f356828..27581348 100644
--- a/include/pybind11/pybind11.h
+++ b/include/pybind11/pybind11.h
@@ -613,10 +613,7 @@ protected:
/* Create a nice pydoc rec including all signatures and
docstrings of the functions in the overload chain */
if (chain && options::show_function_signatures()) {
- // First a generic signature
- signatures += rec->name;
- signatures += "(*args, **kwargs)\n";
- signatures += "Overloaded function.\n\n";
+ signatures += "Overloaded function:\n\n";
}
// Then specific overload signatures
bool first_user_def = true; If have to put that alternative through global testing again. |
#5329 does not address the original issue on its own. The lack of prepended signatures means that Sphinx is unable to parse and document the signatures. The readability is improved though, and the situation for stubgen is also improved.
Do you mean the duplication in the docstring or the duplication in the stubs generated by stubgen?
So if we need to find a compromise, I'd be happy with the following being the default:
and then there being a pybind option to get the first docstring that I mentioned. If the duplication in the stubs is the issue, then I can look at addressing this in stubgen. |
Regarding duplication, what I had in mind is this (docs/advanced/misc.rst in this PR):
The signatures appear twice. For the human readability aspect: Sometimes, quite commonly, signatures are long and unwieldy. I often give up trying to parse them visually. If they appear twice, I give up even quicker, because it just looks even more like a wall of incomprehensible stuff. For stubgen: I'm worried tooling may behave weirdly if the signatures are duplicated. For sphinx: I don't know. Can we find a way to make sphinx happy without duplicating the signatures? If not, improving sphinx would seem the way to go.
Getting into sphinx seems better in that case. But, tangentially, if you could help making progress with the ideas under #4888, that would be great. |
Description
This pull request changes the docstring generation so that all function signatures are prepended to the docstring of an overloaded function, instead of the generic signature that is prepended currently.
It also includes an additional change that allows the section headings in the docstring to be disabled independently of whether or not function signatures are shown. To keep backwards compatibility, calling
disable_function_signatures()
will still turn off both the prepended signatures and the section headings. Therefore section headings will only be shown if both function signatures and section headings are enabled. To rephrase, section headings cannot be enabled independently of disabling function signatures.Closes #2619
Suggested changelog entry: