Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show __init__ exports #39

Open
JP-Ellis-KPMG opened this issue Sep 12, 2022 · 13 comments
Open

Show __init__ exports #39

JP-Ellis-KPMG opened this issue Sep 12, 2022 · 13 comments
Labels
feature New feature or request fund Issue priority can be boosted insiders Candidate for Insiders

Comments

@JP-Ellis-KPMG
Copy link

JP-Ellis-KPMG commented Sep 12, 2022

Problem

The __init__.py is sometimes used to re-export functions and classes defined within sub-modules, generally for visibility within the library and also with the auto-completion. It would be nice if the documentation showed this as well.

Solution

Ideally, I would like an option that shows explicit imports/re-exports located within __init__.py. Based on other linting conventions, the way this is done is to be explicit with an as even if it is tautological, though may not always be.

# Explicit import and re-export without changing the name
from .foo import Bar as Bar
# Explicit import and re-export with a name change
from .foo import Baz as FooBaz

As for the implementation, I like the way Rust documentation shows re-exports in a section dedicated for re-exports. This makes it clear that the class or function is defined elsewhere but has been made available within the current namespace. Here's an example from the popular Clap library:

https://docs.rs/clap/latest/clap/#reexports

Alternatives

I tried looking at the options available, but could not identify anything that would provide the functionality I need.

Another alternative would be to have the gen-files script handle this, but the implementation does not appear to be easy.

Extra

The ability to document re-exports I think could be implemented with the following options if there might be a need for that, though I struggle to see the use-case for any of them other than explicit-init.

  • explicit-init: This would only document explicit re-exports (i.e., ones that use the as notation) within __init__.py files.
  • explicit-all: This would document explicit re-exports, but within any Python file.
  • implicit-init: This would document all imports (whether they use the explicit as notation or not), but only within __init__.py
  • implicit-all: As with implicit-init, but for all files.

Boost priority

  • Boost priority in our backlog through Polar.sh. Higher pledge, higher priority.
  • Minimum pledge by user/organization is $5, minimum amount for boost is $30.
  • View all issues with pledges.
  • We receive the funds once the issue is completed and confirmed by you.
  • Features with the insiders label are released to sponsors first, and tied to a funding goal.
Fund with Polar
@pawamoy pawamoy added the feature New feature or request label Mar 5, 2023
@llucax
Copy link

llucax commented May 23, 2024

BTW, there are official rules for what's considered a re-exporting of symbols, which is followed by mypy and pyright for example: https://github.com/python/typing/blob/main/docs/source/libraries.rst#library-interface-public-and-private-symbols

@llucax
Copy link

llucax commented May 23, 2024

This applies not only to __init__ though, so not sure if I should create a separate issue or not.

@llucax
Copy link

llucax commented May 29, 2024

There was also some parallel discussion in gitter.

@pawamoy said:

Hey :) indeed currently Griffe only marks as "exported" the names listed in __all__. When checking if an object is considered public, these "exports" are then used to tell whether an object is explicitly exported. If there are no exports (__all__ is undefined), Griffe falls back on an implicit check, and the mere presence of an object is enough (given it is not private, i.e. it doesn't start with _).

I initially decided not to consider "from x import a as a" as explicitly exported, because:

  1. I don't like it :D
  2. It's ugly :D
  3. More seriously, we have one obvious way: __all__.
  4. "as" imports makes you noqa: F401 the imports, while __all__ is understood. Maybe linters can support "as" imports though

In any case, I didn't think of everything, so this can be revisited / improved. I don't obviously agree with Mypy's rules though. Mostly, I want to have something that makes sense, without confusing users (too much?) :)

So I have mainly 3 reasons why I think this should be supported despite your reasons:

  1. It is not only a mypy thing, it is part of the typing PEP484 (even if it is mentioned only for stub files), this is an official thing, ignoring it doesn't make it go away.
  2. If the goal is not confusing users, not implementing it doesn't help that goal, I was very confused that my symbols weren't shown in the docs when I re-exported it in an officially supported way :)
  3. There is one case that, AFAIK, can't be handled gracefully by using __all__: from x import *. For __init__.py this makes a lot of sense in many cases, having to explicitly import and re-export every symbol in every sub-module is not only pure bloat, it is also error-prone, as if you create more symbols in a sub-module, you need to remember to go to the __init__.py and re-export it. Unless I'm missing some technique to deal with this case more gracefully, in which case I would love to hear about it, if not as a workaround until from x import * is supported by mkdocstrings ;)

About other linters not supporting it, yes, I agree it is annoying, but because of the reasons above, I think they should be, and I'm willing to open issues in flake8 and pylint if they can't deal with this properly.

@pawamoy
Copy link
Member

pawamoy commented May 29, 2024

Thanks for all the info!

Just to play devil's advocate a bit (don't read it as "I really don't want to support as/* imports", it's just to bring nuance):

https://github.com/python/typing/blob/main/docs/source/libraries.rst#library-interface-public-and-private-symbols
Type checkers can use the following rules to determine which symbols are visible outside of the package.

Emphasis on "can". To me it sounds like type checkers have a choice in how they consider an object to be public/exported or not. If they want, they can follow these few rules described in the document, but apparently they don't have to.

https://peps.python.org/pep-0484/#stub-files

This explicitly says it's for stub files. The only mention of non-stub files is:

Just like in normal Python files, submodules automatically become exported attributes of their parent module when imported.

So to me, this again is not an authoritative document to describe how to determine whether an object is public/exported or not. I don't think such a document exist yet.


Anyway. Even if we don't have specs, we have conventions, and from a import x as x and from b import * seem to be rather common conventions to mark x and all public objects from b as public. So let's try and support them 🙂

About your third point, there is a way: if you define __all__ in your submodules, you can import it to extend a parent module __all__ list. Example:

package/
  __init__.py
  module.py
# module.py
__all__ = ["X", "Y", "Z"]
...
# __init__.py

# __all__ was by the way specifically made for wildcard imports 
from module import *

# we could also avoid using the wildcard import,
# but I understand it's less convenient: from module import X, Y, Z

# import module's __all__!
from module import __all__ as _module_all

# use it to extend your parent __all__
__all__ = ["A", "B", "C", *_module_all]

# other operations are supported like:
# __all__ += _module_all

About the error-proneness of having to remember to explicitly import and add to __all__, I'd argue that automatically importing and re-exporting everything with wildcard imports is even more error-prone, and much less clean from a public API point of view. I've seen monsters while working on Griffe, believe me. Explicit is better than implicit 😊?

Anyway, I think wildcard imports are OK when combined with __all__.

@pawamoy
Copy link
Member

pawamoy commented May 29, 2024

One more argument in favor of using __all__ when using wildcard imports.

If you use wildcard imports without __all__ to restrict the set of objects made available in the importing module, you will expose more objects than what the conventions suggest. The convention is only respected by static analysis tools, not runtime interpreters!

package/
  __init__.py
  module.py
import third_party  # by convention, not exported

class A:  # by convention, exported
    ...
# __init__.py
# by convention, only supposed to export A,
# but at runtime, third_party is exposed too!
from module import *

@llucax
Copy link

llucax commented May 29, 2024

Anyway. Even if we don't have specs, we have conventions, and from a import x as x and from b import * seem to be rather common conventions to mark x and all public objects from b as public. So let's try and support them 🙂

🎉

About your third point, there is a way: if you define __all__ in your submodules, you can import it to extend a parent module __all__ list.

Yeah, I thought of that, but that just means now I need to move all the duplication to the submodules, not that I get rid of the duplication. I can already mark symbols private in submodules by prefixing them with _, a very widely-used convention (and also supported by all major type-checkers and linters).

If I can't get rid of the duplication, I rather have it in the __init__.py file which is much more meta and avoid noise in the files that actually have the useful code.

But thanks for the suggestion!

About the error-proneness of having to remember to explicitly import and add to all, I'd argue that automatically importing and re-exporting everything with wildcard imports is even more error-prone, and much less clean from a public API point of view. I've seen monsters while working on Griffe, believe me. Explicit is better than implicit 😊?

I agree and can see that for monster legacy projects doing this is madness, but for a small greenfield project trying to use modern Python features, I think for __init__.py files, it makes a lot of sense and makes things much simpler, again, just mark anything you don't want to export in a module as private and that's it, that is also better than having everything public anyway. If something is not intended to be public it shouldn't be in the first place :)

@llucax
Copy link

llucax commented May 29, 2024

If you use wildcard imports without __all__ to restrict the set of objects made available in the importing module, you will expose more objects than what the conventions suggest. The convention is only respected by static analysis tools, not runtime interpreters!

True. For the cases where you use a type-checker, you code will never run in runtime, so that issue should never happen too. This mostly applies to all static checks performed by a type-checker, unless you add runtime ininstance() checks for every type hint you use, your code will still explode at runtime, so even when I see your point, when using type checkers I see this as a non-issue (maybe in terms of performance, I'm not sure if doing import * could have some implication.

In any case, if this feature is controversial, or could be useful only to a subset of users, maybe it can be added behind an option to let users decide what's best for them.

@pawamoy
Copy link
Member

pawamoy commented May 29, 2024

Ha, this is not an easy topic. So many things to consider here.

Let me try to start again. First I'll describe what Griffe currently does. Then I'll share a few observations.


Griffe has a concept of "exported member". A module member is considered:

  • explicitly exported when its name appears in the module's __all__
  • implicitly exported when the module's __all__ is undefined, the member is available at runtime (for example not guarded behind if TYPE_CHECKING), and the member is an alias (defined elsewhere), except if it's a submodule, unless it was also imported in the current module

A class member is considered implicitly exported, the same way as module members are considered implicitly exported. It is never considered explicitly exported because we don't/can't use __all__ in class bodies.

This "exported" marker is then used in a few places, 3 of of them being of interest to us here.

1. when expanding wildcard imports

  • if __all__ is defined in the wildcard imported module, add all objects referenced in the imported module's __all__ to the importing module
  • else, add all objects that are implicitly exported by the imported module to the importing module

I see a first issue here: members starting with _ are considered implicitly exported. This is wrong: actual wildcard imports will not import these objects at runtime.

2. when checking for breaking changes in the API

Between the same old and same new module, if a member disappeared, and it was either a submodule, or it was exported (explicitly or implicitly), we consider its disappearance a breaking change. Again, it's wrong here to consider that a private object's disappearance makes a breaking change. I didn't spot this issue before because the code that checks breaking changes has an earlier condition on the object names: it skips private objects anyway.

3. when telling if an object is public

Griffe objects have an is_public() method that tell if an object is public or not. An object is:

  • strictly public if:
    • its public attribute is true
    • or it is explicitly exported
  • public if:
    • it is strictly public
    • or it's not private (name not starting with _) and it's not imported

Griffe gained this method later on, to be used by mkdocstrings-python.


A few observations.

I think Griffe's internals/API is a bit confusing. "Exported" is usually used to say "public" (we export an object to make it public), but in Griffe it's rather used to say "the object is exposed in regard to the wildcard import runtime mechanism". So maybe I should rename "exported" to "wildcard exposed".

Then we should fix the issue mentioned above about private members being considered implicitly exposed. And since wildcard imports are not configurable, we would drop the distinction between explicitly and implicitly exposed. An object is either exposed, or it's not.

Finally the code that checks breaking changes should use the "public" API instead of the "exposed" one.


Now to go back to the subject at hand: whether to support the two conventions from a import x as x and from a import *. These conventions ask us to consider x as public, and all objects imported from a (following regular wildcard import mechanisms) public too, respectively.

Thanks to the above analysis, and the changes we will make, I think it is safe to support them 👍

  • from a import x as x was previously checked by, lets call it the Breaking Changes Detector (BCD), because it was "implicitly exported". Now it will keep being checked because it will be considered public ✅
  • same for from a import _x as _x (previously worked for wrong reasons) ✅
  • with from a import *, the object expansion stays the same (though fixed not to include private members when __all__ is undefined), and the BCD just applies its logic to all expanded objects ✅

My comment above about convention vs runtime was just me being confused, or the official docs being unclear. "Imported by wildcard" and "considered public" are not the same thing, and we most probably don't ask type checkers to use the "public" logic when expanding wildcard imports.

Just a final note: Griffe is not a type-checker, so type-checker recommendations do not always apply to us. In this case it seems like we can apply them safely 👍

@llucax
Copy link

llucax commented May 29, 2024

Thanks for the very in-depth analysis! Indeed it seems like there is a lot to consider, but I'm glad it seems like these conventions fit into the current Griffe model. In particular I'm not sure I understand the difference between public, exported and exposed are this context.

Also now I'm curious about what this BCD you are talking about is! Is is a tool that one can use to check if some changes break the public interface? Or is it just some internal implementation detail from Griffe that's not exposed to "regular users"?

Just a final note: Griffe is not a type-checker, so type-checker recommendations do not always apply to us. In this case it seems like we can apply them safely 👍

I understand technically this is true, but if your documentation doesn't match what the type checker thinks, this documentation won't be all that useful. 😇

@pawamoy
Copy link
Member

pawamoy commented May 29, 2024

I understand technically this is true, but if your documentation doesn't match what the type checker things, this documentation won't be all that useful. 😇

Of course 😄 (Just to give one example, type-checkers will apparently only read stubs, while we have to read both regular modules and their stubs, then merge their contents 🙂)

Also now I'm curious about what this BCD you are talking about is! Is is a tool that one can use to check if some changes break the public interface? Or is it just some internal implementation detail from Griffe that's not exposed to "regular users"?

Glad you asked 😎 It's definitely exposed to users, see our docs here: https://mkdocstrings.github.io/griffe/checking/ ☺️

@llucax
Copy link

llucax commented May 30, 2024

This is amazing, thanks a bunch!

@llucax
Copy link

llucax commented May 30, 2024

Just for completeness, here is another more or less official mention to how public and private symbols should work for libraries:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request fund Issue priority can be boosted insiders Candidate for Insiders
Projects
None yet
Development

No branches or pull requests

3 participants