Replies: 2 comments
-
Hi @e10e3, good stuff. The first option sounds good to me and reminds me of the first annotation efforts in 2022. Back then, we assigned team members to annotate the code according to their familiarity with each module. It is a good way to parallelize the effort.
-
Thanks for looking back into this @e10e3. I agree that we could enable strict typing progressively on a module basis.
-
The pitch
River has some type annotations and recommends that new code does not cause new type-checking errors with MyPy.
Not all the code in River is annotated: running MyPy with `--disallow-incomplete-defs --disallow-untyped-defs` raises over 2500 errors. And this does not count the type inconsistencies that MyPy would report in fully typed code but currently ignores because the annotations are missing.
To signal to type checkers that they ship type annotations, River’s modules have a `py.typed` file.
The Python guidelines for type checking libraries say, “A 'py.typed' library should aim to be type complete so that type checking and inspection can work to their full extent.”
Indeed, partial typing can hinder efforts to type check programs that depend on the library.
River is not there yet, but we can work towards it!
To encourage correct and complete type annotations, we need to adapt the configuration in `pyproject.toml`. The issue “Professional-grape mypy conf” (#1430) gives some pointers.
The workload is too big to be done in a single PR. Thankfully, MyPy is designed for gradual adoption: we can progress step by step until the library is fully covered.
This means regular type-checking with MyPy will continue to work like it does today, without an avalanche of errors, and type checking will be able to progress in parallel.
There are two main ways of reaching this goal: going module by module or rule by rule. They are not incompatible.
Modules
The module way could be to set `strict=true` by default in the configuration and add an override with `strict=false` for all pending modules. When a module is fully annotated, remove it from the overrides.
We can also define an order in which the modules should be annotated, depending on how much they are imported (or estimated to be). I propose:
base
metrics
utils
stats
optim
datasets
tree
preprocessing
stream
linear_model
evaluate
drift
compose
bandit
cluster
anomaly
time_series
feature_extraction
ensemble
proba
multioutput
naive_bayes
checks
rules
model_selection
forest
neighbors
sketch
facto
covariance
compat
multiclass
reco
imblearn
feature_selection
misc
active
conf
neural_net
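For illustration, the module approach could look roughly like this in `pyproject.toml`. This is only a sketch: the two module patterns are placeholders for the full pending list, and the override relaxes individual flags rather than writing `strict=false`, on the assumption that not every MyPy version honours `strict` inside a per-module override.

```toml
[tool.mypy]
# Strict by default: fully annotated modules are checked completely.
strict = true

[[tool.mypy.overrides]]
# Modules still waiting for annotations (placeholder list, shortened over time).
module = [
    "river.neural_net.*",
    "river.conf.*",
]
# Relax the flags that require annotations on every function.
disallow_untyped_defs = false
disallow_incomplete_defs = false
```

Removing a pattern from `module` once its code is annotated puts that module back under the strict default.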
Rules
Another way is to enable progressively stricter options (https://mypy.readthedocs.io/en/stable/existing_code.html#introduce-stricter-options): at first, only the most lenient checks from MyPy are enabled, and stricter options are turned on one after the other. This can also be broken down on a per-module basis.
MyPy has an example for this in its documentation.
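As a rough sketch of what such a progression could look like in `pyproject.toml`: the option names below are real MyPy settings, but the grouping into stages and their order are assumptions loosely based on the MyPy guide, not a tested configuration for River.

```toml
[tool.mypy]
# Stage 1: lenient checks that should pass with little effort.
warn_unused_configs = true
warn_redundant_casts = true
warn_unused_ignores = true
strict_equality = true

# Stage 2: also check the bodies of functions that have no annotations.
check_untyped_defs = true

# Stage 3 (uncomment once stage 2 passes): restrict untyped constructs.
# disallow_subclassing_any = true
# disallow_untyped_decorators = true
# disallow_any_generics = true

# Stage 4: require annotations everywhere.
# disallow_untyped_calls = true
# disallow_incomplete_defs = true
# disallow_untyped_defs = true

# Stage 5: remaining strict options.
# no_implicit_reexport = true
# warn_return_any = true
```

Each stage is enabled in its own PR once the previous one raises no errors, so the regular MyPy run stays green in between.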
What are your thoughts on this?
I would personally tend to go for the module way: it makes for more localised edits instead of working over the whole codebase again and again.
But as said above, the two approaches are not incompatible.