-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
py : type-check all Python scripts with Pyright #8341
Conversation
The cache is not shared between branches, and it's 250MB in size, so it would become quite a big part of the 10GB cache limit of the repo.
Apparently, gcc applies optimisations even when pre-processing, which confuses pycparser.
I'll be merging this soon, because otherwise there's nothing warning anyone of the new type errors they introduce in Python scripts, unless they already use a static type checker. For example, #8031 and #8048 were both recently merged and introduced new type errors in some Python scripts which I've fixed in 6f215f1. |
I'm testing with:
Using Since there are no nested scopes, |
Thanks for sharing your setup. I've also tested
(which are the versions I get with
Hmm, when I try with
Yes, because |
The "information" level otherwise has entries from 'examples/pydantic_models_to_grammar.py', which could be confusing for someone trying to figure out what failed, considering that these messages can safely be ignored even though they look like errors.
* py : type-check all Python scripts with Pyright * server-tests : use trailing slash in openai base_url * server-tests : add more type annotations * server-tests : strip "chat" from base_url in oai_chat_completions * server-tests : model metadata is a dict * ci : disable pip cache in type-check workflow The cache is not shared between branches, and it's 250MB in size, so it would become quite a big part of the 10GB cache limit of the repo. * py : fix new type errors from master branch * tests : fix test-tokenizer-random.py Apparently, gcc applies optimisations even when pre-processing, which confuses pycparser. * ci : only show warnings and errors in python type-check The "information" level otherwise has entries from 'examples/pydantic_models_to_grammar.py', which could be confusing for someone trying to figure out what failed, considering that these messages can safely be ignored even though they look like errors.
* py : type-check all Python scripts with Pyright * server-tests : use trailing slash in openai base_url * server-tests : add more type annotations * server-tests : strip "chat" from base_url in oai_chat_completions * server-tests : model metadata is a dict * ci : disable pip cache in type-check workflow The cache is not shared between branches, and it's 250MB in size, so it would become quite a big part of the 10GB cache limit of the repo. * py : fix new type errors from master branch * tests : fix test-tokenizer-random.py Apparently, gcc applies optimisations even when pre-processing, which confuses pycparser. * ci : only show warnings and errors in python type-check The "information" level otherwise has entries from 'examples/pydantic_models_to_grammar.py', which could be confusing for someone trying to figure out what failed, considering that these messages can safely be ignored even though they look like errors.
This fixes pretty much all type errors and warnings in all python scripts of this repo, I've also added a GitHub workflow to run Pyright on all the Python scripts to make sure new changes don't introduce type errors.
The local equivalent is simply to use
pyright
(without arguments), because the config is all inpyrightconfig.json
.This should help with catching typos, invalid imports, and type errors in general, and also things which are not valid in the targeted Python version. (
pyright
allows targeting different Python versions for different parts of the code, which is nice)Some of the Python scripts are simply not statically type-checkable, like
examples/pydantic_models_to_grammar.py
and some minor parts of other scripts, so appropriate# pyright: ignore[somethingHere]
comments were added.Note that this upgrades the
openai
python library used by the server test suite to1.30
(from0.25
). And this PR also includes the cpu-only torch changes from #8335, to make fetching dependencies faster.I've added
requirements/requirements-all.txt
and other missing requirements files to allow the Pyright workflow to build a Python environment with all the dependencies used by the scripts in this repo. Note that Pyright does verify the validity of imports, so it will output warnings or errors about invalid imports even for non-toplevel imports, unlikescripts/check-requirements.txt
, but it's not using separate venvs for each scripts, so it can't really check the requirements files themselves. (by the way, there is an invalid non-toplevel import onmaster
inconvert_llama_ggml_to_gguf.py
which I noticed with this (it wasimport convert
, but that script was moved and renamed))TODO
Test the scripts which were changed more than simply adding type annotations, to check if I broke anything.
convert_hf_to_gguf.py
@compiladeJAIS
model conversion withconvert_hf_to_gguf.py
@fmzdata_torch._data[0]
is not using a valid field oftorch.Tensor
, so I've replaced it withdata_torch[0].item()
.llama-server
test suiteopenai~=1.30
fromopenai~=0.25
, and a lack of type annotations which made it harder to debug.ggml/ggml_vk_generate_shaders.py
@0cc4mbase_dict
variable seems to be a constant, and was possibly unbound below the loop, so I moved it outside, but I'm not sure if that's the right thing here.tests/test-tokenizer-random.py
@jaime-m-pcffi
is known to work with this?1.16.0
. A more useful question would be "whichgcc
version is known to work with this?", because it seems likegcc
13.3.0 runs optimizations in the pre-processor step (unless when using-O0
), which confusespycparser
.-O0
in the call togcc
in thetest-tokenizer-random.py
script.generator.__qualname__
have the same behavior whichgenerator.__name__
had?I have read the contributing guidelines
Self-reported review complexity: