-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
convert.py: add python logging instead of print() #6511
Conversation
2a2b801
to
e4e4df5
Compare
Good suggestion, however you are using the logging module wrong. You must at the very least create and use a named instance with logging.getLogger(), see the documentation. |
The provided usage is not exactly wrong. You're not restricted from using library methods directly. IMO, for a single python module (which is a standalone script here), it's fine to use default logger configuration and library methods if you don't need complex logging. But I'm not aware of conventions here, so maybe other maintainers can say if it's ok or not. |
While true it's a good idea not to, it will make things a lot easier in the long run. Just because you can doesn't mean you should. :)
Ultimately it is up to whomever reviews this PR of course. |
No probs. Made the adjustment. Hopefully it matches what you are thinking of |
Almost. :) You missed a couple of |
You mean do_dump_model()? well in that case, its explicit a dump so I suppress all logging (unless --verbose is active) and only print whats on do_dump_model() happy to adjust behavior if that's unexpected. |
No, I mean from here and onwards. |
Ah dang. Yeah I see what you mean. Fixed thanks |
Great, but this was probably not intended though. :) |
Fixed. |
Since logging is being introduced, we could also change I'm also wondering what'd be the process of ensuring that later PRs don't introduce new print statements to the code. Up for reviewers to detect, or add pipeline checks/tests - e.g. add |
Nit: |
baddc3f
to
b9c984b
Compare
Alright then, I've figured out how to run flake8 directly (after failing to get pre-commit to work) and start using the flake8 no print allowed plugin. It then threw a whole bunch of errors on all the other pythons scripts in this repo. So decided to give correcting it all a stab. Also updated the github actions and pre-commit config to use this feature so hopefully it now works. Since this is changing quite a few scripts, I had to spend a bit more time on other .py scripts that had an older approach to using print() especially if ending with Unfortunately this has now make this PR bit more daunting to review compared to the past, but with the inclusion of the no-print check this should hopefully be a one time pain for everyone here. :) Regarding my issue getting pre-commit workingThis is what I got. I've looked around but haven't found a solution yet. Hopefully its not an issue for others working on this.
|
3e6ff06
to
9e4cf37
Compare
rebased on top of latest master due to extra breaking changes on convert-hf-to-gguf.py (more print() was added in the main repo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay just finished making all the changes I noted from you (besides the flake8 enquiry which I'm taking as a comment for now).
I also took a bit of time to look over it again and it looks fine on my end so far.
Great work. :) A final note however, you should use |
One of your previous statement appears to state the opposite.
So you are saying just specifically for |
Anyway added your suggestion in @CISC hopefully it should all check out! |
Yes, because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Manually double checked and clicked viewed on all pages. Found one unused logger import and fixed it. Aside from that, looks ready.
89a75d3
to
a6f54de
Compare
Rebased to add one additional |
I have a pattern we can use to centralize the logger instantiation with. The implementation is really simple and flexible. LOGGER_FORMAT = "%(asctime)s - %(filename)s:%(lineno)d - %(levelname)s - %(message)s"
def get_default_logger(name=None, level=logging.DEBUG):
logger = logging.getLogger(name)
if not logger.hasHandlers():
handler = logging.StreamHandler(stream=sys.stdout)
formatter = logging.Formatter(LOGGER_FORMAT)
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(level)
return logger I left the docstring out of the snippet shown here, but this is basically it. You're free to use it to simplify the implementation and reduce code duplication. I've been able to use this in classes, functions, and scripts and regularly use it. It's under the MIT License, so it's compatible with This pattern can be extended over time and included in the I never liked the use of the global |
Happy to add if @cebtenzzre thinks it's a good idea. What's the advantage of a centralized logger instance? Would this be better as a separate PR (Which would be good if it's only a small change to each file). The main objective of this PR is simply to convert from print() to logger in a preferably simple and extendable way, so at least for this one I would prefer to minimize adding dependencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hate to do this to you, but you got the wrong one, the correct one is at L1189, please revert 6d42f3d :)
Ah that's not a problem. Ultimately, this is mostly just a quality of life fix to make developer life smoother. But I do hope to eventually get this in, as better debugging will ultimately make everyone's life easier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is overall quite good. Feel free to ignore/"resolve" any of the comments I made if they are too nit-picky.
Okay implemented most of your suggestions @compilade but excluded some style observations as it doesn't appear to be an issue. edit: Accodentally typed cebtenzzre. Intended to invoke compilade |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Merging this will cause conflicts in at least #7041, #7036, #7033, #7031, #7027, #7018, #6942, #6866 and soft conflicts with at least #7020, #6986, #6919, #6826. (soft conflicts in this case mean new linter errors, but no merge conflict)
Those involved in these PRs will probably want to take at least a cursory look at this before then (hopefully soon).
Maybe this one should be prioritized for now? |
Thanks for merging! Hopefully all is well for other PRs! |
On my Macbook I don't see the print's anymore. For example: python3 convert-hf-to-gguf-update.py should print usage, but after this PR it does not print anything. Any ideas? |
I see what's going on. By default in https://docs.python.org/3/howto/logging.html logging level is only warning and up.
To get it printing again you need to add This was missed because it was not using the typical We may want to convert it to ArgumentParser() so we can add the On a side note you may want to also |
* convert.py: add python logging instead of print() * convert.py: verbose flag takes priority over dump flag log suppression * convert.py: named instance logging * convert.py: use explicit logger id string * convert.py: convert extra print() to named logger * convert.py: sys.stderr.write --> logger.error * *.py: Convert all python scripts to use logging module * requirements.txt: remove extra line * flake8: update flake8 ignore and exclude to match ci settings * gh-actions: add flake8-no-print to flake8 lint step * pre-commit: add flake8-no-print to flake8 and also update pre-commit version * convert-hf-to-gguf.py: print() to logger conversion * *.py: logging basiconfig refactor to use conditional expression * *.py: removed commented out logging * fixup! *.py: logging basiconfig refactor to use conditional expression * constant.py: logger.error then exit should be a raise exception instead * *.py: Convert logger error and sys.exit() into a raise exception (for atypical error) * gguf-convert-endian.py: refactor convert_byteorder() to use tqdm progressbar * verify-checksum-model.py: This is the result of the program, it should be printed to stdout. * compare-llama-bench.py: add blank line for readability during missing repo response * reader.py: read_gguf_file() use print() over logging * convert.py: warning goes to stderr and won't hurt the dump output * gguf-dump.py: dump_metadata() should print to stdout * convert-hf-to-gguf.py: print --> logger.debug or ValueError() * verify-checksum-models.py: use print() for printing table * *.py: refactor logging.basicConfig() * gguf-py/gguf/*.py: use __name__ as logger name Since they will be imported and not run directly. * python-lint.yml: use .flake8 file instead * constants.py: logger no longer required * convert-hf-to-gguf.py: add additional logging * convert-hf-to-gguf.py: print() --> logger * *.py: fix flake8 warnings * revert changes to convert-hf-to-gguf.py for get_name() * convert-hf-to-gguf-update.py: use triple quoted f-string instead * *.py: accidentally corrected the wrong line * *.py: add compilade warning suggestions and style fixes
Hmm I wonder why not all f-strings have been converted to %s parameters like all logging libraries suggest for a bit lazier evaluation 😅 |
If it would impact performance in any meaningful way, sure, otherwise no. IMO f-strings are much better and less error prone. |
* convert.py: add python logging instead of print() * convert.py: verbose flag takes priority over dump flag log suppression * convert.py: named instance logging * convert.py: use explicit logger id string * convert.py: convert extra print() to named logger * convert.py: sys.stderr.write --> logger.error * *.py: Convert all python scripts to use logging module * requirements.txt: remove extra line * flake8: update flake8 ignore and exclude to match ci settings * gh-actions: add flake8-no-print to flake8 lint step * pre-commit: add flake8-no-print to flake8 and also update pre-commit version * convert-hf-to-gguf.py: print() to logger conversion * *.py: logging basiconfig refactor to use conditional expression * *.py: removed commented out logging * fixup! *.py: logging basiconfig refactor to use conditional expression * constant.py: logger.error then exit should be a raise exception instead * *.py: Convert logger error and sys.exit() into a raise exception (for atypical error) * gguf-convert-endian.py: refactor convert_byteorder() to use tqdm progressbar * verify-checksum-model.py: This is the result of the program, it should be printed to stdout. * compare-llama-bench.py: add blank line for readability during missing repo response * reader.py: read_gguf_file() use print() over logging * convert.py: warning goes to stderr and won't hurt the dump output * gguf-dump.py: dump_metadata() should print to stdout * convert-hf-to-gguf.py: print --> logger.debug or ValueError() * verify-checksum-models.py: use print() for printing table * *.py: refactor logging.basicConfig() * gguf-py/gguf/*.py: use __name__ as logger name Since they will be imported and not run directly. * python-lint.yml: use .flake8 file instead * constants.py: logger no longer required * convert-hf-to-gguf.py: add additional logging * convert-hf-to-gguf.py: print() --> logger * *.py: fix flake8 warnings * revert changes to convert-hf-to-gguf.py for get_name() * convert-hf-to-gguf-update.py: use triple quoted f-string instead * *.py: accidentally corrected the wrong line * *.py: add compilade warning suggestions and style fixes
* convert.py: add python logging instead of print() * convert.py: verbose flag takes priority over dump flag log suppression * convert.py: named instance logging * convert.py: use explicit logger id string * convert.py: convert extra print() to named logger * convert.py: sys.stderr.write --> logger.error * *.py: Convert all python scripts to use logging module * requirements.txt: remove extra line * flake8: update flake8 ignore and exclude to match ci settings * gh-actions: add flake8-no-print to flake8 lint step * pre-commit: add flake8-no-print to flake8 and also update pre-commit version * convert-hf-to-gguf.py: print() to logger conversion * *.py: logging basiconfig refactor to use conditional expression * *.py: removed commented out logging * fixup! *.py: logging basiconfig refactor to use conditional expression * constant.py: logger.error then exit should be a raise exception instead * *.py: Convert logger error and sys.exit() into a raise exception (for atypical error) * gguf-convert-endian.py: refactor convert_byteorder() to use tqdm progressbar * verify-checksum-model.py: This is the result of the program, it should be printed to stdout. * compare-llama-bench.py: add blank line for readability during missing repo response * reader.py: read_gguf_file() use print() over logging * convert.py: warning goes to stderr and won't hurt the dump output * gguf-dump.py: dump_metadata() should print to stdout * convert-hf-to-gguf.py: print --> logger.debug or ValueError() * verify-checksum-models.py: use print() for printing table * *.py: refactor logging.basicConfig() * gguf-py/gguf/*.py: use __name__ as logger name Since they will be imported and not run directly. * python-lint.yml: use .flake8 file instead * constants.py: logger no longer required * convert-hf-to-gguf.py: add additional logging * convert-hf-to-gguf.py: print() --> logger * *.py: fix flake8 warnings * revert changes to convert-hf-to-gguf.py for get_name() * convert-hf-to-gguf-update.py: use triple quoted f-string instead * *.py: accidentally corrected the wrong line * *.py: add compilade warning suggestions and style fixes
* convert.py: add python logging instead of print() * convert.py: verbose flag takes priority over dump flag log suppression * convert.py: named instance logging * convert.py: use explicit logger id string * convert.py: convert extra print() to named logger * convert.py: sys.stderr.write --> logger.error * *.py: Convert all python scripts to use logging module * requirements.txt: remove extra line * flake8: update flake8 ignore and exclude to match ci settings * gh-actions: add flake8-no-print to flake8 lint step * pre-commit: add flake8-no-print to flake8 and also update pre-commit version * convert-hf-to-gguf.py: print() to logger conversion * *.py: logging basiconfig refactor to use conditional expression * *.py: removed commented out logging * fixup! *.py: logging basiconfig refactor to use conditional expression * constant.py: logger.error then exit should be a raise exception instead * *.py: Convert logger error and sys.exit() into a raise exception (for atypical error) * gguf-convert-endian.py: refactor convert_byteorder() to use tqdm progressbar * verify-checksum-model.py: This is the result of the program, it should be printed to stdout. * compare-llama-bench.py: add blank line for readability during missing repo response * reader.py: read_gguf_file() use print() over logging * convert.py: warning goes to stderr and won't hurt the dump output * gguf-dump.py: dump_metadata() should print to stdout * convert-hf-to-gguf.py: print --> logger.debug or ValueError() * verify-checksum-models.py: use print() for printing table * *.py: refactor logging.basicConfig() * gguf-py/gguf/*.py: use __name__ as logger name Since they will be imported and not run directly. * python-lint.yml: use .flake8 file instead * constants.py: logger no longer required * convert-hf-to-gguf.py: add additional logging * convert-hf-to-gguf.py: print() --> logger * *.py: fix flake8 warnings * revert changes to convert-hf-to-gguf.py for get_name() * convert-hf-to-gguf-update.py: use triple quoted f-string instead * *.py: accidentally corrected the wrong line * *.py: add compilade warning suggestions and style fixes
This has broken |
@mofosyne PTAL - usage instructions are here: llama.cpp/scripts/compare-llama-bench.py Lines 45 to 57 in c780e75
|
Motivation of this PR is to switch from print() to python logging (battery included package for python), this is so that when I add the
--get-outfile
command to get the calculated conventional gguf filename it doesn't get polluted by erroneous print() messages.