expose scan_cache table generation to python #2437
Allows for the functionality to be used with servers/gradio projects:

```python
from huggingface_hub import scan_cache_dir
from huggingface_hub.commands.scan_cache import get_table

hf_cache_info = scan_cache_dir()
table = get_table(0, hf_cache_info)
print(table)
```
Thanks @rsxdalv, makes sense to me. The changes look good to me. I left a minor comment regarding the helper signature.
In addition to this, could you also add
### scan_cache.get_table
[[autodoc]] huggingface_hub.commands.scan_cache.get_table
to the cache.md package reference? This would add it to the official documentation under https://huggingface.co/docs/huggingface_hub/package_reference/cache. You must also add a quick docstring to `get_table` to explain what it does, its inputs, and an example. You can take inspiration from this docstring for example. Thanks a lot in advance!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Co-authored-by: Lucain <[email protected]>
Thanks for the information! I added a docstring and the reference to en/package_reference/cache.md. I was not sure about ko/package_reference/cache.md, so I did not add it there.
Thanks for the changes @rsxdalv! I left minor comments mostly related to how the doc builder works.
Regarding the `ko` documentation, could you add
### scan_cache.get_table[[huggingface_hub.commands.scan_cache.get_table]]
[[autodoc]] huggingface_hub.commands.scan_cache.get_table
to it? (good catch, I forgot about it^^)
Other than that, we should be good to merge!
Thanks, I accepted all the changes and added the ko/cache.md reference.
Great! Thank you @rsxdalv, I think we are good to go now :) Let's wait for the CI to complete and then I'll merge.
EDIT: code quality seems to be complaining. To fix this, you must run `make style` locally, which will fix the issues. Then you can run `make quality` to check everything's good. Finally, you can commit and push the changes.
Hi @rsxdalv, I've had a look at the CI issues. It was quite annoying as the docs were failing because
Code example is now:
>>> from huggingface_hub.utils import scan_cache_dir
>>> hf_cache_info = scan_cache_dir()
HFCacheInfo(...)
>>> print(hf_cache_info.export_as_table())
REPO ID                                             REPO TYPE SIZE ON DISK NB FILES LAST_ACCESSED LAST_MODIFIED REFS LOCAL PATH
--------------------------------------------------- --------- ------------ -------- ------------- ------------- ---- --------------------------------------------------------------------------------------------------
roberta-base                                        model             2.7M        5 1 day ago     1 week ago    main C:\\Users\\admin\\.cache\\huggingface\\hub\\models--roberta-base
suno/bark                                           model             8.8K        1 1 week ago    1 week ago    main C:\\Users\\admin\\.cache\\huggingface\\hub\\models--suno--bark
t5-base                                             model           893.8M        4 4 days ago    7 months ago  main C:\\Users\\admin\\.cache\\huggingface\\hub\\models--t5-base
t5-large                                            model             3.0G        4 5 weeks ago   5 months ago  main C:\\Users\\admin\\.cache\\huggingface\\hub\\models--t5-large
Sorry about pushing changes to your branch, I hope that's fine for you. Once the CI is green, I'll merge it! 😄
EDIT: it's (finally) green! 🎉
Thank you for carrying this code change to its inclusion! I am currently developing a module for managing the HF cache within my project. Once it is clearer what works and what does not, I hope to make another PR. One problem that I already know will need to be solved is distinguishing the different revisions from files. If a user sees that by deleting revision 3fd77fe... they will reclaim 10 GB, but those files are actually shared amongst 3 revisions that 'lock' them in, they will be confused. So for now the only solution is to educate users about what revisions are and how 'results may vary' when deleting them.
Yes, I see what you mean. In the
Actually that does help, thanks for letting me know.
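To make the shared-files concern raised above concrete, here is a minimal sketch in plain Python. It does not use `huggingface_hub` at all: the revision names, blob names, sizes, and the helpers `revision_size` and `freed_size` are all hypothetical, chosen only to illustrate why the size shown for a revision can differ from the space actually reclaimed by deleting it.

```python
from typing import Dict, Set

# Hypothetical cache contents (not read from a real HF cache):
# blob name -> size in bytes.
BLOB_SIZES: Dict[str, int] = {
    "weights.bin": 10_000_000_000,
    "config.json": 1_000,
    "tokenizer.json": 2_000_000,
}

# Hypothetical revisions: revision name -> blobs it references.
# "weights.bin" is shared by all three revisions.
REVISIONS: Dict[str, Set[str]] = {
    "rev_a": {"weights.bin", "config.json"},
    "rev_b": {"weights.bin", "config.json", "tokenizer.json"},
    "rev_c": {"weights.bin"},
}


def revision_size(rev: str) -> int:
    """Total size of every blob the revision references (the 'apparent' size)."""
    return sum(BLOB_SIZES[blob] for blob in REVISIONS[rev])


def freed_size(to_delete: Set[str]) -> int:
    """Bytes reclaimed by deleting `to_delete`: only blobs that no
    remaining revision still references are counted."""
    kept_blobs = set().union(
        *(blobs for rev, blobs in REVISIONS.items() if rev not in to_delete)
    )
    deleted_blobs = set().union(*(REVISIONS[rev] for rev in to_delete))
    return sum(BLOB_SIZES[blob] for blob in deleted_blobs - kept_blobs)


if __name__ == "__main__":
    print(revision_size("rev_a"))  # apparent size of rev_a
    print(freed_size({"rev_a"}))   # space actually reclaimed
```

In this toy data, `rev_a` looks like roughly 10 GB, yet deleting it alone frees nothing because both of its blobs are still referenced by `rev_b`. A UI built on the cache table could show both numbers side by side to avoid the confusion described above.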