
[Ecosystem] enable saving and loading FP8 model (#53) #1683

Open · wants to merge 16 commits into base: main

Conversation

xin3he (Contributor) commented on Jan 8, 2025

What does this PR do?

Fixes # (issue)
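
For context, the sketch below outlines what an FP8 save/load round trip can look like, assuming Intel Neural Compressor's PyTorch quantization API (`FP8Config`, `convert`, `save`, `load` in `neural_compressor.torch.quantization`); the model name, checkpoint path, and `format`/`device` arguments are illustrative assumptions, not code taken from this PR.

```python
# Sketch of an FP8 save/load round trip with Intel Neural Compressor on Gaudi.
# The model name, checkpoint path, and format/device arguments are assumptions.
from transformers import AutoModelForCausalLM
from neural_compressor.torch.quantization import FP8Config, convert, save, load

# Quantize a model to FP8 (the measurement/calibration step a real run needs
# before maxabs quantization is omitted from this sketch).
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
config = FP8Config()  # defaults; real runs usually load a JSON quantization config
model = convert(model, config)

# Save the FP8 weights and their scales so later runs can skip re-quantization.
save(model, "fp8_checkpoint", format="huggingface")

# Load the FP8 checkpoint directly onto HPU without repeating quantization.
model = load("fp8_checkpoint", format="huggingface", device="hpu")
```

The point of the round trip is that a later run loads already-quantized FP8 weights and scales instead of repeating measurement and conversion.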

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

xin3he requested a review from @regisss as a code owner on January 8, 2025, 02:01
yafshar (Contributor) commented on Jan 8, 2025

@xin3he Could you please remove the software ticket ID and 'OHF' from the title? This PR is for OH.

yafshar (Contributor) commented on Jan 9, 2025

@xin3he Could you please address the comments? Everything else looks good to me!

xin3he changed the title from "[SW-211858] [Ecosystem] enable saving and loading FP8 model in OHF (#53)" to "[Ecosystem] enable saving and loading FP8 model (#53)" on Jan 13, 2025
xin3he (Contributor, Author) commented on Jan 13, 2025

Sure, thank you @yafshar. Sorry for the delayed response.

yafshar (Contributor) left a comment

LGTM!

Hi @regisss, this PR is ready for your final review. Could you please take a look?

xin3he (Contributor, Author) commented on Jan 17, 2025

A reminder of the TODOs:

  1. We need to add multi-card saving and loading after the fix "Support pure meta model lm_head tp" (deepspeedai/DeepSpeed#6812) is merged into the Habana software stack.
  2. We will remove maxabs_quant_const_scales.json after https://github.com/habana-internal/neural-compressor-fork/pull/6 is merged into the Habana software stack.

This may happen in v1.20.0.

Resolved review threads: examples/text-generation/run_generation.py (outdated), examples/text-generation/README.md (three threads, two outdated).
xin3he (Contributor, Author) commented on Feb 7, 2025

A follow-up on the TODOs above: deepspeedai/DeepSpeed#6812 ("Support pure meta model lm_head tp") and habana-internal/neural-compressor-fork#6 ("[SW-205970] update state_dict to save scalar scales") are both merged.

@regisss, since the PRs mentioned above are all merged, I have updated this PR for v1.20.0.
