LLM: unify memory optimization env variables. #11549

Merged: 2 commits, Jul 11, 2024

lalalapotter
Contributor

Description

This PR unifies the memory optimization environment variables under IPEX_LLM_LOW_MEM so that users can enable memory optimizations with a single setting. IPEX_LLM_LAST_LM_HEAD and IPEX_LLM_SPLIT_QKV are kept for compatibility and for disabling a specific optimization in situations that require it.

@lalalapotter lalalapotter requested a review from hkvision July 10, 2024 02:00
@lalalapotter lalalapotter self-assigned this Jul 10, 2024
Comment on lines 335 to 336
else:
    optimize_lm_head = False
Contributor

Is this needed? It already defaults to False at line 327.

Contributor Author

Fixed.

if os.environ.get("IPEX_LLM_LAST_LM_HEAD", None) is not None:
    optimize_lm_head = os.environ.get("IPEX_LLM_LAST_LM_HEAD", None) == "1"
elif os.environ.get("IPEX_LLM_LOW_MEM", None) is not None:
    optimize_lm_head = os.environ.get("IPEX_LLM_LOW_MEM", None) == "1"
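
The precedence in the snippet above can be sketched as a small standalone helper: the specific variable, when set, overrides the unified one, and a default applies when neither is set. This is only an illustration; the name `resolve_flag` and its parameters are hypothetical and are not part of the PR.

```python
import os


def resolve_flag(specific_var, unified_var="IPEX_LLM_LOW_MEM", default=False, env=None):
    """Hypothetical helper: a specific env variable overrides the unified one."""
    env = os.environ if env is None else env
    # Check the specific variable first, then fall back to the unified one.
    for name in (specific_var, unified_var):
        value = env.get(name)
        if value is not None:
            # Only the literal string "1" enables the optimization.
            return value == "1"
    return default


optimize_lm_head = resolve_flag("IPEX_LLM_LAST_LM_HEAD")
```

Under this ordering, setting IPEX_LLM_LOW_MEM=1 together with IPEX_LLM_LAST_LM_HEAD=0 would leave the lm_head optimization off, which matches the description's goal of disabling one specific optimization.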
Contributor

We should disable optimize_lm_head when using speculative decoding or lookahead.

Contributor

We already have a condition that avoids impacting speculative decoding, but the magic value 10 may not be a good choice. See:

if shape[1] > 10:
    shape[1] = 1
    x = x[:, -1, :].view(shape)
return x
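
One way to address the magic-value concern is to give the threshold a name and pass it as a parameter. The sketch below uses nested Python lists in place of tensors so that it runs without dependencies. The names `keep_last_position` and `max_decode_len` are hypothetical, and the default of 10 is kept only because it is the value in the quoted code. The reading of "long input" as a prefill prompt is an assumption.

```python
def keep_last_position(x, max_decode_len=10):
    """x is a [batch][seq_len][hidden] nested list standing in for a tensor."""
    seq_len = len(x[0])
    if seq_len > max_decode_len:
        # Long input (assumed to be a prefill prompt):
        # keep only the last position for the lm_head.
        return [seq[-1:] for seq in x]
    # Short input (for example, a few draft tokens being verified):
    # keep every position untouched.
    return x
```

Exposing the threshold makes it possible to tune it for setups where the number of draft tokens could exceed the hard-coded value.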

Contributor

@qiyuangong qiyuangong left a comment

LGTM

@lalalapotter lalalapotter merged commit 70ab1a6 into intel-analytics:main Jul 11, 2024
1 check passed
RyuKosei pushed a commit to RyuKosei/ipex-llm that referenced this pull request Jul 19, 2024
* LLM: unify memory optimization env variables.

* fix comments.

3 participants