LLM: unify memory optimization env variables. #11549

lalalapotter · 2024-07-10T02:00:27Z

Description

This PR unifies the memory optimization environment variables under IPEX_LLM_LOW_MEM so that users can enable the memory optimizations with a single setting. IPEX_LLM_LAST_LM_HEAD and IPEX_LLM_SPLIT_QKV are kept for compatibility, and so that a specific optimization can still be disabled when a situation requires it.
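For reviewers, the intended precedence can be sketched roughly as follows. This is only an illustration, not the code in convert.py: `resolve_flag` and its `env` parameter are hypothetical names, and applying the same pattern to IPEX_LLM_SPLIT_QKV is an assumption based on how the diff handles IPEX_LLM_LAST_LM_HEAD.

```python
import os


def resolve_flag(specific_var, unified_var="IPEX_LLM_LOW_MEM", default=False, env=None):
    """Hypothetical helper: the specific variable wins over the unified one.

    A variable counts as "set" if it is present at all; only the string "1"
    enables the optimization, any other value disables it.
    """
    env = os.environ if env is None else env
    for name in (specific_var, unified_var):
        value = env.get(name)
        if value is not None:
            return value == "1"
    return default


# Unified switch alone enables the optimization.
print(resolve_flag("IPEX_LLM_LAST_LM_HEAD", env={"IPEX_LLM_LOW_MEM": "1"}))  # True

# The specific variable can still turn one optimization off.
print(resolve_flag(
    "IPEX_LLM_LAST_LM_HEAD",
    env={"IPEX_LLM_LOW_MEM": "1", "IPEX_LLM_LAST_LM_HEAD": "0"},
))  # False

# Nothing set: fall back to the default.
print(resolve_flag("IPEX_LLM_SPLIT_QKV", env={}))  # False
```

Passing a plain dict as `env` keeps the sketch testable without touching the real process environment.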

hkvision · 2024-07-10T06:03:55Z

python/llm/src/ipex_llm/transformers/convert.py

+                    else:
+                        optimize_lm_head = False


Is this branch needed? Doesn't it already default to False at line 327?

qiyuangong · 2024-07-10T09:20:06Z

python/llm/src/ipex_llm/transformers/convert.py

+                    if os.environ.get("IPEX_LLM_LAST_LM_HEAD", None) is not None:
+                        optimize_lm_head = os.environ.get("IPEX_LLM_LAST_LM_HEAD", None) == "1"
+                    elif os.environ.get("IPEX_LLM_LOW_MEM", None) is not None:
+                        optimize_lm_head = os.environ.get("IPEX_LLM_LOW_MEM", None) == "1"


We should disable optimize_lm_head when using speculative or lookahead decoding.

We already have a condition that avoids impacting speculative decoding, but the magic value 10 may not be a good choice. See:

ipex-llm/python/llm/src/ipex_llm/transformers/low_bit_linear.py

Lines 343 to 346 in 51f2eff

    if shape[1] > 10:
        shape[1] = 1
        x = x[:, -1, :].view(shape)
    return x
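To make the concern about the threshold concrete, here is a rough pure-Python sketch of the same idea. It is not the code in low_bit_linear.py: nested lists stand in for the tensor, and `PREFILL_MIN_TOKENS` and `keep_last_token_only` are hypothetical names. The last case shows the failure mode I assume is at stake: a draft window longer than the threshold gets treated like prefill.

```python
# Hypothetical name for the value that is hard-coded as 10 in the snippet above.
PREFILL_MIN_TOKENS = 10


def keep_last_token_only(hidden, threshold=PREFILL_MIN_TOKENS):
    """hidden is a nested list shaped [batch][seq_len][hidden_size].

    Sequences longer than the threshold are assumed to be prefill, where only
    the final position needs logits, so earlier positions are dropped.
    Shorter sequences (normal decode, or a speculative/lookahead window) are
    returned unchanged so that every position still reaches lm_head.
    """
    seq_len = len(hidden[0])
    if seq_len > threshold:
        return [[sequence[-1]] for sequence in hidden]
    return hidden


def make_input(seq_len, hidden_size=4):
    # One batch entry; each position is filled with its own index.
    return [[[float(t)] * hidden_size for t in range(seq_len)]]


print(len(keep_last_token_only(make_input(32))[0]))  # 1: prefill keeps the last token
print(len(keep_last_token_only(make_input(5))[0]))   # 5: short draft window is untouched
print(len(keep_last_token_only(make_input(12))[0]))  # 1: a 12-token window is cut down too
```

Naming the threshold, or passing an explicit flag from the speculative/lookahead path, would make that boundary visible instead of relying on sequence length alone.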

qiyuangong

LGTM

lalalapotter · 2024-07-11T03:00:18Z

PR validation: https://github.com/intel-analytics/ipex-llm-workflow/actions/runs/9866974992

* LLM: unify memory optimization env variables.
* fix comments.

LLM: unify memory optimization env variables.

5cb86bb

lalalapotter added the llm label Jul 10, 2024

lalalapotter requested a review from hkvision July 10, 2024 02:00

lalalapotter self-assigned this Jul 10, 2024

hkvision reviewed Jul 10, 2024


qiyuangong reviewed Jul 10, 2024


fix comments.

256a4e1

qiyuangong approved these changes Jul 11, 2024


lalalapotter merged commit 70ab1a6 into intel-analytics:main Jul 11, 2024
1 check passed

RyuKosei pushed a commit to RyuKosei/ipex-llm that referenced this pull request Jul 19, 2024

LLM: unify memory optimization env variables. (intel-analytics#11549)

f110c5c

* LLM: unify memory optimization env variables.
* fix comments.
