GPU acceleration failed #79

Open
doubtfire009 opened this issue Apr 6, 2024 · 1 comment

doubtfire009 commented Apr 6, 2024

I am using the code from here:
https://github.com/intel-analytics/ipex-llm-tutorial/blob/original-bigdl-llm/Chinese_Version/ch_6_GPU_Acceleration/6_1_GPU_Llama2-7B.md

But it fails with the error below. Can you help with this?
Thanks.

```python
from bigdl.llm.transformers import AutoModelForCausalLM, AutoModel
from transformers import LlamaTokenizer, AutoTokenizer

chatglm3_6b = 'D:/AI_projects/Langchain-Chatchat/llm_model/THUDM/chatglm2-6b'

model_in_4bit = AutoModel.from_pretrained(pretrained_model_name_or_path=chatglm3_6b,
                                          load_in_4bit=True,
                                          optimize_model=False)
model_in_4bit_gpu = model_in_4bit.to('xpu')

# Note: AutoModelForCausalLM here is imported from bigdl.llm.transformers
model_in_8bit = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=chatglm3_6b,
    load_in_low_bit="sym_int8",
    optimize_model=False
)
model_in_8bit_gpu = model_in_8bit.to('xpu')

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=chatglm3_6b)
```

The error shows:

(llm_310_whl) D:\AI_projects\ipex-samples>python main-test.py
C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
2024-04-07 00:20:03,696 - INFO - intel_extension_for_pytorch auto imported
Traceback (most recent call last):
  File "D:\AI_projects\ipex-samples\main-test.py", line 6, in <module>
    model_in_4bit = AutoModel.from_pretrained(pretrained_model_name_or_path=chatglm3_6b,
  File "C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\bigdl\llm\transformers\model.py", line 320, in from_pretrained
    model = cls.load_convert(q_k, optimize_model, *args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\bigdl\llm\transformers\model.py", line 434, in load_convert
    model = cls.HF_Model.from_pretrained(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\transformers\models\auto\auto_factory.py", line 461, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\transformers\models\auto\configuration_auto.py", line 986, in from_pretrained
    trust_remote_code = resolve_trust_remote_code(
  File "C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\transformers\dynamic_module_utils.py", line 535, in resolve_trust_remote_code
    signal.signal(signal.SIGALRM, _raise_timeout_error)
AttributeError: module 'signal' has no attribute 'SIGALRM'. Did you mean: 'SIGABRT'?
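
The last frame of the traceback points at `signal.SIGALRM`, which Python's `signal` module only exposes on Unix-like systems, so any code path that relies on it fails on Windows with exactly this `AttributeError`. A minimal sketch of how to check this on a given machine (the helper name `can_prompt_with_timeout` is made up for illustration and is not part of transformers):

```python
import signal
import sys


def can_prompt_with_timeout() -> bool:
    """Return True if this platform exposes SIGALRM (Unix-only in CPython)."""
    return hasattr(signal, "SIGALRM")


if __name__ == "__main__":
    # On Windows this is expected to report False, matching the error above.
    print(f"platform={sys.platform}, SIGALRM available: {can_prompt_with_timeout()}")
```

If this reports `False`, the alarm-based code path cannot run on that machine and has to be avoided rather than retried.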

And the pip list is:
accelerate 0.21.0
aiohttp 3.9.3
aiosignal 1.3.1
altair 4.2.2
annotated-types 0.6.0
astor 0.8.1
asttokens 2.4.1
async-timeout 4.0.3
attrs 23.2.0
bigdl-core-xe-21 2.5.0b20240324
bigdl-llm 2.5.0b20240406
blinker 1.7.0
cachetools 5.3.3
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
contourpy 1.2.1
cryptography 42.0.5
cycler 0.12.1
dataclasses-json 0.5.14
decorator 5.1.1
entrypoints 0.4
exceptiongroup 1.2.0
executing 2.0.1
faiss-cpu 1.8.0
filelock 3.13.3
fonttools 4.51.0
frozenlist 1.4.1
fsspec 2024.3.1
gitdb 4.0.11
GitPython 3.1.43
google-ai-generativelanguage 0.2.0
google-api-core 2.18.0
google-auth 2.29.0
google-generativeai 0.1.0
googleapis-common-protos 1.63.0
greenlet 3.0.3
grpcio 1.62.1
grpcio-status 1.48.2
huggingface-hub 0.22.2
idna 3.6
importlib_metadata 7.1.0
intel-extension-for-pytorch 2.1.10+xpu
intel-openmp 2024.1.0
ipython 8.23.0
jedi 0.19.1
Jinja2 3.1.3
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
langchain 0.0.180
markdown-it-py 3.0.0
MarkupSafe 2.1.5
marshmallow 3.21.1
matplotlib 3.8.4
matplotlib-inline 0.1.6
mdurl 0.1.2
mpmath 1.3.0
multidict 6.0.5
mypy-extensions 1.0.0
networkx 3.3
numexpr 2.10.0
numpy 1.26.4
openai 0.27.7
openapi-schema-pydantic 1.2.4
packaging 24.0
pandas 2.2.1
pandasai 0.2.15
parso 0.8.4
pdfminer.six 20231228
pdfplumber 0.11.0
pillow 10.3.0
pip 23.3.1
prompt-toolkit 3.0.43
proto-plus 1.23.0
protobuf 3.20.3
psutil 5.9.8
pure-eval 0.2.2
py-cpuinfo 9.0.0
pyarrow 15.0.2
pyasn1 0.6.0
pyasn1_modules 0.4.0
pycparser 2.22
pydantic 1.10.15
pydantic_core 2.16.3
pydeck 0.8.1b0
Pygments 2.17.2
Pympler 1.0.1
pyparsing 3.1.2
pypdf 3.9.0
pypdfium2 4.28.0
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
pytz 2024.1
PyYAML 6.0.1
referencing 0.34.0
regex 2023.12.25
requests 2.31.0
rich 13.7.1
rpds-py 0.18.0
rsa 4.9
safetensors 0.4.2
sentencepiece 0.2.0
setuptools 68.2.2
six 1.16.0
smmap 5.0.1
SQLAlchemy 2.0.29
stack-data 0.6.3
streamlit 1.22.0
streamlit-chat 0.0.2.2
sympy 1.12
tabulate 0.9.0
tenacity 8.2.3
tiktoken 0.4.0
tokenizers 0.13.3
toml 0.10.2
toolz 0.12.1
torch 2.1.0a0+cxx11.abi
torchaudio 2.1.0a0+cxx11.abi
torchvision 0.16.0a0+cxx11.abi
tornado 6.4
tqdm 4.66.2
traitlets 5.14.2
transformers 4.31.0
typing_extensions 4.11.0
typing-inspect 0.9.0
tzdata 2024.1
tzlocal 5.2
urllib3 2.2.1
validators 0.28.0
watchdog 4.0.0
wcwidth 0.2.13
wheel 0.41.2
yarl 1.9.4
youtube-transcript-api 0.6.0
zipp 3.18.1

Loading with sym_int8 also fails.
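
For a report like this one, the few packages that matter to the error can be printed directly instead of pasting the whole `pip list`. A minimal sketch using the standard library's `importlib.metadata`; the package selection is an assumption, and `ipex-llm` in particular does not appear in the list above and is only included as a guess at what might be relevant:

```python
from importlib import metadata

# Assumed selection of packages relevant to this issue; adjust as needed.
PACKAGES = (
    "bigdl-llm",
    "ipex-llm",
    "transformers",
    "torch",
    "intel-extension-for-pytorch",
)


def report_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```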

Contributor

Oscilloscope98 commented Apr 7, 2024

Hi @doubtfire009,

For chatglm3-6b, you could refer to https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm3 regarding how to run it on Intel GPU with IPEX-LLM optimizations.

You could load it with `ipex_llm.transformers.AutoModel` as follows:

```python
from ipex_llm.transformers import AutoModel

model = AutoModel.from_pretrained(model_path,
                                  load_in_4bit=True,
                                  optimize_model=True,
                                  trust_remote_code=True,
                                  use_cache=True)
model = model.to('xpu')
```

It seems that you are missing trust_remote_code=True.
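
The traceback is consistent with this: without `trust_remote_code`, the call ends up in transformers' `resolve_trust_remote_code`, which tries to install a `SIGALRM` handler, and that signal does not exist on Windows. Passing `trust_remote_code=True` explicitly should keep the loader off that path. A rough sketch covering both the 4-bit and the `sym_int8` variants from the original post; `build_load_kwargs` and `load_on_xpu` are hypothetical helper names, and the `ipex_llm` import assumes that package is installed (with `bigdl-llm`, the import path from the original post would be `bigdl.llm.transformers`):

```python
def build_load_kwargs(model_path, low_bit=None):
    """Collect from_pretrained keyword arguments for a ChatGLM-style model.

    low_bit=None selects 4-bit loading; otherwise the value (for example
    "sym_int8") is passed through as load_in_low_bit.
    """
    kwargs = {
        "pretrained_model_name_or_path": model_path,
        "optimize_model": True,
        "trust_remote_code": True,  # the argument missing from the original script
        "use_cache": True,
    }
    if low_bit is None:
        kwargs["load_in_4bit"] = True
    else:
        kwargs["load_in_low_bit"] = low_bit
    return kwargs


def load_on_xpu(model_path, low_bit=None):
    """Load the model and move it to the Intel GPU ('xpu') device."""
    # Imported lazily so the helper above can be used without the package.
    from ipex_llm.transformers import AutoModel

    model = AutoModel.from_pretrained(**build_load_kwargs(model_path, low_bit))
    return model.to("xpu")
```

For example, `load_on_xpu(chatglm3_6b)` would correspond to the 4-bit case and `load_on_xpu(chatglm3_6b, low_bit="sym_int8")` to the 8-bit one. The tokenizer for this model family also ships as remote code, so `AutoTokenizer.from_pretrained` will likely need `trust_remote_code=True` as well.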

You could also refer to here for more information on installing IPEX-LLM on Intel GPUs.

Please let us know if you run into any further problems :)
