-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ChatGLM3-6B LoRA Fine-tuning Demo #11450
Conversation
python/llm/example/GPU/LLM-Finetuning/LoRA/chatglm_finetune/lora_finetune_chatglm.py
Outdated
Show resolved
Hide resolved
python/llm/example/GPU/LLM-Finetuning/LoRA/chatglm_finetune/lora_finetune_chatglm.py
Outdated
Show resolved
Hide resolved
The |
pip install "numpy<2.0.0" | ||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default | ||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ | ||
pip install oneccl_bind_pt==2.1.100 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single ARC doesn't need oneccl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is necessary, as XPU accelerator needs CCL. Without CCL, accelerator will switch to CUDA, and trainer will schedule model to CPU rather than XPU.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other, LGTM
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
# This is ported from https://github.com/THUDM/ChatGLM3/blob/main/finetune_demo/lora_finetune.ipynb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please highlight in the code which lines need to be changed for running on Arc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, as well as lora_finetune_chatglm.py
.
from ipex_llm.transformers import AutoModelForCausalLM | ||
from ipex_llm.transformers.qlora import get_peft_model | ||
import os | ||
os.environ["ACCELERATE_USE_XPU"] = "true" | ||
model = AutoModelForCausalLM.from_pretrained( | ||
model_dir, | ||
trust_remote_code=True, | ||
load_in_low_bit="bf16", | ||
optimize_model=False, | ||
empty_init=False, | ||
use_cache=False, | ||
torch_dtype=torch.bfloat16 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Make it clear that L414-426 have been modified (e.g., add begin and end of change comments)
- do we need to use
load_in_low_bit="bf16"
? - Is it possible to use
llm_patch
to minimize the changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turned to use llm_patch
.
) | ||
if peft_config.peft_type.name == "LORA": | ||
# Add below L417 to enable accelerator to schedule model to Intel Arc XPU | ||
os.environ["ACCELERATE_USE_XPU"] = "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we add this to llm_patch
? @qiyuangong @plusbang ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can use llm_patch to minimize changes, then yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, maybe we could add this to llm_patch
just in this PR.
python process_advertise_gen_dataset.py | ||
``` | ||
|
||
Then, './AdvertiseGen' will be converted to './AdvertiseGen_fix'. Now, we have prepared the dataset, and are going to start LoRA fine-tuning on ChatGLM3-6B. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
' changes to ` will high this work
* ChatGLM3-6B LoRA Fine-tuning Demo * refine * refine * add 2-card deepspeed * refine format * add mpi4py and deepspeed install
Description
Port ChatGLM repo's finetune demo to ipex-llm, running on Arc XPU.
1. Why the change?
as above
2. User API changes
will replace the old chatglm lora finetuning.
3. Summary of the change
ChatGLM3-6B LoRA Fine-tuning Demo
4. How to test?
5. New dependencies
pip install "jieba>=0.42.1"
pip install "ruamel_yaml>=0.18.6"
pip install "rouge_chinese>=1.0.3"
pip install "jupyter>=1.0.0"
pip install "typer"
pip install "nltk"