Meta-Llama-3 model text-generation example output is unexpected on 2 nodes #1451

aslanxie · 2024-10-23T02:30:43Z

System Info

deepspeed                 0.14.4+hpu.synapse.v1.18.0
optimum-habana            1.14.0

docker image: vault.habana.ai/gaudi-docker/1.18.0/ubuntu22.04/habanalabs/pytorch-installer-2.4.0:latest

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

Set up 2 nodes for the test.
Run the text-generation example:
python3 ../gaudi_spawn.py --hostfile hostfile --use_deepspeed --world_size 16 --master_port 29500 \
    run_generation.py \
    --model_name_or_path /data1/zhixue/Llama-3.1-70B-Instruct/ \
    --bf16 \
    --batch_size 1 \
    --use_hpu_graphs --limit_hpu_graphs \
    --max_new_tokens 512
The generation output looks like this:
10.233.108.205: Input/outputs: 10.233.108.205: input 1: ('DeepSpeed is a machine learning framework',) 10.233.108.205: output 1: ('DeepSpeed is a machine learning framework!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!',)
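
When a multi-node launch misbehaves, one quick sanity check is that the hostfile actually provides as many device slots as the `--world_size` passed on the command line. The sketch below is only an illustration: it assumes the DeepSpeed-style `hostname slots=N` hostfile format, and the host names and the 8 slots per node are hypothetical, since the issue does not show the hostfile contents.

```python
def total_slots(hostfile_text: str) -> int:
    """Sum the `slots=N` values in a DeepSpeed-style hostfile."""
    total = 0
    for raw in hostfile_text.splitlines():
        # Drop trailing comments and skip blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        slots = [f for f in fields[1:] if f.startswith("slots=")]
        if len(slots) != 1:
            raise ValueError(f"expected exactly one slots=N entry in: {raw!r}")
        total += int(slots[0].split("=", 1)[1])
    return total


# Hypothetical two-node hostfile; host names and slot counts are assumptions.
EXAMPLE_HOSTFILE = """\
node-a slots=8
node-b slots=8
"""

WORLD_SIZE = 16  # the value passed to --world_size in the reproduction command

if total_slots(EXAMPLE_HOSTFILE) != WORLD_SIZE:
    raise SystemExit("hostfile slots do not match --world_size")
print("hostfile slots match --world_size")
```

A mismatch here would not by itself explain the repeated `!` output, but ruling it out narrows the search to the model and runtime stack.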

Expected behavior

When tested with Llama-2-7b-hf, the output is as shown below. I found the issue with the latest Meta-Llama-3 and Meta-Llama-3.1 models when running inference on 2 nodes.
10.233.108.205: input 1: ('DeepSpeed is a machine learning framework',) 10.233.108.205: output 1: ('DeepSpeed is a machine learning framework for deep learning. It is designed to be fast and efficient, while also being easy to use. DeepSpeed is based on the TensorFlow framework, and it uses the TensorFlow Lite library to run on mobile devices.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. 
DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is',)
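
To compare runs across models or node counts without reading every log line, the difference between the two outputs above can be checked mechanically. The following is a minimal sketch, not part of the example scripts: the helper names and the `min_run` threshold of 32 are arbitrary assumptions. It flags a generation whose continuation contains a long run of one repeated character, as in the `!` output reported for 2 nodes.

```python
from itertools import groupby


def longest_char_run(text: str) -> tuple[str, int]:
    """Return the character with the longest consecutive run and the run length."""
    best_char, best_len = "", 0
    for char, group in groupby(text):
        length = sum(1 for _ in group)
        if length > best_len:
            best_char, best_len = char, length
    return best_char, best_len


def looks_degenerate(prompt: str, output: str, min_run: int = 32) -> bool:
    """Heuristic: True if the generated continuation repeats one character min_run+ times."""
    # Strip the echoed prompt so only newly generated text is inspected.
    continuation = output[len(prompt):] if output.startswith(prompt) else output
    _, run = longest_char_run(continuation)
    return run >= min_run


prompt = "DeepSpeed is a machine learning framework"
bad_output = prompt + "!" * 500  # stand-in for the 2-node output
good_output = prompt + " for deep learning. It is designed to be fast and efficient."

print(looks_degenerate(prompt, bad_output))   # flagged
print(looks_degenerate(prompt, good_output))  # not flagged
```

This only catches single-character collapse; the sentence-level repetition in the Llama-2-7b-hf output is ordinary greedy-decoding behavior and is intentionally not flagged.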


aslanxie · 2024-10-23T02:38:23Z

No problem on a single node.

aslanxie · 2024-10-23T02:39:31Z

There is no problem on a single node.

regisss · 2024-10-23T08:57:28Z

So you see this issue only on 2 nodes, right?

aslanxie · 2024-10-23T09:34:00Z

Yes, it's only on 2 nodes with Llama-2-70b-hf or Llama-3.1-70B-Instruct.

regisss · 2024-11-26T18:21:55Z

@aslanxie Are you still seeing this issue after installing Optimum Habana's main branch from source?

aslanxie added the bug label Oct 23, 2024

aslanxie closed this as completed Oct 23, 2024

aslanxie reopened this Oct 23, 2024
