Unable to execute GPT-2 onnx model #783

Open
somasundaram1702 opened this issue Jul 18, 2024 · 6 comments

somasundaram1702 commented Jul 18, 2024

Hello Team,

I am trying to execute the GPT-2 model (link given below) on a Mali G710 GPU. During execution I get the following error:

./ExecuteNetwork -c GpuAcc -f onnx-binary -d /mnt/dropbox/MobileNetV2/llm.txt
-m /mnt/dropbox/LLM/gpt2-10.onnx -i input1 -s 1,4,16
Warning: DEPRECATED: The program option 'input-name' is deprecated and will be removed soon. The input-names are now automatically set.
Warning: DEPRECATED: The program option 'model-format' is deprecated and will be removed soon. The model-format is now automatically set.
Info: ArmNN v33.1.0
Info: Initialization time: 298.10 ms.
Fatal: Datatype INT64 is not valid for tensor 'input1' of node 'Reshape_11', not in {onnx::TensorProto::FLOAT}. at function ParseReshape [/devenv/armnn/src/armnnOnnxParser/OnnxParser.cpp:2319]
Info: Shutdown time: 129.43 ms.

model link: https://github.com/onnx/models/blob/main/validated/text/machine_comprehension/gpt-2/model/gpt2-10.onnx

@FrancisMurtagh-arm: I tried passing both int and float values as input, but it did not help. Can you please suggest a fix?

Colm-in-Arm (Collaborator) commented:

Hi,

The fatal error message indicates that the model contains INT64 tensors, a data type our ONNX parser does not support. The parser is very outdated and has been marked for future deprecation, so unless you're willing to contribute the work yourself, I'm afraid this model won't work.
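As an aside, a common workaround when an ONNX consumer lacks INT64 support (it would not rescue this particular parser, which expects FLOAT for that tensor, but it applies to other toolchains) is to narrow INT64 initializers such as Reshape shape tensors down to INT32, which is safe whenever the values fit in the 32-bit range. A minimal stdlib-Python sketch of that safety check, with a hypothetical helper name:

```python
# Hypothetical helper: verify that INT64 tensor values (e.g. an ONNX
# Reshape shape initializer) can be narrowed to INT32 without overflow.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def safe_to_narrow(values):
    """Return True if every value fits in a signed 32-bit integer."""
    return all(INT32_MIN <= v <= INT32_MAX for v in values)

# A GPT-2-sized reshape target is far below the INT32 limit:
print(safe_to_narrow([1, 4, 16, 64]))  # True
# A value beyond 2**31 - 1 is not:
print(safe_to_narrow([1, 2**40]))      # False
```

Tools that apply this transform also have to rewrite the consuming nodes to accept the narrowed type, so it is a model-surgery step, not a flag.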

Colm.


somasundaram1702 commented Jul 22, 2024

@Colm-in-Arm: Does TfLite support INT64 types, or do you recommend any other parser? My objective is to run inference with any LLM on a Mali G710 GPU using ExecuteNetwork. Is there any successful use case? If so, could you kindly direct me to a working model?

Colm-in-Arm (Collaborator) commented:

Hi,

The TfLite runtime does support INT64 in some limited cases. I don't know of other ONNX runtimes you could use.

In Arm NN we have not done any work on LLMs. The work I have seen tends to target the CPU rather than the GPU. LLMs tend to be memory-bound rather than compute-bound, so there is less potential for performance gains from using a GPU.
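The memory-bound point can be made concrete with a back-of-envelope bound: each generated token must stream essentially all of the weights through memory, so memory bandwidth alone caps the token rate regardless of compute. A sketch with assumed numbers (the 20 GB/s bandwidth is a hypothetical SoC figure; 248 MB is the fp16 GPT-2 file size reported later in this thread):

```python
weights_bytes = 248e6  # fp16 GPT-2 small model file, ~248 MB
bandwidth = 20e9       # assumed DRAM bandwidth in bytes/s (hypothetical SoC)

# Lower bound on per-token decode latency: the time to read the weights once.
min_ms_per_token = weights_bytes / bandwidth * 1e3
print(f"{min_ms_per_token:.1f} ms/token lower bound")  # 12.4 ms/token lower bound
```

Since both CPU and GPU sit behind the same DRAM, a faster compute engine moves this bound very little, which is why LLM decode work tends to target CPU.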

Colm.

somasundaram1702 (Author) commented:

@Colm-in-Arm: I'd like to inform you that I was able to successfully execute a GPT-2 TfLite model on the Mali G710 GPU. The "gpt2-64-fp16.tflite" model worked.

Now Arm can add LLMs to its portfolio :)

Colm-in-Arm (Collaborator) commented:

Wow! Well done.

Can you outline the steps needed to make the model small enough to push through Arm NN? Did you use ExecuteNetwork or your own application? I presume some layers were handled by the TfLite runtime? What kind of inference times were you getting? And how about CpuAcc, did you try it?

Colm.


somasundaram1702 commented Aug 2, 2024

@Colm-in-Arm: I haven't reduced the size of the model; the "gpt2-64-fp16.tflite" file is 248 MB as-is. I'd like to know why we would need to reduce the model size. Yes, I used ExecuteNetwork with both the CpuAcc and GpuAcc backends: CpuAcc took about 25 minutes and GpuAcc about 2 hours 30 minutes.
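The 248 MB figure lines up with GPT-2 small's roughly 124 million parameters stored at 2 bytes each in fp16, which is presumably why no further size reduction was needed. A quick sanity check:

```python
params = 124_000_000        # approximate parameter count of GPT-2 small
bytes_per_param = 2         # fp16 storage

size_mb = params * bytes_per_param / 1e6
print(f"{size_mb:.0f} MB")  # 248 MB
```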

Note: I am executing the model on a hybrid emulated platform (using ZeBu), where the CPU is on the virtual side and the GPU runs on the RTL side, so it is not straightforward to compare the execution times.

Some layers were handled by the TfLite runtime? How do we verify this? Do you mean that a few of the unsupported operations are handled by the TfLite runtime?

Also, I would like to check the GPU core, memory, and power consumption. Are there any commands I can run from the Linux terminal to check these?
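On the monitoring question, one place to look on a Linux kernel is the devfreq sysfs interface, which many Mali driver stacks hook into for GPU frequency, plus (on some DDK builds, with root) a debugfs memory node. The exact paths vary by kernel and driver, so treat the ones below as illustrative assumptions, not definitive:

```shell
#!/bin/sh
# Probe common (driver-dependent) sysfs/debugfs locations for Mali GPU
# clock-frequency and memory statistics. All paths here are assumptions:
# the exact nodes depend on the kernel and the Mali DDK in use.
found=0
for f in /sys/class/devfreq/*/cur_freq \
         /sys/kernel/debug/mali0/gpu_memory; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$f" "$(head -n 1 "$f")"
        found=$((found + 1))
    fi
done
if [ "$found" -eq 0 ]; then
    echo "no Mali devfreq/debugfs nodes found on this system"
fi
```

Power consumption usually cannot be read from a generic sysfs node at all; it typically needs board-level instrumentation or an SoC-specific PMIC/energy-counter driver, which an emulated ZeBu platform is unlikely to model.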

@Colm-in-Arm: Awaiting your response.
