
llama LlmInference result value is strange #461

Open
dlgktjr opened this issue Jan 10, 2025 · 4 comments

Comments

dlgktjr commented Jan 10, 2025

Description of the bug:

I converted the Llama 3.2 1B model using the example. (I hit some errors with the latest ai-edge-torch, so I converted it using the 5a93316 version.)

I then ran the LLM inference test according to the guide, but the output is strange.

python tokenizer_to_sentencepiece.py \
    --checkpoint=meta-llama/Llama-3.2-1B-Instruct \
    --output_path=llama3.spm.model

import mediapipe as mp
from mediapipe.tasks.python.genai import bundler

config = bundler.BundleConfig(
    tflite_model="/home/haseok/tmpdir/llama/llama_1b_q8_ekv1280.tflite",
    tokenizer_model="/home/haseok/tmpdir/llama/llama.spm.model",
    start_token="<|begin_of_text|>",
    stop_tokens=["<|end_of_text|>"],
    output_filename="/home/haseok/tmpdir/srllm/llama.task",
    enable_bytes_to_unicode_mapping=False,
)
bundler.create_bundle(config)
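
As a hedged side check (not part of the original steps): the converted SentencePiece model can be round-tripped on a short string to confirm the tokenizer conversion itself looks sane. The model path below assumes the --output_path from the command above.

import sentencepiece as spm

# Round-trip a short string through the converted tokenizer (illustrative only).
sp = spm.SentencePieceProcessor(model_file="llama3.spm.model")
ids = sp.encode("Hello, how are you?")
print(ids)
print(sp.decode(ids))  # should reproduce the input text, modulo normalization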

Has anyone had experience converting Llama? If so, is this result normal?

Actual vs expected behavior:

This is the Gemma 2 example:

Me : Hello
Model : Hello! How can i help you today?

And this is the Llama tflite output:

Me : Hello
Model : 
SELECT m.id AS id FROM m
  INNER JOIN t2 ON
m.id=t2.project_id
  WHERE m.category_id=m.id
....
...

Any other information you'd like to share?

No response

dlgktjr added the type:bug label Jan 10, 2025
haozha111 (Contributor) commented

The output from the Llama tflite model is strange. Have you run this script https://github.com/google-ai-edge/ai-edge-torch/blob/main/ai_edge_torch/generative/examples/llama/verify.py to validate that the re-authored model produces reasonable output?
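
A rough sketch of the kind of spot check verify.py performs (this is not the script itself; the checkpoint name is the one from this issue): forward the original Hugging Face model on a few input IDs and inspect the last-position logits, which can then be compared against the re-authored model's output.

import torch
import transformers

checkpoint = "meta-llama/Llama-3.2-1B-Instruct"
model = transformers.AutoModelForCausalLM.from_pretrained(checkpoint)

# Forward the original model on a small set of input IDs, as in the verifier log.
input_ids = torch.tensor([[1, 2, 3, 4]])
with torch.no_grad():
    logits_original = model(input_ids).logits[0, -1, :]
print(logits_original)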

dlgktjr (Author) commented Jan 15, 2025

I0115 16:15:56.134484 134808649512768 verifier.py:303] Verifying the reauthored model with input IDs: [1, 2, 3, 4]
I0115 16:15:56.139662 134808649512768 verifier.py:206] Forwarding the original model...
I0115 16:16:00.748419 134808649512768 verifier.py:209] logits_original: tensor([ 9.2015,  9.3586, 14.1001,  ..., -1.4927, -1.4921, -1.4925],
       grad_fn=<SliceBackward0>)
I0115 16:16:00.758704 134808649512768 verifier.py:211] Forwarding the reauthored model...
I0115 16:16:04.469282 134808649512768 verifier.py:214] logits_reauthored: tensor([ 9.1995,  9.3627, 14.1132,  ..., -1.4958, -1.4952, -1.4955])
E0115 16:16:04.476180 134808649512768 verifier.py:309] *** FAILED *** verify with input IDs: [1, 2, 3, 4]

It fails.

In the code (verify.py), it uses a checkpoint downloaded from the Hugging Face Hub:

  original_model = transformers.AutoModelForCausalLM.from_pretrained(checkpoint)
  # Locate the cached dir.
  cached_config_file = transformers.utils.cached_file(
      checkpoint, transformers.utils.CONFIG_NAME
  )
  reauthored_checkpoint = pathlib.Path(cached_config_file).parent

I don't know why it failed...
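
One hedged observation, not a confirmed diagnosis: the two logits tensors in the log differ only around the second or third decimal place, so whether verification passes presumably comes down to the tolerance of the element-wise comparison. An illustration with made-up tolerances:

import torch

# First few values copied from the log above; the tolerances are illustrative only.
logits_original = torch.tensor([9.2015, 9.3586, 14.1001])
logits_reauthored = torch.tensor([9.1995, 9.3627, 14.1132])
print(torch.allclose(logits_original, logits_reauthored, atol=1e-3))  # False
print(torch.allclose(logits_original, logits_reauthored, atol=5e-2))  # True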

haozha111 (Contributor) commented

Hi, could you update your ai-edge-torch nightly version with this command:

pip install ai-edge-torch-nightly

dlgktjr (Author) commented Jan 16, 2025

I have already installed the ai-edge-torch nightly version.

>>> ai_edge_torch.__version__
'0.3.0.dev20250105'
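
A hedged aside: a plain pip install will not replace an already-installed nightly build, so pulling a newer dated build typically needs the --upgrade flag, e.g. pip install --upgrade ai-edge-torch-nightly.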
