I'm updating aub.ai to make use of the latest version of llama.cpp. This update deprecates the `llama_eval` function in favor of `llama_decode`, which requires the use of `llama_batch`. While I've known about this upcoming change for a while, I hadn't had the time to migrate away from it yet.
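For anyone following along, here's a minimal sketch of what the change looks like at the C API level. This is not aub.ai's actual code, and `llama_batch_get_one`'s exact signature has shifted between llama.cpp versions, so treat it as an illustration only:

```cpp
#include "llama.h"

// Sketch: evaluate n_tokens prompt tokens starting at position n_past.
static bool eval_tokens(llama_context * ctx, llama_token * tokens,
                        int n_tokens, int n_past) {
    // Before: llama_eval(ctx, tokens, n_tokens, n_past);

    // After: wrap the same tokens in a llama_batch. llama_batch_get_one()
    // is the single-sequence convenience helper; check the llama.h you
    // actually build against for its current signature.
    llama_batch batch = llama_batch_get_one(tokens, n_tokens, n_past, 0);

    // llama_decode returns 0 on success; non-zero usually means the
    // KV cache ran out of space or the batch was malformed.
    return llama_decode(ctx, batch) == 0;
}
```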
This issue has my highest priority; please be patient while I work out some technical difficulties.
At the time of writing, it's the start of the evening here on a Sunday. I will continue development, but I honestly don't think I can finish the migration, compile and test for each platform, and package this up for a release on pub.dev (as the `aub_ai` package), nor as an app (e.g. on TestFlight), all in one go. Each step is doable, but together they always take quite some time without a proper CI/CD setup (sorry, that comes later!). Please bear with me while I go through these steps; you can follow some of this work in this branch: https://github.com/BrutalCoding/aub.ai/tree/feature/sync-with-latest-llamacpp
Challenges:
- I've updated my code to use `llama_decode` and `llama_batch`, but the AI model is now outputting strange Unicode characters. That points to an incorrect implementation on my side (see the sketch below).
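One classic cause of mojibake in this area, which may or may not be what's happening here: a single token's piece can be a partial UTF-8 sequence, so converting and rendering tokens one at a time splits multi-byte characters. A sketch of the accumulate-then-decode pattern built on `llama_token_to_piece` (signature as in llama.cpp of this era; later versions added extra parameters):

```cpp
#include "llama.h"
#include <string>

// Sketch: convert one token to its raw bytes. A piece can be an incomplete
// UTF-8 sequence (e.g. one byte of a multi-byte emoji), so callers should
// append these bytes to a buffer and only interpret that buffer as UTF-8
// text once it is complete.
static std::string token_to_bytes(const llama_model * model, llama_token token) {
    std::string piece(8, '\0');
    int n = llama_token_to_piece(model, token, piece.data(), (int) piece.size());
    if (n < 0) { // negative return: buffer too small, -n is the required size
        piece.resize(-n);
        n = llama_token_to_piece(model, token, piece.data(), (int) piece.size());
    }
    piece.resize(n);
    return piece; // raw bytes, not necessarily valid UTF-8 on their own
}
```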
Tasks:
- Review example code that uses `llama_decode` and `llama_batch` in the llama.cpp repository or related projects.
- Carefully analyze the differences between how I used `llama_eval` previously and the expected input/output structures for `llama_decode`.
- Debug and adjust my code to ensure correct tokenization, batching, and handling of model output (a rough reference loop follows this list).
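For that last task, this is roughly how the llama.cpp examples of this era structure the decode loop. It's an illustrative sketch (greedy sampling, single sequence, hard-coded batch size), not aub.ai's implementation:

```cpp
#include "llama.h"
#include <vector>

// Sketch: decode a tokenized prompt, then generate up to n_predict tokens greedily.
static void generate(llama_context * ctx, const llama_model * model,
                     const std::vector<llama_token> & prompt, int n_predict) {
    llama_batch batch = llama_batch_init(512, 0, 1);

    // Queue the whole prompt as one batch; request logits for the last token only.
    for (int i = 0; i < (int) prompt.size(); i++) {
        batch.token[i]     = prompt[i];
        batch.pos[i]       = i;
        batch.n_seq_id[i]  = 1;
        batch.seq_id[i][0] = 0;
        batch.logits[i]    = (i == (int) prompt.size() - 1);
    }
    batch.n_tokens = (int) prompt.size();

    int n_cur = batch.n_tokens;
    for (int i = 0; i < n_predict; i++) {
        if (llama_decode(ctx, batch) != 0) break;

        // Greedy sampling: argmax over the logits of the last decoded token.
        const float * logits = llama_get_logits_ith(ctx, batch.n_tokens - 1);
        const int n_vocab = llama_n_vocab(model);
        llama_token best = 0;
        for (llama_token t = 1; t < n_vocab; t++) {
            if (logits[t] > logits[best]) best = t;
        }
        if (best == llama_token_eos(model)) break;

        // Feed the sampled token back in as the next single-token batch.
        batch.token[0]     = best;
        batch.pos[0]       = n_cur++;
        batch.n_seq_id[0]  = 1;
        batch.seq_id[0][0] = 0;
        batch.logits[0]    = true;
        batch.n_tokens     = 1;
    }

    llama_batch_free(batch);
}
```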
Fixed the compiler issues; the code has been migrated to `llama_decode` etc. 😄
Gemma works
Bad news
I lied, kinda. I did migrate the code, but I'm missing a critical step somewhere.
The assistant no longer generates an answer. The prompt tokenizes properly, and the "answer" (for the exact same prompt/convo) gets decoded with `text_to_sentence_piece` too, but I'm missing a step somewhere.
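If it helps anyone debugging along, one guess at the kind of step that goes missing here (an assumption, not a confirmed diagnosis of this code): `llama_batch` carries a per-token `logits` flag, and `llama_decode` only produces output logits for tokens where it is set. If no token in the prompt batch requests logits, sampling has nothing to read and generation silently never starts. Assuming hypothetical `batch`, `prompt_tokens`, and `n_prompt` variables:

```cpp
// Sketch of the easy-to-miss step: after filling a prompt batch, at least
// the last token must have its logits flag set, otherwise there is no
// output row for llama_get_logits_ith() to return when sampling the first
// generated token.
for (int i = 0; i < n_prompt; i++) {
    batch.token[i]     = prompt_tokens[i];
    batch.pos[i]       = i;
    batch.n_seq_id[i]  = 1;
    batch.seq_id[i][0] = 0;
    batch.logits[i]    = false;
}
batch.logits[n_prompt - 1] = true; // without this, no logits come back
batch.n_tokens = n_prompt;
```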
I'll jump on this project again this weekend; let's see if I can solve it.
Related to #15 #17