Add token healing example #7028

mare5x · 2024-05-01T18:20:21Z

Added an example that extends the simple example with several token healing strategies. See the added README for usage examples.
Token healing works by chopping off some tokens from the tokenized prompt and then constraining the decoding to match the bytes of the removed tokens.
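
To make the description above concrete, here is a minimal sketch of the idea in Python. It is not the code from this PR: the toy vocabulary, the function names, and the single-token rollback are all assumptions chosen for illustration. It removes the last prompt token and then lists the tokens the sampler would still be allowed to pick, namely those whose bytes start with the bytes of the removed token.

```python
# Hypothetical toy vocabulary: token id -> token bytes (not a real model vocab).
VOCAB = {0: b"http", 1: b":", 2: b"//", 3: b"://", 4: b" link"}


def roll_back_last_token(tokens):
    """Drop the last prompt token; return (kept tokens, bytes of the removed token)."""
    if not tokens:
        return [], b""
    return tokens[:-1], VOCAB[tokens[-1]]


def allowed_next_tokens(prefix):
    """Ids of tokens whose bytes start with `prefix`, found with a plain loop."""
    allowed = []
    for token_id, token_bytes in VOCAB.items():
        if token_bytes.startswith(prefix):
            allowed.append(token_id)
    return allowed


# Prompt tokens for " link" + "http" + ":"; the trailing ":" gets healed.
kept, prefix = roll_back_last_token([4, 0, 1])
candidates = allowed_next_tokens(prefix)
print(kept, prefix, candidates)
```

With this vocabulary the model may now choose either ":" or "://" for the next token, instead of being stuck with the ":" that the tokenizer produced at the prompt boundary.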

Currently, I am just using for loops for prefix searching, but performance could potentially be improved with a prefix tree (plus the caching mentioned in https://arxiv.org/abs/2403.08688).
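
For reference, the prefix tree alternative could look roughly like the following Python sketch. This is an assumption about one possible design, not something implemented in this PR: `TrieNode`, `build_trie`, `tokens_with_prefix`, and the toy vocabulary are hypothetical names and data. Each node stores the ids of all tokens whose bytes pass through it, so a lookup costs one step per prefix byte instead of a scan over the whole vocabulary.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # byte value (int) -> TrieNode
        self.token_ids = []  # ids of all tokens whose bytes reach this node


def build_trie(vocab):
    """Build a byte-level prefix tree from a {token id: token bytes} mapping."""
    root = TrieNode()
    for token_id, token_bytes in vocab.items():
        root.token_ids.append(token_id)  # every token matches the empty prefix
        node = root
        for byte in token_bytes:
            node = node.children.setdefault(byte, TrieNode())
            node.token_ids.append(token_id)
    return root


def tokens_with_prefix(root, prefix):
    """Return the ids of all tokens whose bytes start with `prefix`."""
    node = root
    for byte in prefix:
        node = node.children.get(byte)
        if node is None:
            return []
    return node.token_ids


# Hypothetical toy vocabulary (not a real model vocab).
toy_vocab = {0: b"http", 1: b":", 2: b"//", 3: b"://", 4: b" link"}
trie = build_trie(toy_vocab)
print(tokens_with_prefix(trie, b":"))
```

The trade-off in this sketch is memory: storing id lists at every node costs space proportional to the total number of bytes in the vocabulary, in exchange for lookups that no longer depend on vocabulary size.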

To finish #5765, we still need to integrate token healing into the server. I think the approach is to extend llama_sampling_sample and modify the initial input tokens?

teleprint-me · 2024-05-02T02:24:34Z

This is awesome! <3

mare5x · 2024-05-09T21:36:57Z

Adding to main in #7187. Will try to add to server later.

mare5x added 2 commits May 1, 2024 20:05

examples : add simple token healing example

c77bb32

examples : more roll back options for token healing

88ef908

mare5x mentioned this pull request May 1, 2024

server : add "token healing" support #5765

Open

4 tasks

mare5x added 4 commits May 3, 2024 13:53

main : first attempt at token healing in main

951b659

main : better token healing support for interactive mode

7d0cc78

main : skip printing token healing prefix twice

d4cbccb

main : small token healing cleanup

7b6fdc2

mofosyne added the "examples" and "Review Complexity : Medium" (generally requires more time to grok but manageable by beginner to medium expertise level) labels May 9, 2024

mare5x closed this May 9, 2024
