server : add "token healing" support #5765
Comments
The usual name for this feature is "token healing". I agree that it would be nice to have it supported here.
@ggerganov I'd like to try working on it as my first issue!
Ok. This can be demonstrated in one of the examples. One way would be to add it to …
@mare5x Sorry, I have not actually started so please don't wait for me. I'll try to take a look at your PR this week though and will be happy to help in any way I can.
Feature Description
Hi! I am experimenting with using llama.cpp as a general-purpose code completion backend, similar to TabNine.

I am encountering a small problem: if the completion prompt ends mid-word, the results are not very accurate. For example, for a prompt such as "Five, Four, Thre" [sic], the model will often ignore the typo and suggest ", Two" (forming "Thre, Two"). I think the following behavior would be useful as an option to the /completion server API: roll back the last token(s) of the prompt and constrain generation so that the completion begins with the removed text.

Thanks!
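To make the requested behavior concrete, here is a minimal sketch of the token-healing idea using a toy greedy tokenizer and vocabulary (all names here — tokenize, heal_prompt, allowed_next_tokens, and the VOCAB contents — are illustrative assumptions, not llama.cpp APIs): drop the prompt's final token, then restrict the sampler to tokens consistent with the dropped text.

```python
# Toy vocabulary; a real model's vocab would come from its tokenizer.
VOCAB = ["Five", "Four", "Three", "Thre", "Two", ",", " "]

def tokenize(text, vocab):
    """Greedy longest-match tokenizer (stand-in for a real BPE tokenizer)."""
    tokens = []
    i = 0
    while i < len(text):
        match = max((t for t in vocab if text.startswith(t, i)),
                    key=len, default=None)
        if match is None:
            raise ValueError(f"cannot tokenize at position {i}")
        tokens.append(match)
        i += len(match)
    return tokens

def heal_prompt(tokens):
    """Token healing, step 1: back up over the prompt's last token.
    Returns (trimmed_tokens, prefix) where `prefix` is the removed text."""
    if not tokens:
        return tokens, ""
    return tokens[:-1], tokens[-1]

def allowed_next_tokens(vocab, prefix):
    """Token healing, step 2: the sampler may only pick tokens that are
    consistent with `prefix` (they extend it, or are a prefix of it)."""
    return [t for t in vocab if t.startswith(prefix) or prefix.startswith(t)]
```

With the prompt "Five, Four, Thre", the last token is "Thre"; after healing, the constrained candidate set contains "Three" but not ", Two", so the model can complete the word instead of treating "Thre" as finished.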