
Support for repo-level autocompletion (like in Qwen-2.5-Coder) #327

Closed
kha84 opened this issue Sep 23, 2024 · 7 comments

@kha84
Contributor

kha84 commented Sep 23, 2024

Is your feature request related to a problem? Please describe.
Hello there. A few days ago a new batch of Qwen models came out. The one that is particularly interesting is Qwen-2.5-Coder.
What is so special about it is that it supports what is called "repository-level code completion". In short, it allows you to put the whole code of your repo into the prompt like this:

<|repo_name|>{repo_name}
<|file_sep|>{file_path1} 
{file_content1}
<|file_sep|>{file_path2} 
{file_content2}

... and the model will continue generating from the end, after {file_content2}, taking all of the above context into account. Read more details about it here
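
For illustration, assembling a prompt in this format could be as simple as the following sketch (the RepoFile shape and the buildRepoPrompt name are made up for this example; the special tokens are the ones from the format above):

interface RepoFile {
  path: string
  content: string
}

function buildRepoPrompt (repoName: string, files: RepoFile[]): string {
  // <|repo_name|> and <|file_sep|> are the special tokens from the
  // format above; the model continues after the last file's content.
  const header = `<|repo_name|>${repoName}`
  const body = files
    .map((f) => `<|file_sep|>${f.path}\n${f.content}`)
    .join('\n')
  return `${header}\n${body}`
}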

Couple that with a large context size (like 128k), and it promises to be a much more knowledgeable AI code autocompletion tool than conventional fill-in-the-middle models.

Describe the solution you'd like
I'd like to have an option in Twinny to feed file names and the contents of those files into the LLM prompt, to accommodate this newly emerging "repo-level autocompletion" feature of LLM models.

Additional context
Having said all this, I realize there will be cases where such models can fail - like if you have a lot of files in your repo, or you are short on the resources needed to accommodate a context of that size. But to my mind, we can only discover all the possible pitfalls after we carefully experiment with it ourselves.

@rjmacarthy
Collaborator

Looks good, thanks. How is it determined which files to pass as context?

@kha84
Contributor Author

kha84 commented Sep 23, 2024

That's the million-dollar question. There are three options I can see - two are dumb but easy to implement:

  1. Let the user specify which files to send - for example, all the files currently opened by the user, or files recently edited in the same project directory.

  2. Send all files from the project folder / current repo (might be awfully costly).

And there's one more advanced option, but it will be tricky to implement. The idea is to use some kind of AST parser (specific to the programming language) to figure out which other source files/modules are imported/included by the file the user currently has open, and send only those files that are relevant - see the sketch below.
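
Something like this, just to illustrate the idea (a crude regex scan over JS/TS import statements rather than a real AST parser; the relevantFiles helper is made up for this example):

import * as fs from 'fs'
import * as path from 'path'

// Collect files imported by the given JS/TS file via relative paths.
// A real implementation would use a language-aware AST parser and
// follow imports transitively; this regex scan only sketches the idea.
function relevantFiles (entryFile: string): string[] {
  const source = fs.readFileSync(entryFile, 'utf8')
  const importRe = /from\s+['"](\.[^'"]+)['"]/g
  const found: string[] = []
  let match: RegExpExecArray | null
  while ((match = importRe.exec(source)) !== null) {
    // Resolve the relative specifier against the importing file.
    const base = path.resolve(path.dirname(entryFile), match[1])
    for (const ext of ['.ts', '.tsx', '.js']) {
      if (fs.existsSync(base + ext)) {
        found.push(base + ext)
        break
      }
    }
  }
  return found
}

A production version would also need to cap the total context size so the selected files still fit in the model's window.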

@kv-gits

kv-gits commented Sep 26, 2024

I mentioned it here:
#320
You may add a file manually, or add its syntax tree (tree-sitter). Another nice feature would be a preview of the prompt.
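
For the tree-sitter route, a minimal sketch using the web-tree-sitter package might look like this (the wasm grammar file name and the fileOutline helper are assumptions for this example, not Twinny code; the API shown is that of web-tree-sitter up to 0.24):

import Parser from 'web-tree-sitter'

// Build a compact outline of a file's top-level declarations, which
// could be sent in the prompt instead of the full file content.
async function fileOutline (source: string): Promise<string> {
  await Parser.init()
  const parser = new Parser()
  // The grammar wasm file name is an assumption for this sketch.
  const lang = await Parser.Language.load('tree-sitter-typescript.wasm')
  parser.setLanguage(lang)
  const tree = parser.parse(source)
  // Keep just the first line of each top-level node as an outline entry.
  return tree.rootNode.children
    .map((node) => node.text.split('\n')[0])
    .join('\n')
}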

@rjmacarthy
Collaborator

I've been experimenting with this. The issue is that the prompt does not contain the suffix, so the model only generates code from the prefix onward and has no knowledge of the code below the cursor. It works really well, and I will make a PR, but I think fill-in-the-middle file completion is better until this issue is addressed in the future.
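
For reference, if the model turns out to also accept its standard FIM tokens after the repo-level context (an assumption I have not verified), a combined prompt might look something like:

<|repo_name|>{repo_name}
<|file_sep|>{file_path1}
{file_content1}
<|file_sep|>{current_file_path}
<|fim_prefix|>{code_before_cursor}<|fim_suffix|>{code_after_cursor}<|fim_middle|>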

@rjmacarthy
Collaborator

Now there is a checkbox in the provider settings in v3.17.18 which enables/disables repository-level completions for FIM providers.

@kha84
Contributor Author

kha84 commented Sep 26, 2024

You're the star! I'll give everything a proper trial and report back with my findings / comments.

@robertpiosik

Hi @rjmacarthy, what does "repository level" mean?
