
Commit

Merge pull request #193 from rjmacarthy/development
Updates to support fully configurable api, enabled support for LiteLLM
rjmacarthy authored Mar 28, 2024
2 parents f560752 + f3b571c commit c1fc91c
Showing 23 changed files with 353 additions and 273 deletions.
38 changes: 18 additions & 20 deletions README.md
@@ -5,14 +5,13 @@ Are you fed up with all of those so-called "free" Copilot alternatives with paywal
Twinny is the most no-nonsense locally hosted (or API-hosted) AI code completion plugin for **Visual Studio Code** and any compatible editors (like VSCodium) designed to work seamlessly with:

- [Ollama](https://github.com/jmorganca/ollama)
- [llama.cpp](https://github.com/ggerganov/llama.cpp)
- [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui)
- [LM Studio](https://github.com/lmstudio-ai)
- [LiteLLM](https://github.com/BerriAI/litellm)
- [Ollama Web UI](https://github.com/ollama-webui/ollama-webui)

Like GitHub Copilot but 100% free!

<div align="center">
<a href="https://marketplace.visualstudio.com/items?itemName=rjmacarthy.twinny">
@@ -41,6 +40,9 @@ Through the side bar, have a conversation with your model and get explanations a

#### Other features

- Works online or offline.
- Highly configurable API endpoints for FIM and chat (see the example after this list)
- Conforms to the OpenAI API standard
- Single or multiline fill-in-middle completions
- Customisable prompt templates to add context to completions
- Easy installation via the VS Code extensions marketplace or by downloading and running a binary directly
@@ -49,14 +51,14 @@ Through the side bar, have a conversation with your model and get explanations a
- Accept code solutions directly to editor
- Create new documents from code blocks
- Copy generated code solution blocks
- Chat history preserved per workspace
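
The split between chat and FIM settings means the two can even point at different providers. As a rough sketch using the setting keys added in this release (all hostnames, ports, and model names below are illustrative, not recommendations), chat could be served by LM Studio while FIM stays on a local Ollama instance:

```jsonc
// settings.json (VS Code) — illustrative values only, adjust to your own setup
{
  // Chat served by LM Studio; port 1234 assumes its usual local-server default
  "twinny.apiProvider": "lmstudio",
  "twinny.apiHostname": "localhost",
  "twinny.chatApiPort": 1234,
  "twinny.chatApiPath": "/v1/chat/completions",

  // FIM served by a local Ollama instance on its default port
  "twinny.apiProviderFim": "ollama",
  "twinny.apiFimHostname": "localhost",
  "twinny.fimApiPort": 11434,
  "twinny.fimApiPath": "/api/generate",
  "twinny.fimModelName": "codellama:7b-code"
}
```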

## 🚀 Getting Started

### With Ollama

1. Install the VS Code extension [link](https://marketplace.visualstudio.com/items?itemName=rjmacarthy.twinny) (or, for VSCodium, [here](https://open-vsx.org/extension/rjmacarthy/twinny))
2. Twinny is configured to use Ollama as the default backend; you can install Ollama here: [ollama](https://ollama.com/)
3. Choose your model from the [library](https://ollama.com/library) (eg: `codellama:7b`)

```sh
@@ -69,35 +71,31 @@ You should see the 🤖 icon indicating that twinny is ready to use.

5. See [Keyboard shortcuts](#keyboard-shortcuts) to start using while coding 🎉
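
No extra configuration should be needed for the steps above, since the defaults shipped with this release already target Ollama. The sketch below simply restates those defaults from `package.json` as they would appear in `settings.json`, in case you want to verify or override them:

```jsonc
// Defaults for the Ollama backend (taken from this release's package.json)
{
  "twinny.apiProvider": "ollama",
  "twinny.apiProviderFim": "ollama",
  "twinny.apiHostname": "0.0.0.0",
  "twinny.apiFimHostname": "0.0.0.0",
  "twinny.chatApiPort": 11434,
  "twinny.chatApiPath": "/v1/chat/completions",
  "twinny.fimApiPort": 11434,
  "twinny.fimApiPath": "/api/generate",
  "twinny.chatModelName": "codellama:7b-instruct",
  "twinny.fimModelName": "codellama:7b-code"
}
```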

### With llama.cpp / LM Studio / Oobabooga / LiteLLM or any other provider

1. Install the VS Code extension [link](https://marketplace.visualstudio.com/items?itemName=rjmacarthy.twinny) (or, for VSCodium, [here](https://open-vsx.org/extension/rjmacarthy/twinny))
2. Get [llama.cpp](https://github.com/ggerganov/llama.cpp) / LM Studio / Oobabooga / LiteLLM
3. Download and run the model locally using the chosen provider

4. Open VS Code (if already open, a restart might be needed) and press `Ctrl + Shift + T` to open the side panel.

5. From the ⚙️ icon at the top, open the settings page and in the `Api Provider` panel change from `ollama` to `llamacpp` (or the respective provider).
6. Update the settings for the chat provider, port, hostname, etc. to the correct values; please adjust these carefully for other providers (a sketch follows these steps).
7. In the left panel you should see the 🤖 icon indicating that twinny is ready to use.
8. See [Keyboard shortcuts](#keyboard-shortcuts) to start using while coding 🎉
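
As an illustration of step 6, a chat-only sketch for llama.cpp might look like the following. The port follows the "usually 8080 for llama.cpp" hint in the settings descriptions; the path assumes llama.cpp's OpenAI-compatible endpoint and is not verified here, and the FIM hostname, port, and path need the same careful adjustment for your server:

```jsonc
// Chat via a local llama.cpp server — values are assumptions, check your server's docs
{
  "twinny.apiProvider": "llamacpp",
  "twinny.apiHostname": "localhost",
  "twinny.chatApiPort": 8080,                   // common llama.cpp server port, per the setting description
  "twinny.chatApiPath": "/v1/chat/completions"  // assumed OpenAI-compatible route
}
```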

### With other providers

Twinny supports the OpenAI API specification, so in theory any provider should work as long as it supports the specification.

The easiest way to use the OpenAI API through twinny is to run LiteLLM as your provider, acting as a local proxy; it works seamlessly if configured correctly.

If you find that isn't the case, please [open an issue](https://github.com/rjmacarthy/twinny/issues/new/choose) with details of how you are having problems.
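
For example, with a LiteLLM proxy running locally, the chat settings might look roughly like this (the hostname, port, and token are placeholders, use whatever your proxy is actually configured with):

```jsonc
// Chat routed through a local LiteLLM proxy — placeholder values only
{
  "twinny.apiProvider": "litellm",
  "twinny.apiHostname": "localhost",
  "twinny.chatApiPort": 4000,                    // assumed proxy port, match your LiteLLM config
  "twinny.chatApiPath": "/v1/chat/completions",
  "twinny.apiBearerToken": "sk-placeholder"      // only if your proxy requires authentication
}
```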

#### Note!

When choosing an API provider, the port and API path names are updated automatically based on the provider you choose. These options can also be set manually.

The options for chat model name and FIM model name are only applicable to the Ollama and Oobabooga providers.

## Model support

Twinny works with any model as long as it can run on your machine and exposes an OpenAI API compliant endpoint.
@@ -114,7 +112,6 @@ All instruct models should work for chat generations, but the templates might ne

- For computers with a good GPU, use: `deepseek-coder:6.7b-base-q5_K_M` (or any other good instruct model).


### **Models for FIM (fill in the middle) completions**

For FIM completions, you need to use LLM models called "base models". Unlike instruct models, base models will only try to complete your prompt. They are not designed to answer questions.
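
For instance, pointing FIM at a DeepSeek base model could look like the sketch below; the model tag matches the one suggested earlier, and setting `fimTemplateFormat` explicitly is optional since `automatic` detection is the default:

```jsonc
// FIM with a DeepSeek base model — explicit template selection is optional
{
  "twinny.fimModelName": "deepseek-coder:6.7b-base-q5_K_M",
  "twinny.fimTemplateFormat": "deepseek"
}
```
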
@@ -145,6 +142,7 @@ In the settings there is an option called `useFileContext` this will keep track
- Sometimes a restart of VS Code is required for new settings to take effect; please open an issue if you are having problems with this.
- Using file context often causes unreliable completions for FIM because small models get confused when provided with more than one file context.
- See open issues on GitHub for any known issues that are not yet fixed.
- The LiteLLM FIM template needs investigation


If you have a problem with Twinny or have any suggestions, please report them on GitHub issues. Please include your VS Code version and OS details in your issue.
4 changes: 2 additions & 2 deletions package-lock.json

Some generated files are not rendered by default.

79 changes: 50 additions & 29 deletions package.json
@@ -2,7 +2,7 @@
"name": "twinny",
"displayName": "twinny - AI Code Completion and Chat",
"description": "Locally hosted AI code completion plugin for vscode",
"version": "3.9.3",
"version": "3.10.0",
"icon": "assets/icon.png",
"keywords": [
"code-inference",
@@ -228,105 +228,126 @@
"twinny.apiHostname": {
"order": 1,
"type": "string",
"default": "localhost",
"description": "Hostname for the completion API.",
"default": "0.0.0.0",
"description": "Hostname for chat completion API.",
"required": true
},
"twinny.apiProvider": {
"twinny.apiFimHostname": {
"order": 2,
"type": "string",
"default": "0.0.0.0",
"description": "Hostname for FIM completion API.",
"required": true
},
"twinny.apiProvider": {
"order": 3,
"type": "string",
"enum": [
"ollama",
"llamacpp",
"lmstudio",
"oobabooga",
"other"
"litellm"
],
"default": "ollama",
"description": "The API provider to use (sets the paths and port automatically to defaults)."
"description": "API Chat provider."
},
"twinny.apiProviderFim": {
"order": 4,
"type": "string",
"enum": [
"ollama",
"llamacpp",
"lmstudio",
"oobabooga",
"litellm"
],
"default": "ollama",
"description": "API FIM provider."
},
"twinny.chatApiPath": {
"order": 3,
"order": 5,
"type": "string",
"default": "/v1/chat/completions",
"description": "Endpoint path for chat completions.",
"required": true
},
"twinny.chatApiPort": {
"order": 4,
"order": 6,
"type": "number",
"default": 11434,
"description": "The API port usually `11434` for Ollama and `8080` for llama.cpp (May differ depending on API configuration)",
"required": true
},
"twinny.fimApiPort": {
"order": 5,
"order": 7,
"type": "number",
"default": 11434,
"description": "The API port usually `11434` for Ollama and `8080` for llama.cpp (May differ depending on API configuration)",
"required": true
},
"twinny.fimApiPath": {
"order": 6,
"order": 8,
"type": "string",
"default": "/api/generate",
"description": "Endpoint path for FIM completions.",
"required": true
},
"twinny.chatModelName": {
"order": 7,
"order": 9,
"type": "string",
"default": "codellama:7b-instruct",
"description": "Model identifier for chat completions. Applicable only for Ollama and Oobabooga API."
},
"twinny.fimModelName": {
"order": 8,
"order": 10,
"type": "string",
"default": "codellama:7b-code",
"description": "Model identifier for FIM completions. Applicable only for Ollama and Oobabooga API."
},
"twinny.fimTemplateFormat": {
"order": 9,
"order": 11,
"type": "string",
"enum": [
"automatic",
"stable-code",
"codellama",
"deepseek",
"starcoder"
"starcoder",
"custom-template"
],
"default": "automatic",
"description": "The prompt format to be used for FIM completions. Overrides automatic detection."
},
"twinny.disableAutoSuggest": {
"order": 10,
"order": 12,
"type": "boolean",
"default": false,
"description": "Disables automatic suggestions, manual trigger (default shortcut Alt+\\)."
},
"twinny.contextLength": {
"order": 11,
"order": 13,
"type": "number",
"default": 100,
"description": "Defines the number of lines before and after the current line to include in FIM prompts.",
"required": true
},
"twinny.debounceWait": {
"order": 12,
"order": 14,
"type": "number",
"default": 300,
"description": "Delay in milliseconds before triggering the next completion.",
"required": true
},
"twinny.temperature": {
"order": 13,
"order": 15,
"type": "number",
"default": 0.2,
"description": "Sets the model's creativity level (temperature) for generating completions.",
"required": true
},
"twinny.useMultiLineCompletions": {
"order": 14,
"order": 16,
"type": "boolean",
"default": true,
"description": "Use multiline completions"
@@ -335,63 +356,63 @@
"dependencies": {
"twinny.useMultiLineCompletions": true
},
"order": 15,
"order": 17,
"type": "number",
"default": 20,
"description": "Maximum number of lines to use for multi line completions. Applicable only when useMultiLineCompletions is enabled."
},
"twinny.useFileContext": {
"order": 16,
"order": 18,
"type": "boolean",
"default": false,
"description": "Enables scanning of neighbouring documents to enhance completion prompts. (Experimental)"
},
"twinny.enableCompletionCache": {
"order": 17,
"order": 19,
"type": "boolean",
"default": false,
"description": "Caches FIM completions for identical prompts to enhance performance."
},
"twinny.numPredictChat": {
"order": 18,
"order": 20,
"type": "number",
"default": 512,
"description": "Maximum token limit for chat completions.",
"required": true
},
"twinny.numPredictFim": {
"order": 19,
"order": 21,
"type": "number",
"default": 512,
"description": "Maximum token limit for FIM completions. Set to -1 for no limit. Twinny should stop at logical line breaks.",
"required": true
},
"twinny.enableSubsequentCompletions": {
"order": 20,
"order": 22,
"type": "boolean",
"default": true,
"description": "Enable this setting to allow twinny to keep making subsequent completion requests to the API after the last completion request was accepted."
},
"twinny.keepAlive": {
"order": 21,
"order": 23,
"type": "string",
"default": "5m",
"description": "Keep models in memory by making requests with keep_alive=-1. Applicable only for Ollama API."
},
"twinny.useTls": {
"order": 22,
"order": 24,
"type": "boolean",
"default": false,
"description": "Enables TLS encryption for API connections."
},
"twinny.apiBearerToken": {
"order": 23,
"order": 25,
"type": "string",
"default": "",
"description": "Bearer token for secure API authentication."
},
"twinny.enableLogging": {
"order": 24,
"order": 26,
"type": "boolean",
"default": true,
"description": "Enable twinny debug mode"