llproxy/spec/llproxy.md at main · the-crypt-keeper/llproxy #867

ShellLM opened this issue Aug 3, 2024 · 1 comment

ShellLM commented Aug 3, 2024

llproxy/spec/llproxy.md at main · the-crypt-keeper/llproxy

Snippet

"Let's create a new NodeJS project called LLProxy.

The goal of this project is a self-configuring proxy that discovers large language model servers and routes requests to them in a flexible way.

Configuration

A json configuration file will be provided in the following format:

```json
{
    "port": <proxy listen port>,
    "endpoints": [
        {
            "hostname": <endpoint hostname or IP>,
            "port_start": <first port a server may be listening on>,
            "port_end": <last port a server may be listening on (inclusive)>,
            "tags": [<optional list of tags to apply to models from this endpoint>]
        },
        ...
    ]
}
```
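
For illustration, a filled-in configuration might look like this (the hostname, port range, and tag are hypothetical):

```json
{
    "port": 3333,
    "endpoints": [
        {
            "hostname": "10.0.0.5",
            "port_start": 8080,
            "port_end": 8090,
            "tags": ["gpu"]
        }
    ]
}
```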

Data Structures

  • Configuration
  • Active list of models

Endpoints

All endpoints are prefixed with /v1 for compliance with the OpenAI API specification.

The /v1/models endpoint will be special and handled by the proxy itself.

The /v1/completions and /v1/chat/completions endpoints will use the 'model' field of the request body to route the request (more on that later).

Model Discovery

Every 30 seconds, the following process should run in a background task and rebuild the active list of models:

  • Iterate through all endpoints
  • Try to make a call to all ports in the range [port_start, port_end] on the /v1/models endpoint
  • The /v1/models endpoint, if successful, returns a payload like this:
{"object":"list","data":[{"id":"./Meta-Llama-3.1-70B-Instruct-IQ3_XS.gguf","object":"model","created":1722606296,"owned_by":"llamacpp","meta":{"vocab_type":2,"n_vocab":128256,"n_ctx_train":131072,"n_embd":8192,"n_params":70553711616,"size":29299888128}}]}
  • The 'id' key of each entry in the data list is the most important piece of information returned; you will need this value to route requests
  • We also need a cleaned-up version of this field for the user to make requests; let's call that 'name', produced by stripping any file paths and extensions from 'id'
  • Finally, apply the tags from this endpoint to create the final model names in the form name:tag
  • Save model details, host, port, id and final model names to the active model list

Emit appropriate debug information as the process runs and a summary when it is complete; a sketch of one full discovery pass follows.
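
A minimal sketch of one discovery pass, assuming Node 18+ (global fetch); the `config` argument and the shape of the saved entries follow the illustrations above and are not prescribed by the spec:

```javascript
// One discovery pass: probe every port on every endpoint and rebuild the list.
async function discoverModels(config) {
  const activeModels = [];
  for (const endpoint of config.endpoints) {
    for (let port = endpoint.port_start; port <= endpoint.port_end; port++) {
      try {
        const res = await fetch(`http://${endpoint.hostname}:${port}/v1/models`);
        if (!res.ok) continue;
        const payload = await res.json();
        for (const model of payload.data) {
          // Strip any file path and extension from the raw id to get 'name'.
          const name = model.id.split('/').pop().replace(/\.[^.]+$/, '');
          const tags = endpoint.tags ?? [];
          const names = tags.length ? tags.map(t => `${name}:${t}`) : [name];
          activeModels.push({ host: endpoint.hostname, port, id: model.id, names, details: model });
          console.debug(`found ${model.id} at ${endpoint.hostname}:${port} as ${names.join(', ')}`);
        }
      } catch {
        // Port not listening, or not an OpenAI-compatible server: skip it.
      }
    }
  }
  console.log(`discovery complete: ${activeModels.length} model(s) active`);
  return activeModels;
}

// Rebuild the active list every 30 seconds in a background task:
// setInterval(() => discoverModels(config).then(list => { activeModels = list; }), 30000);
```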

Model Endpoint

The /v1/models endpoint should return the current active list of models. The proxy's 'id' is the final name:tag model name from above; all other model information is passed through from the endpoint itself.
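
Continuing the payload example above and assuming a "gpu" tag on its endpoint, the proxy's own /v1/models response would look something like:

```json
{
  "object": "list",
  "data": [
    {
      "id": "Meta-Llama-3.1-70B-Instruct-IQ3_XS:gpu",
      "object": "model",
      "created": 1722606296,
      "owned_by": "llamacpp",
      "meta": {
        "vocab_type": 2,
        "n_vocab": 128256,
        "n_ctx_train": 131072,
        "n_embd": 8192,
        "n_params": 70553711616,
        "size": 29299888128
      }
    }
  ]
}
```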

Completions endpoints

The /v1/completions and /v1/chat/completions proxies should look at 'model', which will be the name:tag, and use it to look up the real endpoint and model id in the active list of models.

It should overwrite the 'model' with the true model id and then proxy the request to the backend endpoint.

Note that HTTP response streaming is supported on both of these endpoints, so the proxy should be non-blocking and asynchronous.
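
A minimal routing sketch using Express (our choice for illustration; the spec does not mandate a framework), with `activeModels` being the lookup list built by the discovery sketch above:

```javascript
const express = require('express');
const { Readable } = require('stream');

const app = express();
app.use(express.json());

app.post(['/v1/completions', '/v1/chat/completions'], async (req, res) => {
  // 'model' in the request body is the user-facing name:tag.
  const entry = activeModels.find(m => m.names.includes(req.body.model));
  if (!entry) return res.status(404).json({ error: `unknown model: ${req.body.model}` });

  // Overwrite 'model' with the true backend id, then proxy the request.
  const body = JSON.stringify({ ...req.body, model: entry.id });
  const upstream = await fetch(`http://${entry.host}:${entry.port}${req.path}`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body,
  });

  // Pipe the upstream body through unchanged so streaming responses
  // (e.g. SSE when "stream": true) are forwarded without buffering.
  res.status(upstream.status);
  res.set('content-type', upstream.headers.get('content-type') ?? 'application/json');
  Readable.fromWeb(upstream.body).pipe(res);
});
```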


FragmentEditor

We are going to be creating a new React application called FragmentEditor.

The purpose of the application is to help a user write, edit and revise stories, documents, emails or any other text quickly with keyboard shortcuts and AI assistance.

Use create-react-app to generate the scaffolding, then provide all the required changes to the source code to implement the requirements below.
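
For reference, the standard scaffolding commands (create-react-app requires a lowercase directory name, hence fragment-editor):

```bash
npx create-react-app fragment-editor
cd fragment-editor
npm start
```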

Concepts

Fragment: A small piece of text, usually but not necessarily a sentence
Fragment list: A list of fragments that when joined together forms the current document.
Selected fragment: The index of the currently selected fragment in the list. It should be possible to select a new, not-yet-created fragment at the very end of the list.
Document: The fragment list joined into a single string without introducing any additional whitespace.
Mode: The current editor mode: "explore" (default), "edit" or "insert"
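
These concepts map directly onto component state; a minimal sketch using standard React hooks (the variable names are ours, not the spec's):

```javascript
import { useState } from 'react';

function useEditorState() {
  const [fragments, setFragments] = useState([]);  // the fragment list
  const [selected, setSelected] = useState(0);     // index; fragments.length selects <new>
  const [mode, setMode] = useState('explore');     // "explore" | "edit" | "insert"

  // The document is the fragment list joined with no extra whitespace.
  const document = fragments.join('');

  return { fragments, setFragments, selected, setSelected, mode, setMode, document };
}
```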

The Fragment List

The main display area, the fragment list, should take up 95% of the screen in both width and height; the rest should be padding.

Inside this area, render the current list of fragments. Each fragment should be rendered visually distinctly with the selected fragment highlighted using background color. Remember fragments may contain newlines, but should otherwise render tightly with adjacent fragments.

It should be possible to select a new, not-yet-created fragment at the very end of the list, which should render as <new> with a visually distinct background.
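
A rendering sketch under those requirements (the colors and the use of 95vw/95vh are illustrative choices):

```jsx
function FragmentList({ fragments, selected }) {
  return (
    <div style={{ width: '95vw', height: '95vh', margin: 'auto', whiteSpace: 'pre-wrap' }}>
      {fragments.map((text, i) => (
        // pre-wrap preserves newlines inside a fragment, while inline spans
        // keep adjacent fragments rendering tightly against each other
        <span key={i} style={{ background: i === selected ? '#cce5ff' : 'transparent' }}>
          {text}
        </span>
      ))}
      <span style={{ background: selected === fragments.length ? '#ffe0a0' : '#eee' }}>
        {'<new>'}
      </span>
    </div>
  );
}
```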

Keyboard Controls

In the default "explore" mode, the following keyboard controls should be available (a handler sketch follows the list):

  • Left/right: change which fragment is selected
  • i: insert a new fragment to the right of the current fragment and transition to "insert" mode
  • space: transition to "edit" mode
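
A sketch of the explore-mode handler, wired to keydown; the setter names follow the state sketch above:

```javascript
function onExploreKeyDown(e, { fragments, setFragments, selected, setSelected, setMode }) {
  switch (e.key) {
    case 'ArrowLeft':
      setSelected(Math.max(0, selected - 1));
      break;
    case 'ArrowRight':
      setSelected(Math.min(fragments.length, selected + 1)); // fragments.length is <new>
      break;
    case 'i': {
      // Insert an empty fragment to the right of the current one (or at the
      // end if <new> is selected), then edit it.
      const at = Math.min(selected + 1, fragments.length);
      setFragments([...fragments.slice(0, at), '', ...fragments.slice(at)]);
      setSelected(at);
      setMode('insert');
      break;
    }
    case ' ':
      setMode('edit');
      break;
  }
}
```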

In "edit" and "insert" modes, the Selected Fragment should be rendered as a textarea. If entered on the <new> fragment, the text should start empty and if the user accepts it with enter the fragment should be created and the selected fragment again moved to another <new>.

The following keyboard controls should be available in "edit" and "insert" mode (a matching sketch follows the list):

  • ctrl+enter: insert a newline
  • enter: save changes to the fragment, return to explore mode
  • escape:
    • edit mode: discard changes, return to "explore" mode
    • insert mode: delete the inserted fragment, return to "explore" mode
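
And a matching sketch for the edit/insert modes; `draft`/`setDraft` are assumed to hold the textarea's in-progress value:

```javascript
function onEditKeyDown(e, { draft, setDraft, mode, fragments, setFragments, selected, setSelected, setMode }) {
  if (e.key === 'Enter' && e.ctrlKey) {
    // ctrl+enter: insert a newline (appended at the end for simplicity;
    // a real version would splice it in at the cursor position)
    e.preventDefault();
    setDraft(draft + '\n');
  } else if (e.key === 'Enter') {
    // enter: save changes to the fragment and return to explore mode
    e.preventDefault();
    const next = [...fragments];
    next[selected] = draft; // assigning at index fragments.length creates the fragment
    setFragments(next);
    if (selected === fragments.length) setSelected(next.length); // re-select <new>
    setMode('explore');
  } else if (e.key === 'Escape') {
    if (mode === 'insert') {
      // insert mode: delete the fragment that was inserted on entry
      setFragments(fragments.filter((_, i) => i !== selected));
    }
    // edit mode: simply drop the draft; the stored fragment was never modified
    setMode('explore');
  }
}
```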

Starting state

We should start in "explore" mode.
The fragment list should be empty.
The <new> virtual fragment at the end of the list should be selected.

ShellLM commented Aug 3, 2024

Related content

#865 similarity score: 0.9
#632 similarity score: 0.89
#774 similarity score: 0.88
#396 similarity score: 0.88
#839 similarity score: 0.87
#160 similarity score: 0.87
