
LM Studio partial support #384

Open · DiAifU opened this issue Feb 26, 2024 · 7 comments
Labels: bug (Something isn't working)

Comments

DiAifU commented Feb 26, 2024

What happened?

Hi,

LM Studio, with a locally running model, works when using the OpenAI provider with the Ollama or Llama.cpp preset and changing the port to LM Studio's default (1234). Prompts are generated, but they always end with the error "Unknown API response. Code: 200, Body:".

Could you please tell me how to fix this, or add support for responses generated by LM Studio?

Thanks for the great plugin!

Nicolas

Relevant log output or stack trace

Unknown API response. Code: 200, Body:

Steps to reproduce

  • Run LM Studio with a model locally
  • Start a server in LM Studio
  • Configure CodeGPT to target that server (OpenAI provider with the Ollama or Llama.cpp preset, port changed to 1234); a minimal connectivity check is sketched below
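
A quick way to confirm the server is reachable outside the IDE (a sketch, not part of CodeGPT; the port and endpoint path are LM Studio's defaults, and the model name is just a placeholder for whatever is loaded):

```kotlin
import okhttp3.MediaType.Companion.toMediaType
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody

// Sends a single non-streaming chat completion to LM Studio's default
// OpenAI-compatible endpoint. "stable-code-instruct-3b" is a placeholder;
// use whatever model LM Studio actually has loaded.
fun main() {
    val json = """
        {
          "model": "stable-code-instruct-3b",
          "stream": false,
          "messages": [{"role": "user", "content": "Say hello"}]
        }
    """.trimIndent()

    val request = Request.Builder()
        .url("http://localhost:1234/v1/chat/completions")
        .post(json.toRequestBody("application/json".toMediaType()))
        .build()

    OkHttpClient().newCall(request).execute().use { response ->
        println("HTTP ${response.code}")
        println(response.body?.string())
    }
}
```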

CodeGPT version

2.4.0

Operating System

Windows

DiAifU added the bug label Feb 26, 2024
carlrobertoh (Owner) commented Feb 27, 2024

Thank you for reporting!

The error appears to be related to the OkHttp library and how it processes event streams. For some reason, LM Studio doesn't seem to append an empty newline at the end of the final response, which causes OkHttp to fail. I am not yet sure what the fix is.
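
For context, a rough sketch of the consuming side (assuming okhttp-sse; not necessarily CodeGPT's exact code path). SSE events are delimited by blank lines, so a final data: chunk that isn't followed by one never gets dispatched as an event, and the stream ends in onFailure:

```kotlin
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.Response
import okhttp3.sse.EventSource
import okhttp3.sse.EventSourceListener
import okhttp3.sse.EventSources

// Sketch only: streaming consumption with okhttp-sse. Each server-sent event
// must be terminated by a blank line; if the server closes the connection
// mid-frame (no trailing "\n\n"), OkHttp reports a failure instead of
// dispatching the final event.
fun stream(client: OkHttpClient, request: Request) {
    val listener = object : EventSourceListener() {
        override fun onEvent(eventSource: EventSource, id: String?, type: String?, data: String) {
            print(data) // each completed "data:" frame lands here
        }

        override fun onFailure(eventSource: EventSource, t: Throwable?, response: Response?) {
            // A stream cut off before the final blank line ends up here,
            // e.g. java.lang.IllegalArgumentException: byteCount < 0: -1
            System.err.println("SSE failure: $t")
        }
    }
    EventSources.createFactory(client).newEventSource(request, listener)
}
```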

raivisdejus commented
Maybe @lmstudio-ai can sort this out on their end...

carlrobertoh (Owner) commented
Related issue: langchain4j/langchain4j#670.

The error java.lang.IllegalArgumentException: byteCount < 0: -1 can be reproduced by removing the empty newlines from the mocked response: LocalCallbackServer.java#L111
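
To make the framing concrete, an illustrative pair of payloads (not the exact mocked response from LocalCallbackServer):

```kotlin
// Spec-compliant SSE: every event, including the last, ends with a blank line.
val compliant = "data: {\"choices\": [{}]}\n\ndata: [DONE]\n\n"

// What LM Studio appears to send: no blank line after the final chunk,
// which is what triggers the byteCount < 0 failure in OkHttp.
val unterminated = "data: {\"choices\": [{}]}\n\ndata: [DONE]"
```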

lucacri commented Apr 25, 2024

Same problem here. Any suggestions on how to fix it temporarily?

xardbaiz commented Nov 13, 2024

Hello everyone, and especially @lucacri & @DiAifU.
If this is still relevant: because LM Studio has an OpenAI-like server API, CodeGPT partially supports it via the "Custom OpenAI" provider. I just checked it.

Settings

1. Start the LM Studio server and note the name of the loaded model

[screenshot]

2. Select the Custom OpenAI provider in CodeGPT

[screenshot]

3. Point your Custom OpenAI provider to localhost

[screenshot]

Don't forget to enter the exact model name that is loaded in LM Studio.
Important: for code completions, your model must support the FIM pattern (see the sketch below).

[screenshot]
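
On the FIM note: an illustrative prompt shape, assuming StarCoder-style FIM tokens (which stable-code models use); other models use different token names, so check the model card. This is a sketch, not CodeGPT's actual completion template:

```kotlin
// Hypothetical sketch of a fill-in-the-middle (FIM) completion prompt.
// Token names below follow the StarCoder/stable-code convention; CodeLlama,
// for example, uses <PRE>/<SUF>/<MID> instead.
val prefix = "fun add(a: Int, b: Int): Int {\n    return "
val suffix = "\n}"

// The model is asked to generate the code that belongs between prefix and suffix.
val fimPrompt = "<fim_prefix>$prefix<fim_suffix>$suffix<fim_middle>"
```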


Result demo

CodeGPT screen

[screenshot]

LM Studio logs:
2024-11-13 13:55:52  [INFO]
Received POST request to /v1/chat/completions with body: {
  "stream": true,
  "model": "stable-code-instruct-3b",
  "messages": [
    {
      "role": "system",
      "content": "You are an AI programming assistant.\nFollow the user's requirements carefully & to the letter.\nYour responses should be informative and logical.\nYou should always adhere to technical information.\nIf the user asks for code or technical questions, you must provide code suggestions and adhere to technical information.\nIf the question is related to a developer, you must respond with content related to a developer.\nFirst think step-by-step - describe your plan for what to build in pseudocode, written out in great detail.\nThen output the code in a single code block.\nMinimize any other prose.\nKeep your answers short and impersonal.\nUse Markdown formatting in your answers.\nMake sure to include the programming language name at the start of the Markdown code blocks.\nAvoid wrapping the whole response in triple backticks.\nThe user works in an IDE built by JetBrains which has a concept for editors with open files, integrated unit test support, and output pane that shows the output of running the code as well as an integrated terminal.\nYou can only give one reply for each conversation turn."
    },
    {
      "role": "user",
      "content": "Please write example java MapStruct mapper"
    }
  ],
  "temperature": 0.1,
  "max_tokens": 1024
}
2024-11-13 13:55:52  [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2024-11-13 13:55:52  [INFO] [LM STUDIO SERVER] Streaming response...
2024-11-13 13:55:52  [INFO] [LM STUDIO SERVER] First token generated. Continuing to stream response..
2024-11-13 13:56:02  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation... (if the model is busy processing the prompt, it will finish first))
2024-11-13 13:56:02  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..
2024-11-13 13:56:02  [INFO] Finished streaming response

carlrobertoh (Owner) commented
Awesome, thank you!

We could add a preset template for it, similar to how the other providers are defined:
https://github.com/carlrobertoh/CodeGPT/blob/master/src/main/kotlin/ee/carlrobert/codegpt/settings/service/custom/template/CustomServiceTemplate.kt
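
A rough sketch of the values an LM Studio entry would need (the field names here are hypothetical; the actual CustomServiceTemplate constructor in the repo differs):

```kotlin
// Hypothetical shape only; see CustomServiceTemplate.kt for the real definition.
enum class CustomServiceTemplateSketch(
    val displayName: String,
    val baseUrl: String,
    val chatCompletionsPath: String
) {
    LM_STUDIO(
        displayName = "LM Studio",
        baseUrl = "http://localhost:1234",
        chatCompletionsPath = "/v1/chat/completions"
    )
}
```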

xardbaiz commented Nov 14, 2024

@carlrobertoh OK, let me try to find time for this. I'll add it; expect a PR from me :)
