Non-streaming response chunks should be joined before parsing?
I am using Ollama 0.1.45. When requesting a non-streaming response (i.e. not passing a block to the `chat` method) and the response is large (more than ~4000 characters), Ollama sends the data in multiple chunks.
In the current implementation each chunk is `JSON.parse`'d separately. For smaller responses that fit in a single chunk this is obviously not a problem, but for multiple chunks I need to join all the chunks first and then JSON-parse the result.
Changing the code of Langchain::LLM::Ollama along the lines sketched below works for me.
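A minimal sketch of what I mean, assuming `client` is the gem's Faraday connection with JSON request middleware (method and variable names are illustrative, not the gem's exact internals): buffer every chunk and parse once at the end, instead of `JSON.parse`-ing each network chunk on its own.

```ruby
require "faraday"
require "json"

def chat(messages:, **params)
  buffer = +""

  client.post("api/chat") do |req|
    req.body = params.merge(messages: messages, stream: false)
    # Faraday hands the body over in arbitrary chunks; a large JSON
    # document can be split mid-token, so only accumulate here.
    req.options.on_data = proc { |chunk, _received_bytes| buffer << chunk }
  end

  # Parse the complete document once all chunks have arrived.
  JSON.parse(buffer)
end
```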
The Ollama docs say nothing about this behavior. It might be a bug in Ollama, or a feature.
This happens at least with llama3-8b-q8 and phi3-14b-q5 models.
Should langchainrb code around this, e.g. by checking whether the response chunks form a complete JSON document (see the sketch below)?
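If so, one way to do it (a sketch, not tested against the gem) is to keep appending chunks and attempt a parse on each one; an incomplete document raises `JSON::ParserError`, which here just means "wait for more data":

```ruby
require "json"

buffer = +""
parsed = nil

on_data = proc do |chunk, _received_bytes|
  buffer << chunk
  begin
    # Succeeds only once the buffered chunks form a complete document.
    parsed = JSON.parse(buffer)
  rescue JSON::ParserError
    # Not a complete JSON document yet; more chunks are coming.
  end
end
```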
Inherit from Langchain::LLM::OpenAI?
Since Ollama is compatible with OpenAI's API, wouldn't it be easier to let Langchain::LLM::Ollama inherit from Langchain::LLM::OpenAI, overriding default values where needed?
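A rough sketch of what that could look like; the initializer parameters (`api_key:`, `llm_options:`, `uri_base:`) are my assumptions about what `Langchain::LLM::OpenAI` accepts, not verified against the gem:

```ruby
module Langchain
  module LLM
    class Ollama < OpenAI
      DEFAULTS = {
        chat_model: "llama3",
        embedding_model: "llama3",
        temperature: 0.0
      }.freeze

      def initialize(url: "http://localhost:11434/v1", default_options: {})
        # Ollama's OpenAI-compatible endpoint ignores the API key,
        # but a value still has to be present.
        super(
          api_key: "ollama",
          llm_options: {uri_base: url},
          default_options: DEFAULTS.merge(default_options)
        )
      end
    end
  end
end
```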
I can confirm the bug with chunks: once too big an input is sent, I get ``parse': 451: unexpected token at '' (JSON::ParserError)``, simply because the chunk ends at a point where the line is not valid JSON.
We are encountering the same error and would like to suggest a possible improvement. Would it be feasible to handle the response in a single step to avoid this issue?
Additionally, we believe this problem warrants being classified as a bug for better visibility and prioritization.
At the time of posting this I was not sure the chunked responses were a bug; it could have been just my version of Ollama. Since more people are encountering this, it probably is a bug in this gem. I have changed the title of this issue.
My way of 'fixing' this bug was to drop use of this gem altogether ...
Temperature and seed parameters should be part of 'options'
According to the Ollama docs, `temperature` and `seed` should be passed inside the `options` object.
In the current implementation these are passed at the same level as parameters like `model`.
Changing the code of Langchain::LLM::Ollama along the lines sketched below works, but it is probably not the best place to implement this.
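For reference, this is the request shape the Ollama `/api/chat` docs describe (model name and values here are just examples): `temperature` and `seed` live under `options`, not next to `model`.

```ruby
parameters = {
  model: "llama3",
  messages: [{role: "user", content: "Why is the sky blue?"}],
  stream: false,
  options: {
    temperature: 0.7,
    seed: 42
  }
}
```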