
Ollama-rs integration #156

Merged 6 commits into Abraxas-365:main on May 25, 2024

Conversation

@erhant (Contributor) commented May 21, 2024

This PR integrates Ollama-rs and feature-gates it with the ollama feature.

  • Re-implements OllamaEmbedder using the new client; the embedding_ollama.rs example works.

  • Implements the language_models::llm::LLM trait for Ollama; the llm_ollama.rs example works.

  • Also allows OpenAI-compatible Ollama usage by implementing async_openai::config::Config for OllamaConfig; this part is not feature-gated since it does not require the Ollama-rs package.

  • Closes Pull missing model for Ollama LLM generation #148: we had initially talked about adding an option to auto-pull a model if it does not exist, but with Ollama-rs integrated we can simply leave that to the user. The pull is a one-liner with the Ollama-rs client, and our tools expect the client to be created in the outer scope and passed in via Arc, so anyone who needs to pull a model can do so before passing it in, with their own logic (e.g. retries, timeouts, cancellations); see the sketch after this list.

  • Opens the way for Add support for OllamaFunctions chat model from the official langchain library #149, since we will have access to function calls once pepperoni21/ollama-rs#51 (Ollama Function Calling) is merged!
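
For reference, a rough sketch of pulling a model with the Ollama-rs client before handing it to the wrapper; the model name, the tokio runtime, and the commented-out wrapper constructor are illustrative, not part of this PR:

use std::sync::Arc;
use ollama_rs::Ollama as OllamaClient; // aliased to avoid confusion with the langchain-rust wrapper

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create the Ollama-rs client once in the outer scope and share it via Arc.
    let client = Arc::new(OllamaClient::default());

    // Pull the model up front, with whatever retry/timeout/cancellation logic fits the app.
    // pull_model takes the model name and an allow_insecure flag.
    client
        .pull_model("llama3:latest".to_string(), false)
        .await
        .map_err(|e| format!("model pull failed: {e}"))?;

    // The shared client can then be passed into the langchain-rust Ollama wrapper
    // (the exact constructor is omitted here; see the crate's examples).
    // let ollama = Ollama::new(client.clone(), /* model, options */);

    Ok(())
}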

Note

@prabirshrestha I'm currently opening this as a draft because I couldn't implement the stream function for the LLM trait, and would like your opinion/help on that part! Ollama-rs has a streaming call itself, and I just have to map the items of that stream to the one the LLM trait wants, but for some reason I haven't managed it yet.

I will keep working on it in the meantime, but would love any input you have 🙏🏻

EDIT: Streaming is done.

@prabirshrestha (Collaborator) commented

Have you tried using async_stream with yield? There are some usage patterns in the codebase already, such as this and this. If you have issues with threading, you can use flume, which can be found here.

@erhant (Contributor, Author) commented May 22, 2024

Have you tried using async_stream with yield? There are some usage patterns in the codebase already, such as this and this. If you have issues with threading, you can use flume, which can be found here.

Looking into it, thanks!

@erhant (Contributor, Author) commented May 22, 2024

I mapped the stream items using map_ok and the errors with map_err, returned the resulting stream with Ok(Box::pin(stream)), and it works quite nicely!

Just one more question: what should the value of StreamData contain? I'm looking at the codebase but it's still not clear to me; should I perhaps return everything Ollama sends back other than the message content as a Value?

EDIT: Below is the stream function in its final form right now:

async fn stream(
    &self,
    messages: &[Message],
) -> Result<Pin<Box<dyn Stream<Item = Result<StreamData, LLMError>> + Send>>, LLMError> {
    // Build the Ollama chat request from the incoming messages.
    let request = self.generate_request(messages);
    let result = self.client.send_chat_messages_stream(request).await?;

    // Map each Ollama stream item into the StreamData / LLMError shape the LLM trait expects.
    let stream = result.map(|data| match data {
        Ok(data) => match data.message {
            // Serialize the whole message as the raw JSON value and use its text as the content.
            Some(message) => Ok(StreamData::new(
                serde_json::to_value(message.clone()).unwrap_or_default(),
                message.content,
            )),
            None => Err(LLMError::ContentNotFound(
                "No message in response".to_string(),
            )),
        },
        // Surface any transport/stream failure as an Ollama error wrapped in LLMError.
        Err(_) => Err(OllamaError::from("Stream error".to_string()).into()),
    });

    Ok(Box::pin(stream))
}
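
For context, a caller could drive this roughly as follows (a sketch, not code from this PR; the exact import paths and the public content field on StreamData are assumptions):

use futures::StreamExt;
use langchain_rust::language_models::llm::LLM;
use langchain_rust::schemas::Message;

// Sketch: stream a chat completion and print content chunks as they arrive.
async fn print_stream(
    llm: &impl LLM,
    messages: &[Message],
) -> Result<(), Box<dyn std::error::Error>> {
    let mut stream = llm.stream(messages).await?;
    while let Some(chunk) = stream.next().await {
        let data = chunk?;           // StreamData, or bail out on LLMError
        print!("{}", data.content);  // assumes `content` is a public field on StreamData
    }
    Ok(())
}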

@erhant erhant marked this pull request as ready for review May 22, 2024 12:44
@prabirshrestha (Collaborator) commented

Value is the raw JSON response from the server, in case the user wants to access other properties from the JSON. Most folks will only care about the content.

I'm wondering if the Item should be Result<Option<StreamData>, LLMError> instead of Result<StreamData, LLMError>. I had mentioned my concerns in #140. Otherwise, we need a different error per LLM provider. The other option is to use the same error for all LLMs, but since we are ending the call anyway I think Option is a lot easier to work with.
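
(For illustration only, not code from this PR: with the Option-based item, a consumer could skip empty chunks without a provider-specific error; handle is a hypothetical callback.)

// Hypothetical item type: Result<Option<StreamData>, LLMError>
while let Some(item) = stream.next().await {
    match item? {
        Some(data) => handle(data), // a normal chunk with content
        // `None` would cover the "nothing to emit" case (e.g. a chunk with no message)
        // without needing a bespoke error such as ContentNotFound.
        None => continue,
    }
}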

@erhant (Contributor, Author) commented May 22, 2024

EDITED:

Yeah, I think returning Option would be better DX than error handling. I suggest we keep this PR as is w.r.t. the returned item and tackle that issue in a separate PR?

I will update the code to return the entire response as Value 👍🏻

@prabirshrestha (Collaborator) commented

I'm OK with changing to Option in a different PR.

@erhant (Contributor, Author) commented May 23, 2024

Updated; I also added a TODO note for that Option.

@erhant (Contributor, Author) commented May 24, 2024

I have no idea why the build is failing btw :o

@Abraxas-365 (Owner) commented

I have no idea why the build is failing btw :o

Don't know why this happens; I have to delete the Actions cache once in a while and then it works again.

@@ -3,3 +3,6 @@ pub use openai::*;

pub mod claude;
pub use claude::*;

pub mod ollama;
@prabirshrestha (Collaborator) commented on the diff:

Should these just be mod ollama; instead of pub mod ollama;?

Since this pattern already exists, it can probably be fixed separately.

@prabirshrestha (Collaborator) left a review:

I have added nit comments, though most of these seem to have already existed, so I'm OK with fixing them separately. @Abraxas-365 feel free to review and merge if it looks good; otherwise I will merge it tomorrow.

impl Default for OllamaConfig {
    fn default() -> Self {
        Self {
            api_key: Secret::new("ollama".to_string()),
@prabirshrestha (Collaborator) commented on the diff:

Should the secret here also be a constant, similar to OLLAMA_API_BASE?
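
(A minimal sketch of that suggestion; the constant name OLLAMA_API_KEY is hypothetical, chosen to mirror the existing OLLAMA_API_BASE, and the struct body is truncated just like the diff above.)

// Hypothetical constant next to OLLAMA_API_BASE; the name is illustrative.
pub const OLLAMA_API_KEY: &str = "ollama";

impl Default for OllamaConfig {
    fn default() -> Self {
        Self {
            api_key: Secret::new(OLLAMA_API_KEY.to_string()),
            // ...remaining fields unchanged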

.with_api_key("ollama"),
)
.with_model("llama2");
let ollama = Ollama::default().with_model("llama3");

let response = ollama.invoke("hola").await.unwrap();
@prabirshrestha (Collaborator) commented on the diff:

Probably worth saying "Hello" instead of "hola".

@prabirshrestha merged commit a0a8078 into Abraxas-365:main on May 25, 2024
2 checks passed
@prabirshrestha (Collaborator) commented

Merged. Thanks!

@erhant (Contributor, Author) commented May 25, 2024

Thanks for the merge! I just saw the messages; I was on the road today.

I plan on handling the function-call PR when Ollama-rs is updated as well, and we may have to change a few lines of the token-count calculation implemented in this PR for Ollama; nothing big though 🙏🏻

Successfully merging this pull request may close these issues.

  • Pull missing model for Ollama LLM generation (#148)