Ollama-rs integration #156
Conversation
Have you tried using `async_stream`?
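(Illustrative only, not code from the PR: a minimal sketch of what adapting one stream's items with `async_stream`'s `stream!` macro can look like, using placeholder item types instead of the real `StreamData`/`LLMError` ones.)

```rust
use async_stream::stream;
use futures::{Stream, StreamExt};

// Placeholder item types; the real code maps ollama-rs chunks to
// Result<StreamData, LLMError>.
fn adapt<S>(mut inner: S) -> impl Stream<Item = Result<String, String>>
where
    S: Stream<Item = Result<String, std::io::Error>> + Unpin,
{
    stream! {
        // Poll the inner stream and re-yield each item converted to the outer type.
        while let Some(item) = inner.next().await {
            yield item.map_err(|e| e.to_string());
        }
    }
}
```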
Looking into it, thanks!
I mapped the stream items using `map`. Just one more question: what should be the contents of `value`?

EDIT: Below is the working `stream` implementation:

async fn stream(
    &self,
    messages: &[Message],
) -> Result<Pin<Box<dyn Stream<Item = Result<StreamData, LLMError>> + Send>>, LLMError> {
    let request = self.generate_request(messages);
    let result = self.client.send_chat_messages_stream(request).await?;
    // Map each Ollama-rs stream item into the `StreamData` the `LLM` trait expects.
    let stream = result.map(|data| match data {
        Ok(data) => match data.message {
            Some(message) => Ok(StreamData::new(
                // Raw JSON of the message, so callers can read fields beyond the text content.
                serde_json::to_value(message.clone()).unwrap_or_default(),
                message.content,
            )),
            None => Err(LLMError::ContentNotFound(
                "No message in response".to_string(),
            )),
        },
        // The original Ollama error is discarded here; a generic stream error is surfaced.
        Err(_) => Err(OllamaError::from("Stream error".to_string()).into()),
    });
    Ok(Box::pin(stream))
}
`value` is the raw JSON response from the server, in case the user wants to access other properties from the JSON. Most folks will only care about the content. I'm wondering if we should make it an `Option` instead.
EDITED: Yeah, I think returning the raw response makes sense. I will update the code to return the entire response as the `value`.
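(Editorial sketch, not code from the PR: what "return the entire response as the `value`" could look like, using stand-in structs for the ollama-rs response types; field names here are illustrative.)

```rust
use serde::Serialize;
use serde_json::Value;

// Stand-ins for the streamed chat response; the real types come from ollama-rs.
#[derive(Serialize)]
struct ChatMessage { role: String, content: String }

#[derive(Serialize)]
struct ChatResponseChunk { model: String, message: Option<ChatMessage>, done: bool }

// Serialize the whole chunk into the raw-JSON slot (so callers can reach
// fields like `done`), while still extracting the text content separately.
fn to_stream_parts(chunk: &ChatResponseChunk) -> (Value, String) {
    let raw = serde_json::to_value(chunk).unwrap_or_default();
    let content = chunk
        .message
        .as_ref()
        .map(|m| m.content.clone())
        .unwrap_or_default();
    (raw, content)
}
```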
I'm ok changing to `Option` in a different PR.
Updated, also added a TODO note for that.
I have no idea why the build is failing btw :o
Don't know why this happens; I have to delete the Actions cache once in a while and it works again.
@@ -3,3 +3,6 @@ pub use openai::*;

pub mod claude;
pub use claude::*;

pub mod ollama;
Should these just be `mod ollama;` instead of `pub mod ollama;`? Since this already exists, it can probably be fixed separately.
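(A sketch of what that suggestion would look like in the modules file; the `pub use ollama::*;` re-export is assumed from the surrounding diff.)

```rust
// Keep the modules private and expose only their re-exported items.
mod claude;
pub use claude::*;

mod ollama;
pub use ollama::*;
```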
I have added nit comments, though most of these seem to have already existed, so I'm ok fixing them separately. @Abraxas-365 feel free to review and merge if it looks good, else I will merge it tomorrow.
impl Default for OllamaConfig {
    fn default() -> Self {
        Self {
            api_key: Secret::new("ollama".to_string()),
Should the secret here also be a constant, similar to `OLLAMA_API_BASE`?
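(Sketch of the suggested change; `OLLAMA_API_KEY` is a hypothetical name, and the struct fields and the `OLLAMA_API_BASE` value shown here are assumptions for illustration.)

```rust
use secrecy::Secret;

// Hypothetical constants; only the "ollama" key value appears in the diff above.
const OLLAMA_API_BASE: &str = "http://localhost:11434/v1";
const OLLAMA_API_KEY: &str = "ollama";

struct OllamaConfig {
    api_base: String,
    api_key: Secret<String>,
}

impl Default for OllamaConfig {
    fn default() -> Self {
        Self {
            api_base: OLLAMA_API_BASE.to_string(),
            api_key: Secret::new(OLLAMA_API_KEY.to_string()),
        }
    }
}
```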
        .with_api_key("ollama"),
)
.with_model("llama2");
let ollama = Ollama::default().with_model("llama3");

let response = ollama.invoke("hola").await.unwrap();
Probably worth saying `Hello` instead of `hola`.
Merged. Thanks!
Thanks for the merge! Just saw the messages, I was on the road today. I plan on handling the function call PR when Ollama-rs is updated as well, and we may have to change a few lines in the token count calculation implemented in this PR for Ollama, nothing big though 🙏🏻
This PR integrates Ollama-rs and feature-gates it behind the `ollama` feature.

Re-implements `OllamaEmbedder` using the new client; the `embedding_ollama.rs` example is working.

Implements the `language_models::llm::LLM` trait for `Ollama`; the `llm_ollama.rs` example is working.

Also allows OpenAI-compatible Ollama usage by implementing `async_openai::config::Config` for `OllamaConfig`. This part is not feature-gated, as it does not require the Ollama-rs package.

Closes "Pull missing model for Ollama LLM generation" #148: initially we had talked about adding an option to auto-pull a model if it does not exist, but with Ollama-rs integrated we can simply leave that to the user. The pull code is a simple one-liner using the Ollama-rs client, and our tools expect the client to be created in the outer scope and passed in via `Arc`. So if one needs to pull a model, they can do it before passing the client in, with their own logic (e.g. retries, timeouts, cancellations).
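(Not part of the PR itself: a rough sketch of that pattern, pulling a model up front with the Ollama-rs client before sharing it via `Arc`; the exact `pull_model` signature should be checked against the ollama-rs version in use.)

```rust
use std::sync::Arc;

use ollama_rs::Ollama;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create the client in the outer scope, as the wrappers expect to receive it via Arc.
    let client = Ollama::default();

    // Pull the model with whatever retry/timeout/cancellation logic you prefer.
    // Signature assumed here: pull_model(model_name, allow_insecure).
    client.pull_model("llama3:latest".to_string(), false).await?;

    // Hand the shared client to the LLM / embedder wrappers afterwards.
    let _shared = Arc::new(client);
    Ok(())
}
```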
Note
@prabirshrestha Im currently opening this as a draft, because I couldn't implement the
stream
function for theLLM
trait, and would like your opinion / help on that part! Ollama-rs has a streaming call itself, and I just have to map the items of this stream to the one that LLM trait wants, but for some reason I couldn't do it yet.I will still be working on that, but would love your input if any 🙏🏻
EDIT: Streaming is done.