feat(ollama): support for running the local Ollama binary #2908
base: main
Conversation
Pretty clever. I think what this is acknowledging is that LLM runtimes serve models instead of Docker images, and they have unique constraints in resource requirements due to GPUs, etc.
Rather than have projects skip testcontainers completely for ollama, this allows a bridge that makes an LLM runtime feel the same way as a docker container.
An alternative could be to host this somewhere else, but that could get annoying fast as you'd have to keep track of dep drift etc.
@stevenh after some internal discussion about whether this (pragmatic) approach is the way to go, we agreed to continue with it. So I'd appreciate it if you could prioritise it over others, as Ollama is an important piece of the GenAI story for our community. PLMK if you need more context.
This is a pretty big one, so here's an initial pass.
* main:
  feat(gcloud)!: add support to seed data when using RunBigQueryContainer (testcontainers#2523)
  security(deps): bump golang.org/x/crypto from 0.28.0 to 0.31.0 (testcontainers#2916)
  chore(ci): add Github labels based on PR title (testcontainers#2914)
  chore(gha): Use official setup-docker-action (testcontainers#2913)
  chore(ci): enforce conventional commits syntax in PR titles (testcontainers#2911)
  feat(nats): WithConfigFile - pass a configuration file to nats server (testcontainers#2905)
  chore: enable implicit default logger only in testing with -v (testcontainers#2877)
  fix: container binds syntax (testcontainers#2899)
  refactor(cockroachdb): to use request driven options (testcontainers#2883)
Thanks for all the improvements; here's a second pass.
c.localCtx.mx.Lock()
defer c.localCtx.mx.Unlock()

return c.localCtx.serveCmd != nil
bug: the process may not still be running
return c.localCtx.serveCmd != nil
return c.localCtx.serveCmd != nil && c.localCtx.serveCmd.ProcessState.Exited()
Using !c.localCtx.serveCmd.ProcessState.Exited() instead.
BTW ProcessState is only set when Wait or Run is called:
// ProcessState contains information about an exited process.
// If the process was started successfully, Wait or Run will
// populate its ProcessState when the command completes.
ProcessState *os.ProcessState
But we use Start to run the ollama serve process.
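For illustration only (not code from the PR), a minimal sketch showing that after Start alone ProcessState stays nil until Wait runs, assuming a Unix-like environment where the sleep command exists:

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	cmd := exec.Command("sleep", "1")
	if err := cmd.Start(); err != nil {
		panic(err)
	}

	// After Start, ProcessState is nil: it is only populated once Wait
	// (or Run) has observed the process exit, so calling
	// cmd.ProcessState.Exited() here would panic with a nil dereference.
	fmt.Println(cmd.ProcessState == nil) // true

	if err := cmd.Wait(); err != nil {
		panic(err)
	}
	fmt.Println(cmd.ProcessState.Exited()) // true
}
```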
After a successful call to Start you must call Wait to free resources; see Cmd.Start.
	return fmt.Errorf("signal ollama: %w", err)
}

c.localCtx.serveCmd = nil
suggestion: I wouldn't do this, as it makes the logic in other places non-deterministic.
I'm having trouble getting the state of the running process without calling Run or Wait, as the ollama process is a server that must run in the background. That's why I'm delegating it to our cmd representation.
As mentioned, you need to call Wait at some point to free resources. Since you want this to run in the background, I would kick off a goroutine that writes the error to a channel, then wait on that channel in Stop, switching to Kill on timeout if needed.
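A minimal sketch of that pattern, with illustrative names only (this is not the PR's actual code, and it assumes a Unix-like environment for the signals):

```go
package ollamalocal

import (
	"context"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// localProcess wraps a background "ollama serve" process and reaps it on Stop.
type localProcess struct {
	cmd     *exec.Cmd
	waitErr chan error // receives the result of cmd.Wait exactly once
}

// startLocal starts "ollama serve" in the background and immediately hands
// the mandatory Wait call off to a goroutine so resources get released.
func startLocal(ctx context.Context) (*localProcess, error) {
	cmd := exec.CommandContext(ctx, "ollama", "serve")
	if err := cmd.Start(); err != nil {
		return nil, fmt.Errorf("start ollama: %w", err)
	}

	p := &localProcess{cmd: cmd, waitErr: make(chan error, 1)}
	go func() {
		p.waitErr <- cmd.Wait()
	}()
	return p, nil
}

// Stop asks the process to terminate and waits for the background Wait to
// report back, escalating to a hard kill if it does not exit within the timeout.
func (p *localProcess) Stop(timeout time.Duration) error {
	if err := p.cmd.Process.Signal(syscall.SIGTERM); err != nil {
		return fmt.Errorf("signal ollama: %w", err)
	}

	select {
	case err := <-p.waitErr:
		return err
	case <-time.After(timeout):
		if err := p.cmd.Process.Kill(); err != nil {
			return fmt.Errorf("kill ollama: %w", err)
		}
		// Still consume the Wait result so the goroutine does not leak.
		return <-p.waitErr
	}
}
```

Note that after a SIGTERM, Wait typically returns an *exec.ExitError, so a real implementation would probably want to treat that specific error as a clean shutdown.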
Any time I call Wait, the execution gets stuck in the wait stack trace. Could you help me out with this part?
if c.localCtx.serveCmd == nil {
	return nil
}
question: should this check for Exited first? Otherwise you could get an error for a stopped process, which might not be the intent.
@mdelapenya I'll create a suggested change for process management. I can't push to your fork, so I'll push to one under testcontainers.
What does this PR do?
This PR adds a functional option to the Ollama module to enable using the local Ollama binary instead of the Docker container.
For that, it implements the Container APIs, mapping them to the local process execution of Ollama and using Ollama's defaults, such as the well-known port (11434) and the host (127.0.0.1).
Some container APIs, like the Copy APIs, do not make sense for a local process, so they are no-ops.
If starting the local Ollama binary fails for any reason, it falls back to the containerised version of the module.
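A hypothetical usage sketch of what that could look like from a test. The option name WithUseLocal and the image tag are assumptions for illustration, not necessarily the final API of this PR:

```go
package main

import (
	"context"
	"log"

	"github.com/testcontainers/testcontainers-go/modules/ollama"
)

func main() {
	ctx := context.Background()

	// WithUseLocal is an assumed option name: it would switch the module
	// from the Docker image to the locally installed ollama binary,
	// falling back to the container if the binary cannot be started.
	ctr, err := ollama.Run(ctx, "ollama/ollama:0.3.13", ollama.WithUseLocal())
	if err != nil {
		log.Fatalf("run ollama: %v", err)
	}
	defer func() {
		if err := ctr.Terminate(ctx); err != nil {
			log.Printf("terminate: %v", err)
		}
	}()

	endpoint, err := ctr.ConnectionString(ctx)
	if err != nil {
		log.Fatalf("connection string: %v", err)
	}
	log.Println("ollama available at", endpoint)
}
```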
It includes tests for the local process, installing Ollama on the GitHub Actions worker. This part has been abstracted so that any module needing to install something in CI beforehand can contribute a shell script located in a directory named after the module under .github/scripts/modules.
Why is it important?
It will allow users to switch to the local Ollama installation on their machines, getting a more performant version of the tool, as it has native access to the GPU.
Related issues
Follow-ups
Does it make sense to have a GenericProcess available in the core, doing what this local Ollama container adds in this PR? 🤔
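Purely as an illustration of the idea (none of this exists in the core today), such a GenericProcess could expose the same lifecycle surface that modules already rely on for containers:

```go
package testcontainers

import "context"

// GenericProcess is a hypothetical sketch: a locally executed binary exposed
// through a container-like lifecycle, so module code can treat a process and
// a container backend uniformly.
type GenericProcess interface {
	// Start launches the local binary in the background.
	Start(ctx context.Context) error
	// Stop asks the process to shut down gracefully.
	Stop(ctx context.Context) error
	// Terminate forcefully ends the process and releases its resources.
	Terminate(ctx context.Context) error
	// Endpoint returns the host:port the process is listening on,
	// e.g. 127.0.0.1:11434 for a local Ollama.
	Endpoint(ctx context.Context) (string, error)
}
```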