Minimal example of running Llama with Kubernetes

Uses the ollama/ollama image from Docker Hub.

Prerequisites

  • Running Kubernetes cluster
  • kubectl installed and configured
  • helm installed
  • ingress-nginx installed in the cluster
  • A local container registry running at http://localhost:5000
  • Python 3.8 or later

You can find an example using kind here: Kubernetes and You

Execute the demo

  • Apply the helm chart
helm upgrade --install ollama ./ollama -f ./ollama/values.yaml -n ollama
  • Execute the example Python script to start a Flask server
pip install -r example-app/requirements.txt
python ./example-app/app.py
  • Visit http://localhost:8999 in your browser to receive a motivational llama message.
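For orientation, here is a minimal sketch of how a Python client might request a message from the deployed Ollama server. It is not the code in example-app/app.py. It relies on Ollama's /api/generate endpoint with streaming disabled; the URL and the model name "llama3" are assumptions that depend on how the chart's ingress and values.yaml are configured.

```python
import json
import urllib.request

# Assumption: the ingress exposes Ollama at this URL; adjust to your cluster.
OLLAMA_URL = "http://localhost/api/generate"


def build_generate_request(prompt, model="llama3"):
    """Build the JSON body for a non-streaming /api/generate call.

    The default model name is a placeholder, not taken from the chart.
    """
    return {"model": model, "prompt": prompt, "stream": False}


def extract_text(response_body):
    """Return the generated text from a non-streaming /api/generate reply."""
    return json.loads(response_body)["response"]


def generate(prompt, url=OLLAMA_URL, model="llama3", timeout=60):
    """POST the prompt to Ollama and return the generated text."""
    data = json.dumps(build_generate_request(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_text(response.read())
```

Calling generate("Give me a short motivational message from a llama.") should return the model's reply as a string, provided the model has already been pulled into the Ollama instance.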

RAG Pipeline

An example real-time RAG pipeline is provided. It uses Redis as both the vector database and the document cache.
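The retrieval step of such a pipeline can be sketched as follows. This is an illustration only, not the repository's implementation: a plain dictionary stands in for Redis, the embedding vectors are assumed to be computed elsewhere, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, store, k=2):
    """Return the texts of the k documents most similar to query_vec.

    store maps a document id to (embedding, text); it is an in-memory
    stand-in for the vector index that Redis would hold.
    """
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vec, item[1][0]),
        reverse=True,
    )
    return [text for _, (_, text) in ranked[:k]]


def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a real deployment the dictionary lookup would be replaced by a vector search against Redis, and the resulting prompt would be sent to the Ollama server.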

Prerequisites

Assuming the provided kind cluster is running locally, you can install Redis from its Helm chart with the following command:

helm -n redis install redis oci://registry-1.docker.io/bitnamicharts/redis --create-namespace