# worker-aphrodite-engine

A RunPod worker for the Aphrodite Engine, enabling efficient text generation and processing.

## Setting up on RunPod

This worker runs the Aphrodite Engine on RunPod Serverless. To set up the worker on RunPod, follow these steps:
- Go to the RunPod dashboard and create a new serverless template.
- Input your container image, or use the pre-built image `joachimchauvet/worker-aphrodite-engine:latest` from Docker Hub.
- Select your desired GPU and other hardware specifications.
- Set the environment variables as needed (see below).
- Deploy a serverless endpoint using the template.
## Environment Variables

The following environment variables can be set to configure the Aphrodite Engine:

- `DOWNLOAD_DIR`: Directory to download the model to (recommended: `/runpod-volume`, see below)
- `MODEL` or `MODEL_NAME` (required): Name or path of the Hugging Face model to use
- `REVISION`: Specific model version to use (branch, tag, or commit ID)
- `DATATYPE`: Data type to use (`auto`, `float16`, `bfloat16`, `float32`)
- `KVCACHE`: KV cache data type
- `MAX_MODEL_LEN` or `CONTEXT_LENGTH`: Model context size
- `NUM_GPUS`: Number of GPUs for tensor parallelism
- `GPU_MEMORY_UTILIZATION`: GPU memory utilization factor
- `QUANTIZATION`: Quantization method
- `ENFORCE_EAGER`: If set, disables CUDA graphs
- `KOBOLD_API`: If set, launches the Kobold API
- `CMD_ADDITIONAL_ARGUMENTS`: Any additional command-line arguments
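As a sketch of how these variables map onto engine arguments, the helper below is hypothetical (the worker's actual parsing, argument names, and defaults may differ):

```python
import os


def build_engine_args(env=os.environ):
    """Map worker environment variables to Aphrodite engine arguments.

    Hypothetical helper for illustration only; the real worker's
    parsing may differ.
    """
    args = {}

    # MODEL or MODEL_NAME is the one required variable.
    model = env.get("MODEL") or env.get("MODEL_NAME")
    if not model:
        raise ValueError("MODEL or MODEL_NAME is required")
    args["model"] = model

    # Optional string-valued settings pass through unchanged.
    if "DOWNLOAD_DIR" in env:
        args["download_dir"] = env["DOWNLOAD_DIR"]
    if "REVISION" in env:
        args["revision"] = env["REVISION"]
    if "DATATYPE" in env:
        args["dtype"] = env["DATATYPE"]
    if "KVCACHE" in env:
        args["kv_cache_dtype"] = env["KVCACHE"]
    if "QUANTIZATION" in env:
        args["quantization"] = env["QUANTIZATION"]

    # Numeric settings are converted from their string form.
    context = env.get("MAX_MODEL_LEN") or env.get("CONTEXT_LENGTH")
    if context:
        args["max_model_len"] = int(context)
    if "NUM_GPUS" in env:
        args["tensor_parallel_size"] = int(env["NUM_GPUS"])
    if "GPU_MEMORY_UTILIZATION" in env:
        args["gpu_memory_utilization"] = float(env["GPU_MEMORY_UTILIZATION"])

    # ENFORCE_EAGER is a flag: its presence disables CUDA graphs.
    if "ENFORCE_EAGER" in env:
        args["enforce_eager"] = True

    return args
```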
## Model Storage

It's recommended to use a network volume for model storage. To do this:
- Create a network volume in your RunPod account.
- When deploying the pod, attach the network volume.
- Set the `DOWNLOAD_DIR` environment variable to `/runpod-volume`.
This ensures that your models are persistently stored and can be reused across deployments.
## Usage

Send requests to the endpoint with a `prompt` for plain text completion:

```json
{
  "input": {
    "prompt": "Once upon a time",
    "sampling_params": {
      "max_tokens": 400,
      "temperature": 0.7
    }
  }
}
```
Or with a `messages` list for chat-style completion:

```json
{
  "input": {
    "messages": [{ "role": "user", "content": "Hello" }],
    "sampling_params": {
      "max_tokens": 100,
      "temperature": 0.7
    }
  }
}
```
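A deployed endpoint can be called over RunPod's standard serverless HTTP API. A minimal sketch using only the Python standard library (the endpoint ID and API key are placeholders you supply from your RunPod account):

```python
import json
import urllib.request


def build_payload(prompt, max_tokens=400, temperature=0.7):
    """Wrap a prompt in the worker's input format shown above."""
    return {
        "input": {
            "prompt": prompt,
            "sampling_params": {
                "max_tokens": max_tokens,
                "temperature": temperature,
            },
        }
    }


def run_sync(endpoint_id, api_key, payload):
    """POST a payload to RunPod's synchronous /runsync route and
    return the parsed JSON response."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (requires a real endpoint ID and API key):
# result = run_sync("YOUR_ENDPOINT_ID", "YOUR_API_KEY",
#                   build_payload("Once upon a time"))
```

A chat-style request works the same way, replacing `prompt` with a `messages` list in the payload as in the second example above.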