🚀 A simple RunPod worker for aphrodite-engine with support for network volumes

joachimchauvet/worker-aphrodite-engine


Aphrodite Engine | RunPod Worker

🚀 | A RunPod worker for the Aphrodite Engine, enabling efficient text generation and processing.

📖 | Getting Started

This worker runs the Aphrodite Engine on RunPod Serverless. To set it up, follow these steps:

  1. Go to the RunPod dashboard and create a new serverless template.
  2. Enter your own container image, or use the pre-built joachimchauvet/worker-aphrodite-engine:latest image from Docker Hub.
  3. Select your desired GPU and other hardware specifications.
  4. Set the environment variables as needed (see below).
  5. Deploy a serverless endpoint using the template.

🔧 | Environment Variables

The following environment variables can be set to configure the Aphrodite Engine:

  • DOWNLOAD_DIR: Directory the model is downloaded to (recommended: "/runpod-volume", see "Using a Network Volume" below)
  • MODEL or MODEL_NAME (required): Name or path of the Hugging Face model to use
  • REVISION: Specific model version to use (branch, tag, or commit ID)
  • DATATYPE: Data type to use (auto, float16, bfloat16, float32)
  • KVCACHE: KV cache data type
  • MAX_MODEL_LEN or CONTEXT_LENGTH: Model context size
  • NUM_GPUS: Number of GPUs for tensor parallelism
  • GPU_MEMORY_UTILIZATION: Fraction of GPU memory the engine may use (a value between 0 and 1)
  • QUANTIZATION: Quantization method
  • ENFORCE_EAGER: If set, disables CUDA graphs
  • KOBOLD_API: If set, launches the Kobold API
  • CMD_ADDITIONAL_ARGUMENTS: Any additional command-line arguments
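
The variables above can be sanity-checked before deploying. What follows is a minimal sketch, not the worker's actual startup code: `resolve_config` is a hypothetical helper, and preferring `MODEL` over `MODEL_NAME` (and `MAX_MODEL_LEN` over `CONTEXT_LENGTH`) when both are set is an assumption, since no precedence is documented here. The model name `org/model` is a placeholder.

```python
import os

# Values listed for DATATYPE in the variable list above.
ALLOWED_DATATYPES = {"auto", "float16", "bfloat16", "float32"}


def resolve_config(env=None):
    """Collect the documented settings from an environment mapping.

    Hypothetical helper for illustration; the worker's own parsing may differ.
    """
    env = os.environ if env is None else env

    # MODEL and MODEL_NAME are documented as alternatives; preferring MODEL
    # when both are set is an assumption.
    model = env.get("MODEL") or env.get("MODEL_NAME")
    if not model:
        raise ValueError("MODEL or MODEL_NAME must be set")

    config = {"model": model}

    datatype = env.get("DATATYPE")
    if datatype is not None:
        if datatype not in ALLOWED_DATATYPES:
            raise ValueError(f"unsupported DATATYPE: {datatype!r}")
        config["datatype"] = datatype

    # Same precedence assumption for the two context-size aliases.
    context = env.get("MAX_MODEL_LEN") or env.get("CONTEXT_LENGTH")
    if context:
        config["max_model_len"] = int(context)

    if env.get("NUM_GPUS"):
        config["num_gpus"] = int(env["NUM_GPUS"])

    if env.get("DOWNLOAD_DIR"):
        config["download_dir"] = env["DOWNLOAD_DIR"]

    return config


# Example with a placeholder model name and a network volume path.
example = resolve_config(
    {"MODEL_NAME": "org/model", "DATATYPE": "bfloat16", "DOWNLOAD_DIR": "/runpod-volume"}
)
print(example)
```

Running a check like this locally against the values you plan to enter in the template can catch a missing model name or a mistyped data type before an endpoint is deployed.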

💾 | Using a Network Volume

It's recommended to use a network volume for model storage. To do this:

  1. Create a network volume in your RunPod account.
  2. When deploying the endpoint, attach the network volume.
  3. Set the DOWNLOAD_DIR environment variable to "/runpod-volume".

This ensures that your models are persistently stored and can be reused across deployments.

Example Inputs

Regular Completions

{
  "input": {
    "prompt": "Once upon a time",
    "sampling_params": {
      "max_tokens": 400,
      "temperature": 0.7
    }
  }
}
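
As a sketch of how the completion input above might be submitted from Python, the snippet below builds the HTTP request using only the standard library. It assumes RunPod's general serverless `runsync` route and bearer-token authentication, neither of which is described in this README; `build_runsync_request` and `send_request` are hypothetical helpers, and `ENDPOINT_ID` and `API_KEY` are placeholders for your own values.

```python
import json
import urllib.request


def build_runsync_request(endpoint_id, api_key, prompt, max_tokens=400, temperature=0.7):
    """Build (but do not send) a synchronous request carrying a completion input."""
    payload = {
        "input": {
            "prompt": prompt,
            "sampling_params": {"max_tokens": max_tokens, "temperature": temperature},
        }
    }
    return urllib.request.Request(
        # Assumed URL shape for a RunPod serverless endpoint.
        url=f"https://api.runpod.ai/v2/{endpoint_id}/runsync",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_request(request, timeout=120):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Placeholders: substitute a real endpoint ID and API key before sending.
request = build_runsync_request("ENDPOINT_ID", "API_KEY", "Once upon a time")
print(request.full_url)
```

Calling `send_request(request)` with real credentials would perform the call. The shape of the response is not covered here, so inspect it before relying on any particular field.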

Chat Completions

{
  "input": {
    "messages": [{ "role": "user", "content": "Hello" }],
    "sampling_params": {
      "max_tokens": 100,
      "temperature": 0.7
    }
  }
}
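
The chat input above can also be assembled programmatically. This is a minimal sketch: `build_chat_payload` is a hypothetical helper, and it only checks that each message carries the `role` and `content` fields seen in the example. The example shows only the `user` role, so whether other roles are accepted is not assumed here.

```python
def build_chat_payload(messages, max_tokens=100, temperature=0.7):
    """Wrap a list of chat messages in the input envelope shown above."""
    for index, message in enumerate(messages):
        # Each message in the example has a string role and string content.
        role_ok = isinstance(message.get("role"), str)
        content_ok = isinstance(message.get("content"), str)
        if not (role_ok and content_ok):
            raise ValueError(f"message {index} needs string 'role' and 'content' fields")
    return {
        "input": {
            "messages": list(messages),
            "sampling_params": {"max_tokens": max_tokens, "temperature": temperature},
        }
    }


# Reproduces the chat example from this section.
payload = build_chat_payload([{"role": "user", "content": "Hello"}])
print(payload)
```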

🔗 | Links

📚 Aphrodite Engine | 🚀 RunPod (affiliate link)
