An elegant Python application that generates detailed image captions using Meta's Llama 3.2 90B Vision model through the OpenRouter API.
- 🖼️ Simple drag-and-drop image upload interface
- 🔄 Choose between short and detailed captions
- 🤖 Powered by Meta-Llama 3.2 90B Vision Instruct model
- 🌐 Easy-to-use Gradio web interface
- ⚡ Fast and accurate image analysis
- Python 3.7 or higher
- OpenRouter API key (get it from OpenRouter)
- Internet connection
git clone https://github.com/PierrunoYT/llama-image-captioner.git
cd llama-image-captioner
Choose your operating system:
# Create a virtual environment
python -m venv venv
# Activate the environment
venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Create a virtual environment
python3 -m venv venv
# Activate the environment
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
You can set up your environment variables in two ways:
- Copy the example environment file:
cp .env.example .env
- Edit the
.env
file and replace the values with your actual configuration:OPENROUTER_API_KEY=your_api_key_here YOUR_SITE_URL=http://localhost:7860 YOUR_APP_NAME=Llama Image Captioner
setx OPENROUTER_API_KEY "your_api_key_here"
setx YOUR_SITE_URL "http://localhost:7860"
setx YOUR_APP_NAME "Llama Image Captioner"
[System.Environment]::SetEnvironmentVariable('OPENROUTER_API_KEY', 'your_api_key_here', 'User')
[System.Environment]::SetEnvironmentVariable('YOUR_SITE_URL', 'http://localhost:7860', 'User')
[System.Environment]::SetEnvironmentVariable('YOUR_APP_NAME', 'Llama Image Captioner', 'User')
Add these lines to your ~/.bashrc
, ~/.zshrc
, or equivalent:
export OPENROUTER_API_KEY="your_api_key_here"
export YOUR_SITE_URL="http://localhost:7860"
export YOUR_APP_NAME="Llama Image Captioner"
Then reload your shell configuration:
source ~/.bashrc # or source ~/.zshrc
-
Make sure your virtual environment is activated:
- Windows:
venv\Scripts\activate
- macOS/Linux:
source venv/bin/activate
- Windows:
-
Start the application:
python ImageCaption.py
- Open your web browser and navigate to:
- Local URL: http://localhost:7860
- The interface will open automatically in your default browser
-
Upload an image using one of these methods:
- Drag and drop an image into the upload area
- Click the upload area to select an image from your files
- Paste an image from your clipboard
-
Select caption length:
- Short: Brief, concise description
- Long: Detailed analysis of the image
-
Click "Submit" and wait for the caption to be generated
-
API Key Error
- Ensure you've set the environment variables correctly
- Restart your terminal/command prompt after setting environment variables
- Check if your API key is valid
-
Import Errors
- Verify that your virtual environment is activated
- Reinstall dependencies:
pip install -r requirements.txt
-
Connection Issues
- Check your internet connection
- Verify that OpenRouter API is accessible from your network
- Fork the repository
- Create your feature branch:
git checkout -b feature/AmazingFeature
- Commit your changes:
git commit -m 'Add some AmazingFeature'
- Push to the branch:
git push origin feature/AmazingFeature
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Copyright (c) 2025 PierrunoYT