Llama Image Captioner

An elegant Python application that generates detailed image captions using Meta's Llama 3.2 90B Vision model through the OpenRouter API.

Features

🖼️ Simple drag-and-drop image upload interface
🔄 Choose between short and detailed captions
🤖 Powered by Meta-Llama 3.2 90B Vision Instruct model
🌐 Easy-to-use Gradio web interface
⚡ Fast and accurate image analysis

Prerequisites

Python 3.7 or higher
OpenRouter API key (get it from OpenRouter)
Internet connection

Installation

1. Clone the Repository

git clone https://github.com/PierrunoYT/llama-image-captioner.git
cd llama-image-captioner

2. Set Up Python Environment

Choose your operating system:

Windows

# Create a virtual environment
python -m venv venv

# Activate the environment
venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

macOS/Linux

# Create a virtual environment
python3 -m venv venv

# Activate the environment
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

3. Configure API Keys

You can set up your environment variables in two ways:

Option 1: Using .env file (Recommended)

Copy the example environment file:
```
cp .env.example .env
```

Edit the .env file and replace the values with your actual configuration:

OPENROUTER_API_KEY=your_api_key_here
YOUR_SITE_URL=http://localhost:7860
YOUR_APP_NAME=Llama Image Captioner

Option 2: Using System Environment Variables

Windows (Command Prompt)

setx OPENROUTER_API_KEY "your_api_key_here"
setx YOUR_SITE_URL "http://localhost:7860"
setx YOUR_APP_NAME "Llama Image Captioner"

Windows (PowerShell)

[System.Environment]::SetEnvironmentVariable('OPENROUTER_API_KEY', 'your_api_key_here', 'User')
[System.Environment]::SetEnvironmentVariable('YOUR_SITE_URL', 'http://localhost:7860', 'User')
[System.Environment]::SetEnvironmentVariable('YOUR_APP_NAME', 'Llama Image Captioner', 'User')

macOS/Linux

Add these lines to your ~/.bashrc, ~/.zshrc, or equivalent:

export OPENROUTER_API_KEY="your_api_key_here"
export YOUR_SITE_URL="http://localhost:7860"
export YOUR_APP_NAME="Llama Image Captioner"

Then reload your shell configuration:

source ~/.bashrc  # or source ~/.zshrc

Running the Application

Make sure your virtual environment is activated:
- Windows: venv\Scripts\activate
- macOS/Linux: source venv/bin/activate
Start the application:

python ImageCaption.py

Open your web browser and navigate to:
- Local URL: http://localhost:7860
- The interface will open automatically in your default browser

Usage

Upload an image using one of these methods:
- Drag and drop an image into the upload area
- Click the upload area to select an image from your files
- Paste an image from your clipboard
Select caption length:
- Short: Brief, concise description
- Long: Detailed analysis of the image
Click "Submit" and wait for the caption to be generated

Troubleshooting

Common Issues

API Key Error
- Ensure you've set the environment variables correctly
- Restart your terminal/command prompt after setting environment variables
- Check if your API key is valid
Import Errors
- Verify that your virtual environment is activated
- Reinstall dependencies: pip install -r requirements.txt
Connection Issues
- Check your internet connection
- Verify that OpenRouter API is accessible from your network

Contributing

Fork the repository
Create your feature branch: git checkout -b feature/AmazingFeature
Commit your changes: git commit -m 'Add some AmazingFeature'
Push to the branch: git push origin feature/AmazingFeature
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Links

GitHub Repository
OpenRouter API
Gradio Documentation

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Llama Image Captioner

Features

Prerequisites

Installation

1. Clone the Repository

2. Set Up Python Environment

Windows

macOS/Linux

3. Configure API Keys

Option 1: Using .env file (Recommended)

Option 2: Using System Environment Variables

Windows (Command Prompt)

Windows (PowerShell)

macOS/Linux

Running the Application

Usage

Troubleshooting

Common Issues

Contributing

License

Links

Files

README.md

Latest commit

History

README.md

File metadata and controls

Llama Image Captioner

Features

Prerequisites

Installation

1. Clone the Repository

2. Set Up Python Environment

Windows

macOS/Linux

3. Configure API Keys

Option 1: Using .env file (Recommended)

Option 2: Using System Environment Variables

Windows (Command Prompt)

Windows (PowerShell)

macOS/Linux

Running the Application

Usage

Troubleshooting

Common Issues

Contributing

License

Links