Minimum Hardware Prerequisites

  1. Raspberry Pi: For obvious reasons.
  2. USB Microphone: For user input.
  3. Speaker: For TARS output.

Environment Setup Guide (IN DEVELOPMENT)

1. Set Up the TARS-AI Repository on Raspberry Pi

A. Clone the Repository

  1. Open a terminal on your Raspberry Pi.
  2. Clone the TARS-AI repository:
    git clone https://github.com/pyrater/TARS-AI.git
  3. Navigate to the cloned directory:
    cd TARS-AI

B. Install System-Level Dependencies

These dependencies are required for various operations, including Selenium-based automation, audio processing, and format handling.

  1. Update Your System: Ensure your package lists and installed software are up to date:

    sudo apt update
    sudo apt upgrade -y
  2. Install Chromium: Chromium is the browser required for Selenium-based web automation:

    sudo apt install -y chromium-browser
  3. Install Chromedriver for Selenium: Chromedriver allows Selenium to control Chromium:

    sudo apt install -y chromium-chromedriver
  4. Install SoX and Format Support Libraries: SoX is a command-line tool for processing audio files.

    sudo apt install -y sox libsox-fmt-all
  5. Install PortAudio Development Libraries: PortAudio is a cross-platform audio input/output library.

    sudo apt install -y portaudio19-dev
  6. Verify Installations: Confirm that the installed packages are functioning:

    • Check Chromium version:
      chromium-browser --version
    • Check Chromedriver version:
      chromedriver --version
    • Check SoX version:
      sox --version
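
If you prefer a single command, steps 2-5 can be combined:

sudo apt install -y chromium-browser chromium-chromedriver sox libsox-fmt-all portaudio19-dev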

C. Set Up the Python Environment

  1. Create a virtual environment:
    python3 -m venv venv
  2. Activate the virtual environment:
    source venv/bin/activate
  3. Install the required dependencies listed in src/requirements.txt:
    pip install -r src/requirements.txt

D. Connect Hardware

  1. Connect your microphone to the Raspberry Pi via USB.
  2. Connect your speaker to the Raspberry Pi using the audio output or Bluetooth.
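
To confirm the Raspberry Pi sees both devices, you can list the ALSA capture and playback devices (these utilities ship with Raspberry Pi OS):

# Capture devices: your USB microphone should appear here
arecord -l
# Playback devices: your speaker should appear here
aplay -l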

E. Set the API Key in a .env File (Recommended for Secure Key Management)

Create a .env file at the root of your repository based on the pre-existing .env.template file to store your API keys for your LLM and TTS service.
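
For example, from the repository root:

cp .env.template .env
nano .env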

.env Template: Add the following lines to your .env file, replacing each placeholder value with the actual API key for the service(s) you use:

# LLM
OPENAI_API_KEY="your-actual-openai-api-key"
OOBA_API_KEY="your-actual-ooba-api-key"
TABBY_API_KEY="your-actual-tabby-api-key"

# TTS
AZURE_API_KEY="your-actual-azure-api-key"
  • Set up an OpenAI API Key (very small cost).
  • Set up an Azure Speech API Key (FREE).
    • Make sure to create a free Azure account first (Free Azure Signup).
    • Follow all the steps in the video up to "Install Azure Speech Python package".
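
To sanity-check that the keys are being picked up, you can load the file with python-dotenv (assuming it is installed by requirements.txt; swap in the variable for your chosen backend):

# Prints True if OPENAI_API_KEY was loaded from .env
python3 -c "from dotenv import load_dotenv; import os; load_dotenv(); print(bool(os.getenv('OPENAI_API_KEY')))"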

F. Set the config.ini Parameters

  1. Create a config.ini file in the src/ folder based on the pre-existing config.ini.template file (see the copy command after this list).

  2. Locate the [LLM] section and update the parameters (for OpenAI):

    [LLM] # Large Language Model configuration (ooba/OAI or tabby)
    llm_backend = openai
    # Set this to `openai` if using OpenAI models.
    base_url = https://api.openai.com
    # The URL for the OpenAI API.
    openai_model = gpt-4o-mini
    # Specify the OpenAI model to use (e.g., gpt-4o-mini or another supported model).
  3. Locate the [TTS] section and update the parameters:

    [TTS] # Text-to-Speech configuration 
    ttsoption = azure
    # TTS backend option: [azure, local, xttsv2, TARS]
    azure_region = eastus
    # Azure region for Azure TTS (e.g., eastus)
    ...
    tts_voice = en-US-Steffan:DragonHDLatestNeural
    # Name of the cloned voice to use (e.g., TARS2)
  • tts_voice: You can find other voices available with Azure in the Azure documentation.
    • If en-US-Steffan:DragonHDLatestNeural gives you an error, try en-US-SteffanNeural.
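
As noted in step 1, a quick way to create the file is to copy the template and then edit it (run from the repository root):

cp src/config.ini.template src/config.ini
nano src/config.ini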

G. Run the Program

  1. Navigate to the src/ folder within the repository:
    cd src/
  2. Start the application:
    python app.py
  3. The program should now be running and ready to interact using your microphone and speaker.
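
For reference, the full launch sequence from a fresh terminal (using the virtual environment created in step C) is:

cd TARS-AI
source venv/bin/activate
cd src/
python app.py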

(OPTIONAL) Set up XTTS Server

1. Prepare Your PC with NVIDIA GPU

The TTS server must run on your GPU-enabled PC due to its computational requirements.

  1. Ensure Python 3.9-3.12 is installed on your PC.
  2. Install CUDA and cuDNN compatible with your NVIDIA GPU (see the CUDA Installation guide).
  3. Install a PyTorch build compatible with your CUDA and cuDNN versions (see the PyTorch Installation guide; an example follows this list).
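
As an illustration only (the exact command depends on your CUDA version; copy the matching one from the PyTorch site), a CUDA 12.1 install and a quick GPU check might look like:

# Install the CUDA 12.1 build of PyTorch; swap cu121 for your CUDA version
pip install torch --index-url https://download.pytorch.org/whl/cu121
# Should print True if PyTorch can see the GPU
python -c "import torch; print(torch.cuda.is_available())"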

2. Set Up XTTS API Server

A. Install XTTS API Server

Run the following command on your GPU-enabled PC to clone the XTTS API Server repository:

git clone https://github.com/daswer123/xtts-api-server.git

Follow the installation guide for your operating system:

  • Windows:
    1. Create and activate a virtual environment:
      python -m venv venv
      venv\Scripts\activate
    2. Install xtts-api-server:
      pip install xtts-api-server
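  • Linux/macOS (equivalent steps, for reference; same package):
    1. Create and activate a virtual environment:
      python -m venv venv
      source venv/bin/activate
    2. Install xtts-api-server:
      pip install xtts-api-server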

For more details, refer to the official XTTS API Server Installation Guide.

B. Add the TARS.wav Speaker File

  1. Download the TARS-Short.wav and TARS-Long.wav files from the TARS-AI repository under src/tts/wakewords/VoiceClones. These will be the different voices you can use for TARS.
  2. Place them in the speakers/ directory within the XTTS project folder. If the directory does not exist, create it (a sketch follows).
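
For example, on Linux or macOS (assuming the two .wav files are in your current directory; on Windows, copy them with Explorer):

# Create the speakers/ directory if it is missing, then copy the voice files in
mkdir -p speakers
cp TARS-Short.wav TARS-Long.wav speakers/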

C. Start the XTTS API Server

  1. Open a terminal in the xtts-api-server project directory.
  2. Activate your virtual environment if it is not already active (on Windows: venv\Scripts\activate).
  3. Start the XTTS API Server:
    python -m xtts_api_server --listen --port 8020
  4. Once the server is running, open a browser and navigate to:
    http://localhost:8020/docs
    
  5. This will open the API's Swagger documentation interface, which you can use to test the server and its endpoints.

D. Verify the Server

  1. Locate the GET /speakers endpoint in the API documentation.
  2. Click "Try it out" and then "Execute" to test the endpoint.
  3. Ensure the response includes the TARS-Short and TARS-Long speaker files, with entries similar to:
    [
      {
        "name": "TARS-Long",
        "voice_id": "TARS-Long",
        "preview_url": "http://localhost:8020/sample/TARS-Long.wav"
      },
      {
        "name": "TARS-Short",
        "voice_id": "TARS-Short",
        "preview_url": "http://localhost:8020/sample/TARS-Short.wav"
      }
    ]
  4. Locate the POST /tts_to_audio endpoint in the API documentation.
  5. Click "Try it out" and input the following JSON in the Request Body:
    {
        "text": "Hello, this is TARS speaking.",
        "speaker_wav": "TARS-Short",
        "language": "en"
    }
  6. Click "Execute" to send the request.
  7. Check the response for a generated audio file. You should see a download field where you can download and listen to the audio output.
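
If you prefer the command line, the same two checks can be run with curl against the endpoints documented above (assuming the server is listening on localhost:8020):

# List registered speakers; TARS-Short and TARS-Long should appear
curl http://localhost:8020/speakers

# Request synthesis and save the generated audio to a file
curl -X POST http://localhost:8020/tts_to_audio \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello, this is TARS speaking.", "speaker_wav": "TARS-Short", "language": "en"}' \
  --output tars_test.wav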