- NVIDIA GPU: Required for optimal performance.
- Python Environment: Python 3.8 or newer installed.
- Internet Connection: To download the necessary model files.
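A quick way to confirm the Python prerequisite is met (a minimal sketch; the 3.8 floor comes from the list above):

```python
import sys

def python_ok(min_version=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= min_version

if __name__ == "__main__":
    print("Python version OK:", python_ok())
```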
Follow these steps on the PC/server where the TTS processing will happen.
- Clone the repository
- git clone https://github.com/daswer123/xtts-api-server
- cd xtts-api-server
- Create a virtual environment
- python -m venv venv
- venv\Scripts\activate (Windows) or source venv/bin/activate (Linux/macOS)
- Install dependencies
- pip install -r requirements.txt
- pip install torch==2.1.1+cu118 torchaudio==2.1.1+cu118 --index-url https://download.pytorch.org/whl/cu118
- Launch the server to confirm it works
- python -m xtts_api_server
- Once the server starts successfully, stop it (Ctrl+C).
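Before launching, a short sketch like this can confirm that the key packages installed above are importable (the module names are the usual ones for these packages; adjust if your environment differs):

```python
import importlib.util

def is_installed(module_name):
    """Return True if the module can be found on the current import path."""
    return importlib.util.find_spec(module_name) is not None

if __name__ == "__main__":
    for mod in ("torch", "torchaudio", "xtts_api_server"):
        print(f"{mod}: {'found' if is_installed(mod) else 'MISSING'}")
```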
- Download all of the model files from the Pyrater/TARS page on Hugging Face:
- config.json
- vocab.json
- model.pth
- etc...
- Organize the Files
- Create a directory named tars inside the XTTS models directory. For example:
- mkdir -p /xtts-api-server/xtts_models/tars
- Place the downloaded files into the tars directory.
- Place reference.wav in the speakers folder and rename it to TARS.wav
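The layout above can be sanity-checked with a short script. The required file names follow the download list above; the folder path in the usage example is an assumption matching the example mkdir command, so adjust it to your setup:

```python
from pathlib import Path

# File names from the Hugging Face download list above.
REQUIRED = ("config.json", "vocab.json", "model.pth")

def missing_model_files(model_dir):
    """Return the required files that are absent from the model directory."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED if not (model_dir / name).is_file()]

if __name__ == "__main__":
    # Assumed path; replace with your actual xtts_models/tars directory.
    missing = missing_model_files("xtts_models/tars")
    print("All model files present" if not missing else f"Missing: {missing}")
```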
Start the server using the following command:
- python -m xtts_api_server --listen --deepspeed --lowvram --model-folder "D:/AI_Tools/xtts-api-server/xtts_models" --model-source local --version tars
- Replace the --model-folder path if your XTTS models directory is located elsewhere.
Test the TARS Voice Model
- With the server running, use the XTTS API server's interface or provided scripts to input text.
- Verify that the audio output emulates TARS's voice.
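The test can also be driven from a script using only the standard library. This is a sketch: the /tts_to_audio/ endpoint, the 8020 port, and the request field names are the server's usual defaults, but verify them against your installed version (the server's /docs page lists the live API):

```python
import json
import urllib.request

def build_payload(text, speaker="TARS", language="en"):
    """Build the JSON body for a synthesis request (field names assumed)."""
    return {"text": text, "speaker_wav": speaker, "language": language}

def synthesize(text, out_path="tars_output.wav",
               url="http://localhost:8020/tts_to_audio/"):
    """POST the text to the running server and save the returned WAV."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())

if __name__ == "__main__":
    # With the server running, this saves TARS speech to tars_output.wav:
    # synthesize("Humor setting at seventy-five percent.")
    print(build_payload("Hello there."))
```

Listen to the resulting WAV to confirm the output matches TARS's voice.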
Additional Resources
- XTTS API Server GitHub Repository
- Local Voice Cloning Using XTTS API Server - Video Tutorial