- Create your own Telegram bot with @BotFather and copy the bot token
- Edit the file config/telegram.json
{ "username": "BOT USERNAME", "token": "BOT TOKEN", "admins": [ "YOUR TELEGRAM ID" ] }
- Create your own Wit token on the Wit website
- Edit the file config/wit.json (for example, with an Italian token)
{ "it-IT": "WIT TOKEN FOR Italian" }
You can repeat the previous two steps (creating a Wit token and adding it to config/wit.json) to support multiple languages.
You can test whether your token works by running:
python src/audiotools/speech.py wit_api_key some_file.mp3 transcription.txt
- Create your own Yandex Translate token on the Yandex website
- Edit the file config/yandex.json
{ "translate_key": "YOUR YANDEX TOKEN" }
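Before starting the bot, it can help to confirm that the three configuration files exist and contain the keys shown in the examples above. The following is a minimal sketch, not part of the repository: the `check_config` function and the `REQUIRED_KEYS` table are hypothetical names, and the expected keys are assumed only from the example snippets in this README.

```python
import json
from pathlib import Path

# Keys each file is assumed to need, taken from the examples in this README.
# wit.json maps language codes to tokens, so it has no fixed key names.
REQUIRED_KEYS = {
    "telegram.json": {"username", "token", "admins"},
    "wit.json": set(),
    "yandex.json": {"translate_key"},
}


def check_config(config_dir="config"):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for name, required in REQUIRED_KEYS.items():
        path = Path(config_dir) / name
        if not path.is_file():
            problems.append(f"{name}: file not found")
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{name}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(data, dict):
            problems.append(f"{name}: expected a JSON object")
            continue
        missing = required - set(data)
        if missing:
            problems.append(f"{name}: missing keys {sorted(missing)}")
        if name == "wit.json" and not data:
            problems.append(f"{name}: no language tokens configured")
    return problems


if __name__ == "__main__":
    # Print one line per problem; print nothing if the files look fine.
    for problem in check_config():
        print(problem)
```

Run from the repository root, this prints one line per problem and nothing when all three files parse and contain the expected keys. It only checks structure; it does not verify that the tokens themselves are valid.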
We provide prebuilt images on ghcr.io. See run.sh to start a Docker container with the latest release.
Alternatively, you can build the image from the Dockerfile with build.sh.
In run.sh, the container directories config, data and values are bind-mounted from the repository directory. To edit files in these directories, stop the container, make your changes, and then restart the container to apply them.
Tested with: Python 3.12.0
First, install the required dependencies (Ubuntu):
sudo apt install tesseract-ocr libtesseract-dev libleptonica-dev libpython3-dev libzbar-dev
Create a virtual environment and install the required packages:
python3 -m venv transcriber-bot
source transcriber-bot/bin/activate
pip install -r requirements.txt
Run the bot:
cd src
python3 main.py