From video input to audio output, via object detection (YOLOv8, ONNX format), an LLM (ChatGPT, via API) and text-to-speech (fastspeech2-en-ljspeech). You can use a webcam, movie files or YouTube videos as input. Compatible with Mac and Windows, and probably Linux.
Demo video: github_sub_low.mov
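Roughly, the flow is: video frame, YOLOv8 detections, ChatGPT comment, spoken audio. The sketch below is only an illustration under assumptions, not the repository's code: the helper names (`detect_objects`, `speak`), the model name and the openai>=1.0 client are placeholders; in the project the detector runs YOLOv8 in ONNX format and the TTS is fastspeech2-en-ljspeech.

```python
# Minimal sketch of the pipeline; detect_objects and speak are stand-ins,
# NOT the repository's actual implementation.
import cv2
from openai import OpenAI

client = OpenAI(api_key="sk-...")  # the key you pass via -ok


def detect_objects(frame):
    """Stand-in for the YOLOv8 ONNX detector; returns detected class labels."""
    return ["person", "laptop"]  # placeholder output


def speak(text):
    """Stand-in for fastspeech2-en-ljspeech synthesis and audio playback."""
    print(f"[TTS] {text}")


def generate_comment(labels):
    """Ask ChatGPT for a short cynical remark about what was detected."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You make short, cynical remarks."},
            {"role": "user", "content": f"I can see: {', '.join(labels)}."},
        ],
    )
    return response.choices[0].message.content


def run(source=0, comment_every_n_frames=300):
    cap = cv2.VideoCapture(source)  # webcam index, file path or stream URL
    frame_idx = 0
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        labels = detect_objects(frame)
        if labels and frame_idx % comment_every_n_frames == 0:
            speak(generate_comment(labels))
        frame_idx += 1
    cap.release()
```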
Requires python==3.9
If you can leverage your GPU (i.e. all CUDA dependencies are installed), you can substitute onnxruntime with onnxruntime-gpu in requirements.txt.
Got it running with:
- NVIDIA CUDA Driver Version 11.5
- CuDNN library Version 8.3.0
- For Windows: Microsoft Visual C++ (MSVC) compiler
You can install the Python dependencies via
pip install -r requirements.txt
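If you switched to onnxruntime-gpu, a quick sanity check (not part of the project's code) is to confirm that the GPU build and the CUDA execution provider are actually picked up:

```python
# After installing onnxruntime-gpu, the CUDA provider should show up
# in the list of available execution providers.
import onnxruntime as ort

print(ort.get_device())               # "GPU" if the GPU build is active
print(ort.get_available_providers())  # should contain "CUDAExecutionProvider"
```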
You need an OpenAI API key to get it running
- webcam:
python yolo-chat-tts/main.py -ok <your key>
- local video:
python yolo-chat-tts/main.py -ok <your key> -vp "path/to/your/video.mov"
- youtube:
python yolo-chat-tts/main.py -ok <your key> -y "https://www.youtube.com/watch?v=uhkdUdXTUuc"
See all arguments: python yolo-chat-tts/main.py --help
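For the YouTube input, one way to turn a URL into something cv2.VideoCapture can read is to resolve a direct stream URL first, e.g. with yt-dlp. This is a sketch under that assumption, not necessarily how the repository handles it:

```python
# Resolve a direct stream URL with yt-dlp and open it with OpenCV,
# so no file has to be downloaded first.
import cv2
import yt_dlp


def open_youtube(url):
    with yt_dlp.YoutubeDL({"format": "best[ext=mp4]", "quiet": True}) as ydl:
        info = ydl.extract_info(url, download=False)
    return cv2.VideoCapture(info["url"])


cap = open_youtube("https://www.youtube.com/watch?v=uhkdUdXTUuc")
```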
You can (see the sketch after this list):
- choose between multiple camera devices
- set the interval between the cynical comments
- choose whether detections are drawn on the video or only written to the logs
- set a confidence threshold
- set an IoU threshold
- choose the model size
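A rough sketch of how options like these could be exposed via argparse. Only -ok, -vp and -y appear in the examples above; the remaining flag names, defaults and choices are assumptions for illustration, so check --help for the real ones:

```python
# Hypothetical argument layout; only -ok, -vp and -y are taken from the
# usage examples above, everything else is illustrative.
import argparse

parser = argparse.ArgumentParser(description="yolo-chat-tts")
parser.add_argument("-ok", "--openai-key", required=True, help="OpenAI API key")
parser.add_argument("-vp", "--video-path", help="path to a local video file")
parser.add_argument("-y", "--youtube", help="YouTube URL to use as input")
parser.add_argument("--camera-id", type=int, default=0, help="webcam device index")
parser.add_argument("--interval", type=int, default=30, help="seconds between cynical comments")
parser.add_argument("--draw-detections", action="store_true", help="draw detections on the video instead of only logging them")
parser.add_argument("--conf-threshold", type=float, default=0.5, help="confidence threshold for detections")
parser.add_argument("--iou-threshold", type=float, default=0.5, help="IoU threshold for non-max suppression")
parser.add_argument("--model-size", default="n", choices=["n", "s", "m", "l", "x"], help="YOLOv8 model size")
args = parser.parse_args()
```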
Thanks to Tien Luong Ngoc & Ibai Gorordo; I took a bunch of useful code from your linked repositories.