Hotword detection for the PinguimBots robot Theta. A hotword is also called a wakeword.
Among the libraries tested below, Porcupine stands out: it detects the spoken wakeword quickly.
- Snowboy (by KITT.AI)
  - discontinued
- PocketSphinx (by CMUSphinx)
  - not very accurate
  - offline
  - open source
  - PyPI: https://pypi.org/project/pocketsphinx/
- Pvporcupine
  - needs an AccessKey
  - only three models can be trained per month
  - a model can be used on only 3 computers per month
  - a new ppn file has to be trained every month
  - Two types:
    - Porcupine wake word
      - fast detection
      - efficient
      - Source: https://picovoice.ai/docs/quick-start/porcupine-go/
    - Rhino Speech-to-Intent
      - limited to 10 commands in the free version
      - Source: https://picovoice.ai/platform/rhino/
- EfficientWord-Net
  - needs to train a model for specific words
  - models for words like Alexa, Siri, Mycrosoft
  - Source code: https://github.com/Ant-Brain/EfficientWord-Net
- Mycroft Precise
  - requires Python 3.6 to 3.9
  - not precise
$ pip install -r requirements.txt
It also requires the file din-ding.mp3
and the ppn file from the Picovoice console.
Please read the Notes below before you run this code.
$ git clone https://github.com/pinguimbotsathome/Hotword
$ cd Hotword
$ python porcupine.py
It uses pvrecorder to open the mic and start listening to the audio stream. When it hears the wakeword "Hey Theta", it prints Detected on the screen and plays the audio file din-ding.mp3.
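The flow described above can be sketched roughly as follows. This is a minimal illustration, not the contents of porcupine.py: the function names `listen_for_wakeword` and `main`, the `max_frames` argument, the `"YOUR_ACCESS_KEY"` placeholder, and the file name `hey-theta.ppn` are assumptions made for the example. Playing din-ding.mp3 is left out, since it depends on which audio library the script uses.

```python
def listen_for_wakeword(porcupine, recorder, on_detect, max_frames=None):
    """Read audio frames and call on_detect() each time the wakeword is found.

    porcupine: object whose process(pcm) returns a keyword index, or -1 if none
    recorder:  object with start(), read() and stop()
    max_frames: optional limit on frames read (hypothetical, handy for testing)
    Returns the number of detections.
    """
    detections = 0
    frames = 0
    recorder.start()
    try:
        while max_frames is None or frames < max_frames:
            pcm = recorder.read()            # one frame of 16-bit samples
            frames += 1
            if porcupine.process(pcm) >= 0:  # index >= 0 means a keyword matched
                detections += 1
                on_detect()
    finally:
        recorder.stop()
    return detections


def main():
    # Imported here so the loop above can be tried without audio hardware.
    import pvporcupine
    from pvrecorder import PvRecorder

    porcupine = pvporcupine.create(
        access_key="YOUR_ACCESS_KEY",     # placeholder, use your own key
        keyword_paths=["hey-theta.ppn"],  # hypothetical file name
        sensitivities=[0.55],
    )
    recorder = PvRecorder(frame_length=porcupine.frame_length, device_index=-1)
    try:
        listen_for_wakeword(porcupine, recorder, lambda: print("Detected"))
    except KeyboardInterrupt:
        pass
    finally:
        recorder.delete()
        porcupine.delete()
```

Calling `main()` would run the loop against the real microphone until interrupted with Ctrl+C. Passing the detector and recorder in as arguments keeps the loop itself independent of the hardware.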
These instructions are crucial to run the script. Set the following in the code:
- key => the AccessKey from your Picovoice account. Create an account at Picovoice and copy the AccessKey.
- keyword_p => path to the ppn file. Go to the Picovoice console and train the wakeword "Hey Theta". Choose Linux as the device and download the file. Put the file inside the Hotword directory.
- sensitivities => a number between 0 and 1. Default is 0.55. A higher sensitivity results in fewer misses at the cost of a higher false alarm rate.
- device_index => the default is -1. Change it to match your input device. It is usually 1.
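As a quick sanity check before running the script, the four settings can be validated up front. The sketch below is only an illustration: the `HotwordSettings` class and the `check_settings` helper are hypothetical and not part of the repository, and the rules simply mirror the notes above (the `.ppn` extension check assumes the downloaded file keeps its original extension).

```python
from dataclasses import dataclass


@dataclass
class HotwordSettings:
    key: str                     # AccessKey from the Picovoice account
    keyword_p: str               # path to the downloaded ppn file
    sensitivities: float = 0.55  # default stated in the notes
    device_index: int = -1       # -1 is the default input device


def check_settings(settings):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not settings.key.strip():
        problems.append("key is empty")
    if not settings.keyword_p.endswith(".ppn"):
        problems.append("keyword_p should point to a .ppn file")
    if not 0.0 <= settings.sensitivities <= 1.0:
        problems.append("sensitivities must be between 0 and 1")
    if settings.device_index < -1:
        problems.append("device_index must be -1 or a device number")
    return problems
```

For example, `check_settings(HotwordSettings("my-key", "hey-theta.ppn"))` returns an empty list, while a sensitivity of 1.5 would be reported as a problem.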