For SnapDoc AI's voice commands to work, you need to download a Vosk speech recognition model. This guide explains how to set it up.
- Download a model from the Vosk website:
  - Visit the Vosk Models page
  - For most users, the small English model is sufficient: vosk-model-small-en-us-0.15 (~40 MB)
  - For better accuracy, download the larger model: vosk-model-en-us-0.22 (~1.8 GB)
- Create the model directory:
  - Create a .snapdoc-ai folder in your home directory
  - Inside it, create a voice-model folder

  On Windows (Command Prompt):

      mkdir "%USERPROFILE%\.snapdoc-ai"
      mkdir "%USERPROFILE%\.snapdoc-ai\voice-model"

  On macOS/Linux:

      mkdir -p ~/.snapdoc-ai/voice-model
- Extract the model:
  - Extract the downloaded ZIP file
  - Copy the contents of the extracted folder (not the folder itself) into the voice-model folder
- Test the voice recognition:
  - Launch SnapDoc AI
  - Click the "Voice Commands" toggle in the sidebar
  - If everything is set up correctly, you should see "Voice: On" in the status bar
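
The directory and extraction steps above can also be scripted. The following is a minimal sketch, not part of SnapDoc AI itself: the function name `install_model` is hypothetical, and it assumes the downloaded archive wraps the model in a single top-level folder, which the code strips so the model files land directly in voice-model.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

# Model location described in the setup steps.
MODEL_DIR = Path.home() / ".snapdoc-ai" / "voice-model"


def install_model(zip_path: Path, model_dir: Path = MODEL_DIR) -> Path:
    """Extract a model ZIP so its files sit directly inside model_dir.

    Hypothetical helper: the name and signature are illustrative only.
    """
    model_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp_path)
        entries = list(tmp_path.iterdir())
        # Assumption: the archive has one wrapper folder. Copy its
        # contents, not the folder itself.
        if len(entries) == 1 and entries[0].is_dir():
            source = entries[0]
        else:
            source = tmp_path
        for item in source.iterdir():
            target = model_dir / item.name
            if item.is_dir():
                shutil.copytree(item, target, dirs_exist_ok=True)
            else:
                shutil.copy2(item, target)
    return model_dir

# Example call (path is illustrative):
# install_model(Path.home() / "Downloads" / "vosk-model-small-en-us-0.15.zip")
```

Extracting into a temporary directory first means a failed or partial extraction never leaves stray files in the model folder.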
Once voice recognition is enabled, you can use the following commands:
- "Open document" - Opens the file selection dialog
- "Analyze document" - Analyzes the current document
- "Summarize document" - Generates a summary
- "Extract information" - Extracts key information
- "Read document" - Reads the current document aloud
- "Stop reading" - Stops the text-to-speech
- "Pause reading" - Pauses the text-to-speech
- "Resume reading" - Resumes reading
- "Switch to dark mode" - Changes to dark theme
- "Switch to light mode" - Changes to light theme
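
To illustrate how recognized speech could be routed to these commands, here is a hedged sketch of a phrase lookup. It does not reflect SnapDoc AI's actual implementation: the action identifiers and the `match_command` function are hypothetical, and only the phrases themselves come from the list above.

```python
from typing import Optional

# Phrases from the command list, mapped to hypothetical action identifiers.
COMMANDS = {
    "open document": "open_document",
    "analyze document": "analyze_document",
    "summarize document": "summarize_document",
    "extract information": "extract_information",
    "read document": "read_document",
    "stop reading": "stop_reading",
    "pause reading": "pause_reading",
    "resume reading": "resume_reading",
    "switch to dark mode": "set_dark_theme",
    "switch to light mode": "set_light_theme",
}


def match_command(transcript: str) -> Optional[str]:
    """Return the action for a recognized phrase, or None if it is unknown."""
    # Normalize case and collapse whitespace before the lookup.
    normalized = " ".join(transcript.lower().split())
    return COMMANDS.get(normalized)
```

An exact-match lookup like this ignores anything that is not a known phrase, so background speech does not trigger actions by accident.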
If voice features don't work:
- Check installation:
  - Make sure you've installed the voice dependencies by running make install-voice or make install-full
- Verify model location:
  - Ensure the voice model is in the correct location (~/.snapdoc-ai/voice-model/)
  - The directory should contain the model files themselves, not the extracted folder as a subfolder
- Check microphone access:
  - Make sure the application has permission to access the microphone
  - On Windows, check the Privacy & Security settings
  - On macOS, check System Preferences > Security & Privacy > Microphone
- Restart the application:
  - Sometimes a simple restart resolves recognition issues
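
The model-location check can be automated with a small diagnostic. This is a sketch under one assumption: a correctly installed model has several entries at the top level, so a directory holding exactly one subfolder most likely means the extracted folder itself was copied. The function name `check_model_dir` and its return values are hypothetical.

```python
from pathlib import Path


def check_model_dir(model_dir: Path) -> str:
    """Classify the state of the voice model directory.

    Returns "missing", "empty", "nested", or "ok" (illustrative labels).
    """
    if not model_dir.is_dir():
        return "missing"
    entries = list(model_dir.iterdir())
    if not entries:
        return "empty"
    # Heuristic: a lone subfolder suggests the extracted folder was
    # copied in whole instead of its contents.
    if len(entries) == 1 and entries[0].is_dir():
        return "nested"
    return "ok"

# Example call:
# print(check_model_dir(Path.home() / ".snapdoc-ai" / "voice-model"))
```

A "nested" result is fixed by moving the contents of the inner folder up one level.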
When packaging SnapDoc AI as a Windows app, the voice model should be included automatically if it's in the correct location when you run the packaging script. If you're distributing the application, consider including a script that automatically downloads and extracts the appropriate voice model during installation.
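
An installation-time download step might look like the sketch below. It is illustrative only: `download_model` is a hypothetical helper, and the URL is an assumed direct link for the small English model that should be confirmed on the Vosk Models page before use. Extraction then follows the same rule as the manual setup, with the model files placed directly in the voice-model folder.

```python
import shutil
import urllib.request
from pathlib import Path

# Assumed download link; verify it against the Vosk Models page.
MODEL_URL = "https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip"


def download_model(url: str, dest: Path) -> Path:
    """Download the model archive to dest, skipping if it already exists."""
    if dest.exists():
        return dest
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # is never mistaken for a complete archive.
    partial = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(dest)
    return dest

# Example call (destination path is illustrative):
# download_model(MODEL_URL, Path.home() / ".snapdoc-ai" / "model.zip")
```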