Voice Recognition Setup for SnapDoc AI

For SnapDoc AI's voice commands to work, you need to download a Vosk speech recognition model. This guide explains how to set it up.

Vosk Model Setup

  1. Download a model from the Vosk website (https://alphacephei.com/vosk/models).

  2. Create the model directory:

    • Create a .snapdoc-ai folder in your home directory
    • Inside it, create a voice-model folder
    # On Windows (Command Prompt; the quotes handle user names that contain spaces):
    mkdir "%USERPROFILE%\.snapdoc-ai\voice-model"
    
    # On macOS/Linux:
    mkdir -p ~/.snapdoc-ai/voice-model
    
  3. Extract the model:

    • Extract the downloaded ZIP file
    • Copy the contents of the extracted folder (not the folder itself) into the voice-model folder
  4. Test the voice recognition:

    • Launch SnapDoc AI
    • Click the "Voice Commands" toggle in the sidebar
    • If everything is set up correctly, you should see "Voice: On" in the status bar
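
Steps 2 and 3 can also be scripted. The following is a minimal sketch, not part of SnapDoc AI itself: the function name install_model is hypothetical, and it assumes Python 3.8 or newer and a model ZIP that you have already downloaded. If the archive wraps everything in a single top-level folder, only that folder's contents are copied, as step 3 requires.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def install_model(zip_path, model_dir=None):
    """Extract a model ZIP so its files sit directly inside model_dir.

    Hypothetical helper; model_dir defaults to ~/.snapdoc-ai/voice-model.
    """
    if model_dir is None:
        model_dir = Path.home() / ".snapdoc-ai" / "voice-model"
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)  # step 2: create the folders

    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)  # step 3: extract the ZIP

        entries = list(Path(tmp).iterdir())
        # If the archive holds one top-level folder, copy its contents,
        # not the folder itself.
        if len(entries) == 1 and entries[0].is_dir():
            source = entries[0]
        else:
            source = Path(tmp)
        shutil.copytree(source, model_dir, dirs_exist_ok=True)

    return model_dir
```

Calling install_model with the path to the downloaded ZIP fills the default folder; pass a second argument to target a different location.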

Voice Commands

Once voice recognition is enabled, you can use the following commands:

  • "Open document" - Opens the file selection dialog
  • "Analyze document" - Analyzes the current document
  • "Summarize document" - Generates a summary
  • "Extract information" - Extracts key information
  • "Read document" - Reads the current document aloud
  • "Stop reading" - Stops the text-to-speech
  • "Pause reading" - Pauses the text-to-speech
  • "Resume reading" - Resumes reading
  • "Switch to dark mode" - Changes to dark theme
  • "Switch to light mode" - Changes to light theme
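
To illustrate how spoken phrases like these can be mapped to actions, here is a small sketch. It is not SnapDoc AI's actual implementation: the action names and the match_command function are hypothetical. It relies only on the fact that Vosk recognizers report results as a JSON string with a "text" field.

```python
import json

# Phrases copied from the list above; the action names are hypothetical.
COMMANDS = {
    "open document": "open_document",
    "analyze document": "analyze_document",
    "summarize document": "summarize_document",
    "extract information": "extract_information",
    "read document": "read_document",
    "stop reading": "stop_reading",
    "pause reading": "pause_reading",
    "resume reading": "resume_reading",
    "switch to dark mode": "set_dark_theme",
    "switch to light mode": "set_light_theme",
}


def match_command(result_json):
    """Return the action for a recognizer result JSON string, or None."""
    text = json.loads(result_json).get("text", "")
    # Normalize case and whitespace before the lookup.
    phrase = " ".join(text.lower().split())
    return COMMANDS.get(phrase)
```

An exact-match lookup like this ignores anything that is not a known command, which keeps background speech from triggering actions.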

Troubleshooting

If voice features don't work:

  1. Check installation:

    • Make sure you've installed voice dependencies with make install-voice or make install-full
  2. Verify model location:

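    • Ensure the voice model is in the correct location (~/.snapdoc-ai/voice-model/ on macOS/Linux, %USERPROFILE%\.snapdoc-ai\voice-model on Windows)
    • The directory should contain the model files directly, not a single subfolder that wraps them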
  3. Check microphone access:

    • Make sure your application has permission to access the microphone
    • On Windows, check Settings > Privacy & security > Microphone
    • On macOS, check System Settings > Privacy & Security > Microphone (System Preferences > Security & Privacy > Privacy > Microphone on older versions)
  4. Restart the application:

    • Sometimes a simple restart resolves recognition issues
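
The model location check in step 2 can be automated. The sketch below is a heuristic, not an official SnapDoc AI tool: check_model_dir is a hypothetical name, and it treats a folder that contains exactly one subfolder and nothing else as the "copied the folder instead of its contents" mistake.

```python
from pathlib import Path


def check_model_dir(model_dir=None):
    """Return "missing", "empty", "nested", or "ok" for the model folder.

    Hypothetical helper; model_dir defaults to ~/.snapdoc-ai/voice-model.
    """
    if model_dir is None:
        model_dir = Path.home() / ".snapdoc-ai" / "voice-model"
    model_dir = Path(model_dir)

    if not model_dir.is_dir():
        return "missing"

    # Ignore hidden files such as .DS_Store.
    entries = [p for p in model_dir.iterdir() if not p.name.startswith(".")]
    if not entries:
        return "empty"
    if len(entries) == 1 and entries[0].is_dir():
        # Likely the extracted folder itself was copied, not its contents.
        return "nested"
    return "ok"
```

A result of "nested" usually means the files inside the single subfolder need to be moved up one level.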

Packaging with Voice Features

When packaging SnapDoc AI as a Windows app, the voice model should be included automatically if it's in the correct location when you run the packaging script. If you're distributing the application, consider including a script that automatically downloads and extracts the appropriate voice model during installation.
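
An installation-time download script could look like the following sketch. It is an illustration under stated assumptions, not the project's packaging script: fetch_model is a hypothetical name, the model URL must be supplied by you (pick a ZIP from the Vosk models page), and Python 3.8 or newer is assumed.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def fetch_model(url, model_dir=None):
    """Download a model ZIP from url and unpack its files into model_dir.

    Hypothetical helper; model_dir defaults to ~/.snapdoc-ai/voice-model.
    """
    if model_dir is None:
        model_dir = Path.home() / ".snapdoc-ai" / "voice-model"
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)

    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "model.zip"
        urllib.request.urlretrieve(url, archive)  # download the ZIP

        unpacked = Path(tmp) / "unpacked"
        shutil.unpack_archive(str(archive), str(unpacked), "zip")

        entries = list(unpacked.iterdir())
        # Copy the contents of a single wrapping folder, not the folder itself.
        if len(entries) == 1 and entries[0].is_dir():
            source = entries[0]
        else:
            source = unpacked
        shutil.copytree(source, model_dir, dirs_exist_ok=True)

    return model_dir
```

An installer would call fetch_model once with the chosen model URL. Model archives can be large, so a real installer may also want a progress indicator and a checksum verification, which this sketch omits.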