erturkkadir/hanutchu


hanutchu

AI Speech Interactive Web Application

An interactive web application that enables real-time speech recognition and AI-powered responses, creating a seamless voice-based interaction between users and AI.

🌟 Features

  • Real-time speech recognition using whisper.cpp
  • Natural language processing with AI model integration
  • Voice synthesis for AI responses (Kokoro / F5 TTS)
  • Responsive web design for all devices
  • Low-latency communication
  • Simple and intuitive user interface

🚀 Quick Start

  1. Clone the repository:
git clone https://github.com/erturkkadir/hanutchu.git
cd hanutchu

🛠️ Technology Stack

  • Frontend: Vanilla JavaScript
  • Speech Recognition: whisper.cpp
  • Voice Synthesis: Kokoro / F5 TTS, with voice activity detection on the input side
  • AI Integration: Ollama with any model
  • Styling: Bootstrap CSS
  • Development: Python, WebSocket/WebRTC
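As a rough illustration of the AI integration layer, the sketch below builds a non-streaming request for Ollama's generate endpoint using only the Python standard library. It is not taken from this repository: the function names are hypothetical, the URL assumes Ollama's default local address, and `llama3` is only a placeholder for whichever model you have pulled.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama3", url=OLLAMA_URL):
    """Build (but do not send) a non-streaming request for Ollama's generate endpoint.

    The model name is a placeholder; use any model available on your server.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt, model="llama3", url=OLLAMA_URL, timeout=60):
    """Send the prompt to a running Ollama server and return the generated text."""
    request = build_generate_request(prompt, model, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False the server replies with one JSON object
    # whose "response" field holds the full completion.
    return body["response"]
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a running server. Calling `ask_ollama("Say hello")` requires an Ollama instance with the chosen model already available.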

🔧 System Requirements

  • Modern web browser with microphone and/or webcam access
  • Active internet connection

📖 How It Works

  1. Speech Recognition: The application captures the user's speech in the browser using standard browser APIs
  2. Text Processing: Transcribes the speech to text and prepares it as input for the AI model
  3. AI Processing: Sends the processed text to the AI model and receives its response
  4. Voice Synthesis: Converts the AI response to speech using a TTS model (Kokoro, F5)
  5. User Interface: Updates UI with transcription and response in real-time
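The five steps above can be sketched as a single conversational turn with pluggable stages. This is a minimal illustration, not the project's actual code: `Turn`, `run_turn`, and the three callables are hypothetical names standing in for whisper.cpp transcription, the Ollama call, and the TTS model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Turn:
    """One user utterance and everything derived from it."""
    transcript: str
    reply: str
    audio: bytes


def run_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Turn:
    """Run one speech-to-speech turn through the three backend stages."""
    # Steps 1-2: turn captured audio into clean text.
    transcript = transcribe(audio_in).strip()
    if not transcript:
        # Nothing intelligible was said, so skip the model and TTS calls.
        return Turn("", "", b"")
    # Step 3: ask the language model for a reply.
    reply = generate(transcript)
    # Step 4: convert the reply to audio.
    audio = synthesize(reply)
    # Step 5: the caller uses the returned Turn to update the UI.
    return Turn(transcript, reply, audio)


# Stub stages for demonstration only; real ones would wrap the actual models.
demo = run_turn(
    b"raw-audio-bytes",
    transcribe=lambda audio: " hello ",
    generate=lambda text: text.upper(),
    synthesize=lambda text: text.encode("utf-8"),
)
```

Passing the stages in as functions keeps the flow testable with stubs and makes it straightforward to swap one TTS or speech model for another.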

🤝 Contributing

We welcome contributions! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Development Guidelines

  • Follow the existing code style and conventions
  • Write clear commit messages
  • Add tests for new features
  • Update documentation as needed

📝 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

🎯 Roadmap

  • Add support for multiple languages
  • Implement conversation history
  • Add voice customization options
  • Create offline mode capabilities
  • Improve error handling and recovery
  • Add unit and integration tests

🤔 Support

If you encounter any issues or have questions:

  1. Check the Issues page
  2. Create a new issue if your problem isn't already listed
  3. Join our Discord community

🙏 Acknowledgments

  • ggerganov for the amazing work on llama.cpp and whisper.cpp
  • The Ollama team for their excellent API
  • All our contributors and supporters

Made with ❤️ by KadirErturk
