No Transcription Recording only with VAD #49
Comments
Hey @shahzaib78631, unfortunately the SpeechRecognition APIs on Android 12 and lower don't allow us to use a custom microphone source for recognition. I may be able to use a workaround to at least get the underlying recognized buffers. However, even if I do implement this, Android 12 and lower don't support continuous speech recognition, which is probably a requirement if you're building a communications app. This library is probably not the best fit for VAD either if you're not interested in the voice transcripts (due to the tight integration with the underlying APIs). I'd instead consider looking into the following libraries: Each of these libraries gives you a way to access real-time frame chunks, which you can use to check whether there's voice activity. Implementing a gain filter may be appropriate for your use case. Otherwise, you may want to use a model to process those frames.
Hey @jamsch, thank you for the detailed explanation and quick response. For now, if it's possible and not too time-consuming, could you provide the workaround to get the underlying recognized buffers? I appreciate your recommendations for alternative libraries and I'll definitely look into them. Thank you.
Hey @shahzaib78631, I'm not sure it's even worth implementing the audio capture workaround for Android 12 and lower, since those versions don't support continuous recognition, i.e. you'd have to manually start speech recognition again each time it stops (which will need to happen at least 10 times per minute). For cases like VAD you shouldn't need a process as resource-intensive as speech recognition. Instead, you'd want something that can process each audio frame (using one of the libraries I mentioned above) and then either apply a gain filter to it (which is generally straightforward), send it to an API, or use an on-device model like Cobra: https://github.com/Picovoice/cobra
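As a rough illustration of the gain-filter idea above, here is a minimal sketch of an energy-threshold check. It assumes your audio library delivers frames as 16-bit PCM samples in an `Int16Array`. The names `frameRms` and `EnergyVad`, the `0.02` threshold, and the 10-frame hangover are all hypothetical values chosen for illustration; they are not part of this library or any of the libraries mentioned, and you would need to tune them for your microphone and environment.

```typescript
/**
 * Root-mean-square level of one frame of 16-bit PCM samples,
 * normalized to the range 0..1. (Hypothetical helper.)
 */
function frameRms(frame: Int16Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    const sample = frame[i] / 32768; // scale int16 to roughly -1..1
    sumSquares += sample * sample;
  }
  return Math.sqrt(sumSquares / frame.length);
}

/**
 * Very simple energy-based voice activity detector. (Hypothetical class.)
 * A frame at or above `threshold` marks speech as active; speech is only
 * marked inactive after more than `hangoverFrames` consecutive quiet frames,
 * so short pauses between words don't cut the recording.
 */
class EnergyVad {
  private threshold: number;
  private hangoverFrames: number;
  private quietFrames = 0;
  private speaking = false;

  constructor(threshold: number = 0.02, hangoverFrames: number = 10) {
    this.threshold = threshold; // assumed value, tune per device
    this.hangoverFrames = hangoverFrames; // assumed value, tune per frame size
  }

  /** Feed one frame; returns true while speech is considered active. */
  process(frame: Int16Array): boolean {
    if (frameRms(frame) >= this.threshold) {
      this.speaking = true;
      this.quietFrames = 0;
    } else if (this.speaking) {
      this.quietFrames++;
      if (this.quietFrames > this.hangoverFrames) {
        this.speaking = false;
      }
    }
    return this.speaking;
  }
}
```

You would call `process` on every frame and buffer frames only while it returns `true`, then send the buffered audio to your server. A plain energy threshold will also trigger on loud background noise, which is where a model such as Cobra would do better.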
Thank you, bro
Hello Everyone 🤗.
I have been working on an app where I need to implement VAD (Voice Activity Detection): when the user starts speaking 🗣️, I want to record the audio and send it to the server. I found this package, which can transcribe when the user starts speaking, but I only want the audio, not the transcript. I have also read about the recording property in the package, but it only supports Android 13+. Is there any way to get just the audio with VAD on lower Android versions too?