-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: STT lang detector #289
Conversation
allows STT plugins to detect the language from audio most plugins don't support this, but some like whisper do and would otherwise require loading 2 models into memory
WalkthroughThe changes introduce significant modifications to the Changes
Suggested reviewers
Poem
Warning Rate limit exceeded@JarbasAl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 13 minutes and 39 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
ovos_plugin_manager/templates/stt.py (2)
42-42
: Consider moving the import statement to the module level for clarityImporting modules inside functions is generally discouraged unless necessary to prevent cyclic dependencies or reduce startup time. Moving the import of
AudioData
to the top of the file improves code readability and follows best practices.Apply this diff to move the import statement:
+from speech_recognition import AudioData def detect_language(self, audio) -> str: - from speech_recognition import AudioData if not isinstance(audio, AudioData): raise TypeError("audio must be an instance of AudioData")
37-39
: Ensure type safety in thebind
methodWhile type hints are helpful, they are not enforced at runtime. Consider adding an explicit type check in the
bind
method to ensure that thedetector
parameter is an instance ofAudioLanguageDetector
. This will prevent potential runtime errors if an incorrect type is passed.Apply this diff to add type checking:
def bind(self, detector: AudioLanguageDetector): + if not isinstance(detector, AudioLanguageDetector): + raise TypeError("detector must be an instance of AudioLanguageDetector") self._detector = detector LOG.debug(f"{self.__class__.__name__} - Assigned lang detector: {detector}")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
ovos_plugin_manager/templates/stt.py
(3 hunks)
🔇 Additional comments (1)
ovos_plugin_manager/templates/stt.py (1)
140-141
: Handle possible exceptions during language detection
When lang
is set to "auto"
, the detect_language
method is called, which may raise a NotImplementedError
if _detector
is not set. Ensure that this exception is properly handled to prevent unexpected crashes. Additionally, consider handling cases where language detection might fail or return None
.
Run the following script to verify that all uses of transcribe
handle exceptions properly:
This script searches for all instances where transcribe
is called and displays the following 5 lines for manual inspection. Ensure that appropriate try-except
blocks are in place where necessary.
allows STT plugins to detect the language from audio most plugins don't support this, but some like whisper do and would otherwise require loading 2 models into memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (2)
ovos_plugin_manager/templates/stt.py (2)
37-39
: Add parameter validation in bind methodWhile the type hint is good, consider adding runtime validation to ensure the detector is properly initialized.
def bind(self, detector: AudioLanguageDetector): + if not isinstance(detector, AudioLanguageDetector): + raise TypeError("detector must be an instance of AudioLanguageDetector") self._detector = detector LOG.debug(f"{self.__class__.__name__} - Assigned lang detector: {detector}")
Line range hint
22-24
: Add documentation for language detection featureConsider adding class-level documentation to explain the language detection capability and how plugin developers can implement it.
class STT(metaclass=ABCMeta): - """ STT Base class, all STT backends derives from this one. """ + """ STT Base class, all STT backends derives from this one. + + Features: + - Basic STT functionality through the execute() method + - Optional language detection through bind(detector) and detect_language(audio) + - Automatic language detection when transcribe() is called with lang="auto" + + Plugin developers can implement language detection by: + 1. Binding a language detector using the bind() method + 2. Ensuring available_languages property returns supported languages + """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
ovos_plugin_manager/templates/stt.py
(3 hunks)
🔇 Additional comments (1)
ovos_plugin_manager/templates/stt.py (1)
15-15
: LGTM: New imports and attribute for language detection
The new imports and private attribute _detector
are well-structured and follow Python conventions.
Also applies to: 19-19, 35-35
allows STT plugins to detect the language from audio most plugins don't support this, but some like whisper do and would otherwise require loading 2 models into memory
companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
ovos_plugin_manager/templates/stt.py (3)
35-40
: Add docstring to the bind methodThe implementation is good, but adding a docstring would improve maintainability.
def bind(self, detector: AudioLanguageDetector): + """Bind a language detector to this STT instance. + + Args: + detector (AudioLanguageDetector): The language detector to use + """ self._detector = detector LOG.debug(f"{self.__class__.__name__} - Assigned lang detector: {detector}")
41-44
: Add return value documentationThe method would benefit from documenting the return tuple format.
def detect_language(self, audio, valid_langs: Optional[Union[Set[str], List[str]]] = None) -> Tuple[str, float]: + """Detect the language of the provided audio. + + Args: + audio: Audio data to analyze + valid_langs: Optional set/list of valid languages to consider + + Returns: + Tuple[str, float]: A tuple containing the detected language code and confidence score + + Raises: + NotImplementedError: If no language detector is bound + """ if self._detector is None: raise NotImplementedError(f"{self.__class__.__name__} does not support audio language detection") return self._detector.detect(audio, valid_langs=valid_langs or self.available_languages)
138-139
: Consider caching language detection resultsFor streaming use cases, consider caching the detected language to avoid redundant detection on subsequent transcriptions of the same audio stream.
+ _last_detected_lang: Optional[Tuple[str, float]] = None + def transcribe(self, audio, lang: Optional[str] = None) -> List[Tuple[str, float]]: if lang is not None and lang == "auto": - lang, prob = self.detect_language(audio, self.available_languages) + # Cache detection result for potential reuse + if not self._last_detected_lang: + self._last_detected_lang = self.detect_language(audio, self.available_languages) + lang, prob = self._last_detected_lang
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
ovos_plugin_manager/templates/stt.py
(3 hunks)
🔇 Additional comments (3)
ovos_plugin_manager/templates/stt.py (3)
11-11
: LGTM: Imports are well organized and necessary
The new imports support type hints and language detection functionality while maintaining clean dependency management.
Also applies to: 15-15, 19-19
143-143
: LGTM: Improved type safety
The more specific return type hint Set[str]
improves type safety and documentation.
Line range hint 241-244
: Verify StreamingSTT implementations
The updated signature needs to be properly handled by all implementations.
✅ Verification successful
StreamingSTT signature change appears safe
The updated transcribe
method signature in StreamingSTT
is an abstract base class (ABC) implementation. From the verification:
- No concrete implementations were found in the codebase
- The base class is marked with
metaclass=ABCMeta
, making it an interface - The signature change maintains compatibility by using
Optional
types
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find all StreamingSTT implementations to verify signature compatibility
# Search for classes that inherit from StreamingSTT
ast-grep --pattern 'class $NAME($$$StreamingSTT$$$) {
$$$
}'
# Search for transcribe method implementations
ast-grep --pattern 'def transcribe(self, audio$$$) {
$$$
}'
Length of output: 128
Script:
#!/bin/bash
# Let's try a different approach using ripgrep to find StreamingSTT implementations
# Search for class definitions inheriting from StreamingSTT
rg "class.*\(.*StreamingSTT.*\)" -l
# Search for files containing StreamingSTT to analyze implementations
rg "StreamingSTT" -l
# Search for transcribe method definitions
rg "def transcribe.*\(" -B1 -A5
Length of output: 1404
Script:
#!/bin/bash
# Let's examine the StreamingSTT class definition and its base class
# Check StreamingSTT base class definition
rg "class StreamingSTT" -A 10 ovos_plugin_manager/templates/stt.py
# Check if there are any concrete implementations in test files
rg "class.*StreamingSTT" -A 10 test/unittests/test_stt.py
# Look for any other potential implementations
rg "class.*StreamingSTT" -A 10 ovos_plugin_manager/stt.py
Length of output: 480
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
ovos_plugin_manager/templates/stt.py (1)
Line range hint
251-254
: Consider adding language detection support to StreamingSTTThe StreamingSTT class's transcribe method should be updated to support language detection similar to the base STT class. This would provide consistent behavior across all STT implementations.
def transcribe(self, audio: Optional = None, lang: Optional[str] = None) -> List[Tuple[str, float]]: """transcribe audio data to a list of possible transcriptions and respective confidences""" + if lang is not None and lang == "auto": + try: + lang, prob = self.detect_language(audio, self.available_languages) + except Exception as e: + LOG.error(f"Language detection failed: {e}. Falling back to default language.") + lang = self.lang # Fall back to default language return [(self.execute(audio, lang), 1.0)]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
ovos_plugin_manager/templates/stt.py
(4 hunks)
🔇 Additional comments (3)
ovos_plugin_manager/templates/stt.py (3)
11-11
: LGTM: Import changes are appropriate
The new imports support type hints and language detection functionality.
Also applies to: 16-19
139-144
: LGTM: Robust language detection implementation
The implementation includes proper error handling with fallback to default language, and the error is appropriately logged.
148-148
: Consider backwards compatibility impact
The return type change from set
to Set[str]
is more precise but could affect existing plugins that don't use type checking. Consider adding a note in the changelog about this breaking change.
Let's check for implementations that might be affected:
companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
* feat: lang detector Update requirements.txt feat/lang_detection_plugin * feat: STT lang detector companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models * feat: STT lang detector companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models * feat: STT lang detector companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
* feat: STT lang detector companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models * feat: STT lang detector companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models * feat: STT lang detector companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models * feat: STT lang detector companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
allows STT plugins to detect the language from audio
most plugins don't support this, but some like whisper do and would otherwise require loading 2 models into memory
related: OpenVoiceOS/ovos-stt-http-server#19
Summary by CodeRabbit
STT
class.transcribe
method to utilize language detection when the language parameter is set to "auto."available_languages
property to reflect improved type hinting.