feat: STT lang detector #289

JarbasAl · 2024-11-24T16:02:50Z

Allows STT plugins to detect the language from audio.

Most plugins don't support this, but some (like Whisper) do, and without this change they would need to load 2 models into memory.

related: OpenVoiceOS/ovos-stt-http-server#19

Summary by CodeRabbit

New Features
- Introduced automatic language detection for audio transcription.
- Added a method to bind a language detector to the speech-to-text functionality.
Bug Fixes
- Updated deprecation warnings for several STT classes to encourage subclassing from the new STT class.
Improvements
- Enhanced the transcribe method to utilize language detection when the language parameter is set to "auto."
- Updated the available_languages property to reflect improved type hinting.


coderabbitai · 2024-11-24T16:02:57Z

Walkthrough

The changes modify the STT class in ovos_plugin_manager/templates/stt.py. A new private attribute, _detector, holds an optional language detector, and two new methods support it: bind assigns a detector, and detect_language detects the language from audio data. The transcribe method now detects the language automatically when lang is set to "auto". Additionally, the deprecation warnings on TokenSTT, GoogleJsonSTT, BasicSTT, and KeySTT are updated to indicate that plugins should subclass STT directly, and LOG and AudioLanguageDetector are imported to support the new functionality.

Changes

File Path	Change Summary
ovos_plugin_manager/templates/stt.py	- Added private attribute `_detector` to `STT` class.
	- Introduced `bind` method for assigning a language detector.
	- Added `detect_language` method for language detection from audio data.
	- Updated `transcribe` method to detect language automatically if `lang` is "auto".
	- Updated `available_languages` property to return `Set[str]` instead of `set`.
	- Updated `TokenSTT`, `GoogleJsonSTT`, `BasicSTT`, and `KeySTT` classes to reflect deprecation warnings.
	- Imported `LOG` and `AudioLanguageDetector` to support new functionalities.
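
To make the change summary concrete, here is a minimal sketch of how bind, detect_language, and transcribe fit together. It is not the actual template code: SketchSTT, FixedDetector, and the stand-in AudioLanguageDetector are hypothetical classes written for illustration, and only the method names, the "auto" convention, and the detect(audio, valid_langs=...) call are taken from the PR.

```python
from typing import List, Optional, Set, Tuple


class AudioLanguageDetector:
    """Stand-in for the detector interface (assumption, not the real base class)."""

    def detect(self, audio, valid_langs=None) -> Tuple[str, float]:
        raise NotImplementedError


class FixedDetector(AudioLanguageDetector):
    """Hypothetical detector that always answers Portuguese."""

    def detect(self, audio, valid_langs=None) -> Tuple[str, float]:
        return "pt", 0.9


class SketchSTT:
    """Simplified stand-in for the STT template, for illustration only."""

    def __init__(self, lang: str = "en"):
        self.lang = lang
        self._detector: Optional[AudioLanguageDetector] = None

    @property
    def available_languages(self) -> Set[str]:
        return {"en", "pt"}

    def bind(self, detector: AudioLanguageDetector) -> None:
        # Attach a detector; a plugin can share one already-loaded model here.
        self._detector = detector

    def detect_language(self, audio, valid_langs=None) -> Tuple[str, float]:
        if self._detector is None:
            raise NotImplementedError("no language detector bound")
        return self._detector.detect(
            audio, valid_langs=valid_langs or self.available_languages)

    def execute(self, audio, language: Optional[str] = None) -> str:
        # A real plugin would run its speech model here.
        return f"<transcript in {language or self.lang}>"

    def transcribe(self, audio, lang: Optional[str] = None) -> List[Tuple[str, float]]:
        if lang == "auto":
            lang, _prob = self.detect_language(audio)
        return [(self.execute(audio, lang), 1.0)]


stt = SketchSTT()
stt.bind(FixedDetector())
result = stt.transcribe(b"fake-audio", lang="auto")
print(result)
```

With the detector bound, lang="auto" resolves to the detected code before execute is called; passing an explicit language skips detection entirely.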

Suggested reviewers

goldyfruit
j1nx

Poem

In the land of code where rabbits play,
A new detector hops in, brightening the day.
With languages detected, our tasks now flow,
Transcribing with ease, watch our skills grow!
So here’s to the changes, let’s give a cheer,
For every new feature brings us more near! 🐇✨


coderabbitai

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)

ovos_plugin_manager/templates/stt.py (2)
42-42: Consider moving the import statement to the module level for clarity

Importing modules inside functions is generally discouraged unless necessary to prevent cyclic dependencies or reduce startup time. Moving the import of AudioData to the top of the file improves code readability and follows best practices.

Apply this diff to move the import statement:
+from speech_recognition import AudioData

 def detect_language(self, audio) -> str:
-    from speech_recognition import AudioData
     if not isinstance(audio, AudioData):
         raise TypeError("audio must be an instance of AudioData")
37-39: Ensure type safety in the bind method

While type hints are helpful, they are not enforced at runtime. Consider adding an explicit type check in the bind method to ensure that the detector parameter is an instance of AudioLanguageDetector. This will prevent potential runtime errors if an incorrect type is passed.

Apply this diff to add type checking:
 def bind(self, detector: AudioLanguageDetector):
+    if not isinstance(detector, AudioLanguageDetector):
+        raise TypeError("detector must be an instance of AudioLanguageDetector")
     self._detector = detector
     LOG.debug(f"{self.__class__.__name__} - Assigned lang detector: {detector}")
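
As a rough illustration of what the suggested runtime check buys, the sketch below uses a stand-in AudioLanguageDetector and a hypothetical CheckedSTT class (neither is the real template code) to show that a wrong argument is rejected at bind time rather than failing later during transcription.

```python
class AudioLanguageDetector:
    """Stand-in base class, for illustration only."""


class CheckedSTT:
    """Hypothetical class carrying only the bind logic from the suggestion."""

    def __init__(self):
        self._detector = None

    def bind(self, detector: AudioLanguageDetector) -> None:
        # Type hints are not enforced at runtime, so check explicitly.
        if not isinstance(detector, AudioLanguageDetector):
            raise TypeError("detector must be an instance of AudioLanguageDetector")
        self._detector = detector


stt = CheckedSTT()
try:
    stt.bind("not a detector")
    rejected = False
except TypeError:
    rejected = True

stt.bind(AudioLanguageDetector())  # a valid detector is accepted
print(rejected)
```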

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ea81695 and 4290e19.

📒 Files selected for processing (1)

ovos_plugin_manager/templates/stt.py (3 hunks)

🔇 Additional comments (1)

ovos_plugin_manager/templates/stt.py (1)

140-141: Handle possible exceptions during language detection

When lang is set to "auto", the detect_language method is called, which may raise a NotImplementedError if _detector is not set. Ensure that this exception is properly handled to prevent unexpected crashes. Additionally, consider handling cases where language detection might fail or return None.
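
One way to handle these failure modes is to resolve "auto" in a small helper that falls back to a default language. The sketch below is an assumption about how that could look, not the code merged in this PR; resolve_lang and no_detector are hypothetical names.

```python
import logging
from typing import Callable, Optional, Tuple

LOG = logging.getLogger("stt-sketch")


def resolve_lang(lang: Optional[str],
                 detect: Callable[[], Tuple[str, float]],
                 default_lang: str) -> str:
    """Resolve "auto" to a concrete language code, falling back on failure."""
    if lang != "auto":
        return lang or default_lang
    try:
        detected, _prob = detect()
    except Exception as e:
        # Covers NotImplementedError (no detector bound) and a detector
        # that returns None instead of a (lang, confidence) tuple.
        LOG.error(f"Language detection failed: {e}. Falling back to default language.")
        return default_lang
    return detected or default_lang


def no_detector() -> Tuple[str, float]:
    # Mimics detect_language when no detector has been bound.
    raise NotImplementedError("no detector bound")


print(resolve_lang("auto", no_detector, "en"))
print(resolve_lang("auto", lambda: ("pt", 0.8), "en"))
```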

Run the following script to verify that all uses of transcribe handle exceptions properly:

This script searches for all instances where transcribe is called and displays the following 5 lines for manual inspection. Ensure that appropriate try-except blocks are in place where necessary.


coderabbitai

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)

ovos_plugin_manager/templates/stt.py (2)

37-39: Add parameter validation in bind method

While the type hint is good, consider adding runtime validation to ensure the detector is properly initialized.

 def bind(self, detector: AudioLanguageDetector):
+    if not isinstance(detector, AudioLanguageDetector):
+        raise TypeError("detector must be an instance of AudioLanguageDetector")
     self._detector = detector
     LOG.debug(f"{self.__class__.__name__} - Assigned lang detector: {detector}")

Line range hint 22-24: Add documentation for language detection feature

Consider adding class-level documentation to explain the language detection capability and how plugin developers can implement it.

 class STT(metaclass=ABCMeta):
-    """ STT Base class, all  STT backends derives from this one. """
+    """ STT Base class, all STT backends derive from this one.
+    
+    Features:
+    - Basic STT functionality through the execute() method
+    - Optional language detection through bind(detector) and detect_language(audio)
+    - Automatic language detection when transcribe() is called with lang="auto"
+    
+    Plugin developers can implement language detection by:
+    1. Binding a language detector using the bind() method
+    2. Ensuring available_languages property returns supported languages
+    """

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 4290e19 and 62eb083.

📒 Files selected for processing (1)

ovos_plugin_manager/templates/stt.py (3 hunks)

🔇 Additional comments (1)

ovos_plugin_manager/templates/stt.py (1)

15-15: LGTM: New imports and attribute for language detection

The new imports and private attribute _detector are well-structured and follow Python conventions.

Also applies to: 19-19, 35-35


coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)

ovos_plugin_manager/templates/stt.py (3)

35-40: Add docstring to the bind method

The implementation is good, but adding a docstring would improve maintainability.

 def bind(self, detector: AudioLanguageDetector):
+    """Bind a language detector to this STT instance.
+    
+    Args:
+        detector (AudioLanguageDetector): The language detector to use
+    """
     self._detector = detector
     LOG.debug(f"{self.__class__.__name__} - Assigned lang detector: {detector}")

41-44: Add return value documentation

The method would benefit from documenting the return tuple format.

 def detect_language(self, audio, valid_langs: Optional[Union[Set[str], List[str]]] = None) -> Tuple[str, float]:
+    """Detect the language of the provided audio.
+    
+    Args:
+        audio: Audio data to analyze
+        valid_langs: Optional set/list of valid languages to consider
+    
+    Returns:
+        Tuple[str, float]: A tuple containing the detected language code and confidence score
+    
+    Raises:
+        NotImplementedError: If no language detector is bound
+    """
     if self._detector is None:
         raise NotImplementedError(f"{self.__class__.__name__} does not support audio language detection")
     return self._detector.detect(audio, valid_langs=valid_langs or self.available_languages)
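
To illustrate the documented contract (a language code plus a confidence score, restricted to valid_langs), here is a sketch of a detector that honours it. ScoreTableDetector and its fixed score table are hypothetical; a real detector would obtain the scores from a model.

```python
from typing import Dict, List, Optional, Set, Tuple, Union


class ScoreTableDetector:
    """Hypothetical detector: picks the best-scoring language among valid_langs."""

    def __init__(self, scores: Dict[str, float]):
        self.scores = scores  # pretend these probabilities came from a model

    def detect(self, audio,
               valid_langs: Optional[Union[Set[str], List[str]]] = None) -> Tuple[str, float]:
        # Keep only the languages the caller is willing to accept.
        candidates = {lang: p for lang, p in self.scores.items()
                      if valid_langs is None or lang in valid_langs}
        if not candidates:
            raise ValueError("no candidate language is in valid_langs")
        best = max(candidates, key=candidates.get)
        return best, candidates[best]


detector = ScoreTableDetector({"en": 0.6, "pt": 0.3, "es": 0.1})
print(detector.detect(b"fake-audio", valid_langs={"pt", "es"}))
```

Restricting the candidates before picking the maximum means a plugin never returns a language it cannot transcribe, which is why detect_language passes available_languages by default.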

138-139: Consider caching language detection results

For streaming use cases, consider caching the detected language to avoid redundant detection on subsequent transcriptions of the same audio stream.

+    _last_detected_lang: Optional[Tuple[str, float]] = None
+
     def transcribe(self, audio, lang: Optional[str] = None) -> List[Tuple[str, float]]:
         if lang is not None and lang == "auto":
-            lang, prob = self.detect_language(audio, self.available_languages)
+            # Cache detection result for potential reuse
+            if not self._last_detected_lang:
+                self._last_detected_lang = self.detect_language(audio, self.available_languages)
+            lang, prob = self._last_detected_lang
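
A cache like this needs some way to be cleared, otherwise the first detected language would stick for every later utterance. The sketch below shows one possible shape with an explicit reset; CachedLangDetection is a hypothetical helper, not part of the PR.

```python
from typing import Callable, Optional, Tuple


class CachedLangDetection:
    """Caches one detection result until reset() is called (e.g. at stream end)."""

    def __init__(self, detect: Callable[[bytes], Tuple[str, float]]):
        self._detect = detect
        self._last: Optional[Tuple[str, float]] = None
        self.calls = 0  # only here to make the caching visible

    def get(self, audio: bytes) -> Tuple[str, float]:
        if self._last is None:
            self.calls += 1
            self._last = self._detect(audio)
        return self._last

    def reset(self) -> None:
        self._last = None


cache = CachedLangDetection(lambda audio: ("pt", 0.9))
cache.get(b"chunk-1")
cache.get(b"chunk-2")   # served from the cache, no second detection
calls_before_reset = cache.calls
cache.reset()           # new stream: detect again
cache.get(b"chunk-3")
print(calls_before_reset, cache.calls)
```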

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 62eb083 and e280939.

📒 Files selected for processing (1)

ovos_plugin_manager/templates/stt.py (3 hunks)

🔇 Additional comments (3)

ovos_plugin_manager/templates/stt.py (3)

11-11: LGTM: Imports are well organized and necessary

The new imports support type hints and language detection functionality while maintaining clean dependency management.

Also applies to: 15-15, 19-19

143-143: LGTM: Improved type safety

The more specific return type hint Set[str] improves type safety and documentation.
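
Since Python does not enforce annotations at runtime, a plugin that still annotates the property with a bare set keeps working at runtime. A minimal sketch with hypothetical Base and LegacyPlugin classes:

```python
from typing import Set


class Base:
    @property
    def available_languages(self) -> Set[str]:
        return set()


class LegacyPlugin(Base):
    @property
    def available_languages(self) -> set:  # older, unparameterized annotation
        return {"en", "pt"}


langs = LegacyPlugin().available_languages
print(sorted(langs))
```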

Line range hint 241-244: Verify StreamingSTT implementations

The updated signature needs to be properly handled by all implementations.

✅ Verification successful

StreamingSTT signature change appears safe

StreamingSTT is an abstract base class (ABC), and the updated transcribe signature is defined on it. From the verification:

- No concrete implementations were found in the codebase.
- The base class is declared with metaclass=ABCMeta, making it an interface.
- The signature change maintains compatibility by using Optional types.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all StreamingSTT implementations to verify signature compatibility

# Search for classes that inherit from StreamingSTT
ast-grep --pattern 'class $NAME($$$StreamingSTT$$$) {
  $$$
}'

# Search for transcribe method implementations
ast-grep --pattern 'def transcribe(self, audio$$$) {
  $$$
}'

Length of output: 128

Script:

#!/bin/bash
# Let's try a different approach using ripgrep to find StreamingSTT implementations

# Search for class definitions inheriting from StreamingSTT
rg "class.*\(.*StreamingSTT.*\)" -l

# Search for files containing StreamingSTT to analyze implementations
rg "StreamingSTT" -l

# Search for transcribe method definitions
rg "def transcribe.*\(" -B1 -A5

Length of output: 1404

Script:

#!/bin/bash
# Let's examine the StreamingSTT class definition and its base class

# Check StreamingSTT base class definition
rg "class StreamingSTT" -A 10 ovos_plugin_manager/templates/stt.py

# Check if there are any concrete implementations in test files
rg "class.*StreamingSTT" -A 10 test/unittests/test_stt.py

# Look for any other potential implementations
rg "class.*StreamingSTT" -A 10 ovos_plugin_manager/stt.py

Length of output: 480

coderabbitai

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)

ovos_plugin_manager/templates/stt.py (1)

Line range hint 251-254: Consider adding language detection support to StreamingSTT

The StreamingSTT class's transcribe method should be updated to support language detection similar to the base STT class. This would provide consistent behavior across all STT implementations.

 def transcribe(self, audio: Optional = None,
                lang: Optional[str] = None) -> List[Tuple[str, float]]:
     """transcribe audio data to a list of
     possible transcriptions and respective confidences"""
+    if lang is not None and lang == "auto":
+        try:
+            lang, prob = self.detect_language(audio, self.available_languages)
+        except Exception as e:
+            LOG.error(f"Language detection failed: {e}. Falling back to default language.")
+            lang = self.lang  # Fall back to default language
     return [(self.execute(audio, lang), 1.0)]

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between e280939 and b526f89.

📒 Files selected for processing (1)

ovos_plugin_manager/templates/stt.py (4 hunks)

🔇 Additional comments (3)

ovos_plugin_manager/templates/stt.py (3)

11-11: LGTM: Import changes are appropriate

The new imports support type hints and language detection functionality.

Also applies to: 16-19

139-144: LGTM: Robust language detection implementation

The implementation includes proper error handling with fallback to default language, and the error is appropriately logged.

148-148: Consider backwards compatibility impact

The return type change from set to Set[str] is more precise but could affect existing plugins that don't use type checking. Consider adding a note in the changelog about this breaking change.

Let's check for implementations that might be affected:


feat: STT lang detector

4290e19


github-actions bot added feature and removed feature labels Nov 24, 2024

coderabbitai bot reviewed Nov 24, 2024


feat: STT lang detector

62eb083


github-actions bot added feature and removed feature labels Nov 24, 2024

coderabbitai bot reviewed Nov 24, 2024


feat: STT lang detector

e280939


github-actions bot added feature and removed feature labels Nov 24, 2024

JarbasAl added a commit to OpenVoiceOS/ovos-stt-plugin-fasterwhisper that referenced this pull request Nov 24, 2024

feat: STT lang detector

ca3b4c5

companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models

coderabbitai bot reviewed Nov 24, 2024


,

b526f89

coderabbitai bot reviewed Nov 24, 2024


JarbasAl added a commit to OpenVoiceOS/ovos-stt-http-server that referenced this pull request Nov 24, 2024

feat: STT lang detector

7051005

companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models

JarbasAl merged commit 5b84b21 into dev Nov 24, 2024
8 checks passed

JarbasAl deleted the feat/stt_lang_detect branch November 24, 2024 16:30

JarbasAl mentioned this pull request Nov 24, 2024

feat: STT lang detector OpenVoiceOS/ovos-stt-plugin-fasterwhisper#18

Merged

JarbasAl added a commit to OpenVoiceOS/ovos-stt-plugin-fasterwhisper that referenced this pull request Nov 24, 2024

feat: STT lang detector

43a84f9

companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models

JarbasAl added a commit to OpenVoiceOS/ovos-stt-http-server that referenced this pull request Nov 24, 2024

feat: STT lang detector

3d7c695

companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models

JarbasAl added a commit to OpenVoiceOS/ovos-stt-plugin-fasterwhisper that referenced this pull request Nov 24, 2024

feat: STT lang detector

f4f7028

companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models

JarbasAl added a commit to OpenVoiceOS/ovos-stt-http-server that referenced this pull request Nov 24, 2024

feat: STT lang detector

6314ddc

companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models

JarbasAl added a commit to OpenVoiceOS/ovos-stt-plugin-fasterwhisper that referenced this pull request Nov 24, 2024

feat: STT lang detector

abc026d

companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
