Add headers for multiple language identification #99
Merged
+191
−25
14 commits
ce50006 add headers for multiple language identification (mbatchkarov)
63ffee5 add headers for language identification (mbatchkarov)
7736dc4 Fix typo in header name (mbatchkarov)
107357e address CR comments and add language_code in response (mbatchkarov)
ee532c6 add support for CV when using Multi LID (mbatchkarov)
cde728b remove python 3.7 from test matrix (mbatchkarov)
6d41e7e remove all uses of py 3.7 and replace with 3.11 (mbatchkarov)
43c434a update codecov action (mbatchkarov)
890b91d fix black warning (mbatchkarov)
f6220b6 update deprecated ubuntu used in linter (mbatchkarov)
2506bd4 exclude tests from coverage (mbatchkarov)
7d5ab0b fix flake8 warning (mbatchkarov)
b40d3b8 add support for vocabulary filters (mbatchkarov)
f4fd86d fix bad merge :( (mbatchkarov)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,6 @@ | ||
coverage: | ||
ignore: | ||
- "tests/*" | ||
status: | ||
patch: | ||
default: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,13 +8,13 @@ on: | |
|
||
jobs: | ||
lint: | ||
- runs-on: ubuntu-20.04 | ||
+ runs-on: ubuntu-24.04 | ||
|
||
steps: | ||
- uses: actions/checkout@v2 | ||
- - name: Set up Python 3.9 | ||
+ - name: Set up Python 3.11 | ||
uses: actions/setup-python@v2 | ||
with: | ||
- python-version: 3.9 | ||
+ python-version: 3.11 | ||
- name: Run pre-commit | ||
uses: pre-commit/[email protected] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -158,20 +158,23 @@ def __init__( | |
is_partial: Optional[bool] = None, | ||
alternatives: Optional[List[Alternative]] = None, | ||
channel_id: Optional[str] = None, | ||
language_code: Optional[str] = None, | ||
): | ||
self.result_id = result_id | ||
self.start_time = start_time | ||
self.end_time = end_time | ||
self.is_partial = is_partial | ||
self.alternatives = alternatives | ||
self.channel_id = channel_id | ||
self.language_code = language_code | ||
|
||
|
||
class StartStreamTranscriptionRequest: | ||
"""Transcription Request | ||
|
||
:param language_code: | ||
Indicates the source language used in the input audio stream. | ||
Indicates the source language used in the input audio stream. Set to | ||
None if identify_multiple_languages is set to True | ||
|
||
:param media_sample_rate_hz: | ||
The sample rate, in Hertz, of the input audio. We suggest that you | ||
|
@@ -226,6 +229,15 @@ class StartStreamTranscriptionRequest: | |
overall transcription accuracy. | ||
:param language_model_name: | ||
The name of the language model you want to use. | ||
:param identify_multiple_languages: | ||
If true, all languages spoken in the stream are identified. A multilingual | ||
transcript is created using each identified language. | ||
You must also provide at least two language_options and set | ||
language_code to None | ||
:param language_options: | ||
A list of possible languages to use when identify_multiple_languages is | ||
Review comment: Should mention identify_language as well |
||
set to True. Note that not all languages supported by Transcribe are | ||
supported for multiple language identification | ||
""" | ||
|
||
def __init__( | ||
|
@@ -234,24 +246,32 @@ def __init__( | |
media_sample_rate_hz=None, | ||
media_encoding=None, | ||
vocabulary_name=None, | ||
vocabulary_names=None, | ||
session_id=None, | ||
vocab_filter_method=None, | ||
vocab_filter_name=None, | ||
vocab_filter_names=None, | ||
show_speaker_label=None, | ||
enable_channel_identification=None, | ||
number_of_channels=None, | ||
enable_partial_results_stabilization=None, | ||
partial_results_stability=None, | ||
language_model_name=None, | ||
identify_language=None, | ||
preferred_language=None, | ||
identify_multiple_languages=False, | ||
language_options=None, | ||
): | ||
|
||
self.language_code: Optional[str] = language_code | ||
self.media_sample_rate_hz: Optional[int] = media_sample_rate_hz | ||
self.media_encoding: Optional[str] = media_encoding | ||
self.vocabulary_name: Optional[str] = vocabulary_name | ||
self.vocabulary_names: Optional[List[str]] = vocabulary_names | ||
self.session_id: Optional[str] = session_id | ||
self.vocab_filter_method: Optional[str] = vocab_filter_method | ||
self.vocab_filter_name: Optional[str] = vocab_filter_name | ||
self.vocab_filter_names: Optional[List[str]] = vocab_filter_names | ||
self.show_speaker_label: Optional[bool] = show_speaker_label | ||
self.enable_channel_identification: Optional[ | ||
bool | ||
|
@@ -262,6 +282,10 @@ def __init__( | |
] = enable_partial_results_stabilization | ||
self.partial_results_stability: Optional[str] = partial_results_stability | ||
self.language_model_name: Optional[str] = language_model_name | ||
self.identify_language: Optional[bool] = identify_language | ||
self.preferred_language: Optional[str] = preferred_language | ||
self.identify_multiple_languages: Optional[bool] = identify_multiple_languages | ||
self.language_options: Optional[List[str]] = language_options or [] | ||
|
||
|
||
class StartStreamTranscriptionResponse: | ||
|
@@ -324,6 +348,7 @@ def __init__( | |
media_sample_rate_hz=None, | ||
media_encoding=None, | ||
vocabulary_name=None, | ||
vocabulary_names=None, | ||
session_id=None, | ||
vocab_filter_name=None, | ||
vocab_filter_method=None, | ||
|
@@ -339,6 +364,7 @@ def __init__( | |
self.media_sample_rate_hz: Optional[int] = media_sample_rate_hz | ||
self.media_encoding: Optional[str] = media_encoding | ||
self.vocabulary_name: Optional[str] = vocabulary_name | ||
self.vocabulary_names: Optional[List[str]] = vocabulary_names | ||
self.session_id: Optional[str] = session_id | ||
self.transcript_result_stream: TranscriptResultStream = transcript_result_stream | ||
self.vocab_filter_name: Optional[str] = vocab_filter_name | ||
|
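The docstring in this diff states two constraints for multiple language identification: `language_code` must be None and at least two `language_options` must be given. The diff shown here does not include any enforcement of those rules, so the following is only a minimal sketch of how a caller might check them before building a request. The function name `check_multi_language_settings` is hypothetical and is not part of this PR.

```python
from typing import List, Optional


def check_multi_language_settings(
    language_code: Optional[str] = None,
    identify_multiple_languages: Optional[bool] = False,
    language_options: Optional[List[str]] = None,
) -> None:
    """Raise ValueError if the arguments contradict the docstring rules.

    Hypothetical helper: it only encodes the two constraints written in the
    StartStreamTranscriptionRequest docstring, nothing the service itself
    may additionally require.
    """
    options = language_options or []
    if not identify_multiple_languages:
        # The docstring states no extra constraints for this case.
        return
    if language_code is not None:
        raise ValueError(
            "language_code must be None when identify_multiple_languages is True"
        )
    if len(options) < 2:
        raise ValueError(
            "at least two language_options are required when "
            "identify_multiple_languages is True"
        )
```

A call such as `check_multi_language_settings(identify_multiple_languages=True, language_options=["en-US", "es-US"])` passes silently, while supplying a `language_code` alongside it raises `ValueError`.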
Review comment: Can you add VocabularyFilterNames
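For context on this request, here is a rough sketch of how list-valued and boolean request attributes could be turned into request headers. The serializer changed by this PR is not visible in the diff above, so everything here is an assumption: the function names are hypothetical, the `x-amzn-transcribe-` header names are written from memory of the Transcribe streaming API and should be checked against its reference, and joining lists with commas is a guess at the wire format.

```python
from typing import Dict, List, Optional, Union

# Assumed header prefix; verify against the Transcribe streaming API reference.
HEADER_PREFIX = "x-amzn-transcribe-"

HeaderInput = Union[None, bool, str, List[str]]


def to_header_value(value: HeaderInput) -> Optional[str]:
    """Convert a request attribute to a header string, or None to skip it."""
    if value is None:
        return None
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (list, tuple)):
        # Assumption: multiple names are sent as one comma-separated value.
        return ",".join(value) if value else None
    return str(value)


def build_language_headers(
    identify_multiple_languages: Optional[bool] = False,
    language_options: Optional[List[str]] = None,
    vocabulary_names: Optional[List[str]] = None,
    vocab_filter_names: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Return only the headers whose attributes were actually set."""
    candidates = {
        # A False flag is treated as "not set" and produces no header.
        "identify-multiple-languages": identify_multiple_languages or None,
        "language-options": language_options,
        "vocabulary-names": vocabulary_names,
        "vocabulary-filter-names": vocab_filter_names,
    }
    headers: Dict[str, str] = {}
    for suffix, raw in candidates.items():
        value = to_header_value(raw)
        if value is not None:
            headers[HEADER_PREFIX + suffix] = value
    return headers
```

With this shape, supporting `vocab_filter_names` is one extra entry in `candidates`, which is roughly the kind of change the comment asks for.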