Add language selection through config with whisper + Improve tests #48

AudranBert · 2024-11-22T15:34:43Z

Adds language selection for streaming with Whisper. By default, the language found in the env settings is used, but a language can also be passed in the config when starting streaming.

It also makes it possible to pass a language in the config for offline decoding, as requested in #53. This lets the same model instance serve multiple languages instead of launching another Docker container.

The PR also improves the tests by adding language tests and removing some useless ones to reduce testing duration.
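
As a rough illustration of the precedence described above (a value in the config wins over the env setting), here is a minimal sketch. The function name `resolve_language`, the config key `"language"`, and the env variable name `LANGUAGE` are assumptions made for this example, not names confirmed by the PR.

```python
import os


def resolve_language(config=None, env=None, default=None):
    """Pick the language for one request.

    Hypothetical helper: a language given in the request config takes
    precedence over the service-wide environment setting.
    """
    env = os.environ if env is None else env
    config = config or {}
    # Per-request value first, then the env setting, then the fallback.
    return config.get("language") or env.get("LANGUAGE") or default


# Usage: one running instance can serve requests in different languages.
print(resolve_language({"language": "fr"}, env={"LANGUAGE": "en"}))
print(resolve_language({}, env={"LANGUAGE": "en"}))
```

With this shape, a request that omits the language falls back to the env setting, which matches the default behavior described in the PR text.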

damienlaine · 2024-11-28T17:52:13Z

Could you clarify the list of supported languages? For example, does it include "en," "fr," etc.? On the LinTO side, we consistently use BCP-47 codes for language representation.
Parsers (env, API directives...) shall at least support BCP-47 codes as inputs.

Jeronymous · 2024-11-29T06:05:36Z

> Could you clarify the list of supported languages? For example, does it include "en," "fr," etc.? On the LinTO side, we consistently use BCP-47 codes for language representation. Parsers (env, API directives...) shall at least support BCP-47 codes as inputs.

That did not change in this PR.
Several formats are supported: "fr" and "fr-FR". This holds for the whole LinTO speech toolkit.

Supported languages are listed here: https://github.com/linto-ai/linto-stt/blob/master/whisper/README.md#language

Also, if the user gives a wrong one, an explicit message is returned with the list of possible ones (in the format "fr").

Why this question? Do you think something is missing in the code or the documentation?
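
The error behavior mentioned in this comment (an explicit message listing the possible codes) could look roughly like the sketch below. The name `check_language` and the three-entry `SUPPORTED_LANGUAGES` tuple are placeholders for illustration only; the real list is the one in the Whisper README linked above.

```python
# Placeholder subset, assumed for this example only.
SUPPORTED_LANGUAGES = ("de", "en", "fr")


def check_language(language, supported=SUPPORTED_LANGUAGES):
    """Return the language if supported, else raise with the valid choices."""
    if language not in supported:
        choices = ", ".join(sorted(supported))
        raise ValueError(
            f"Unsupported language {language!r}. Possible values: {choices}"
        )
    return language


try:
    check_language("xx")
except ValueError as err:
    print(err)
```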

damienlaine · 2024-11-29T11:09:31Z

I haven’t reviewed the code and relied on the doc:

The docs mention "two or three-letter codes" for languages but not BCP-47 tags—should this be clarified?
The PR focuses on streaming (?), but what about Celery (task) and HTTP service modes? Are specification updates planned for these?
For Celery, should we open an issue in https://github.com/linto-ai/linto-transcription to handle the target language correctly?

AudranBert · 2024-11-29T12:41:14Z

> The PR focuses on streaming (?), but what about Celery (task) and HTTP service modes? Are specification updates planned for these?

The PR was created to fix language selection in streaming, but I added the possibility to send the language through the config for both streaming and offline (HTTP and task) modes. That's why I linked this PR to issue #53.

AudranBert · 2024-11-29T14:22:29Z

> The docs mention "two or three-letter codes" for languages but not BCP-47 tags—should this be clarified?

It should work with tags like "fr-FR": the tag is split on "-" and only the first part (here "fr") is used as the language.
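
The split described here can be sketched in a couple of lines. The name `primary_subtag` is hypothetical, and the whitespace stripping and lowercasing are extra assumptions added for robustness; the comment only states that the tag is split on "-" and the first part is kept.

```python
def primary_subtag(tag: str) -> str:
    """Keep only the part before the first "-" of a language tag."""
    # Stripping and lowercasing are assumptions, not stated in the thread.
    return tag.strip().split("-")[0].lower()


for tag in ("fr", "fr-FR", "en-US"):
    print(tag, "->", primary_subtag(tag))
```

A two-letter code passes through unchanged, so both "fr" and "fr-FR" end up selecting the same language.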

Jeronymous · 2024-11-29T14:45:11Z

> The docs mention "two or three-letter codes" for languages but not BCP-47 tags—should this be clarified?

Yes, we should mention that they are supported, but that the second part ("FR" in "fr-FR") is ignored (the model's results are invariant to it).

> The PR focuses on streaming (?), but what about Celery (task) and HTTP service modes? Are specification updates planned for these?

Yes. The PR is not finished yet ("WIP" in the title).

> For Celery, should we open an issue in https://github.com/linto-ai/linto-transcription to handle the target language correctly?

Yes. There will be an issue for that feature request.
Worst case, I will create it when I commit related changes (mentioning the issue in the commit message: we agreed to do this as much as possible).
(Our plan is to split the work: Audran here on core STT, me on the transcription service API evolution.)

AudranBert · 2024-12-02T14:34:49Z

Tests are running; I don't know how long they will take to finish.

Jeronymous and others added 5 commits April 22, 2024 08:57

Merge remote-tracking branch 'origin/next'

e0b6fea

Merge pull request #45 from linto-ai/next

bb05ae6

Merge next -> master

add: language option for streaming (can be in env or in config)

bbe04b2

Signed-off-by: AudranBert <[email protected]>

add language as option in test_streaming

f3b4680

Signed-off-by: AudranBert <[email protected]>

reduce amount of tests

51c49ee

Signed-off-by: AudranBert <[email protected]>

Jeronymous reviewed Nov 27, 2024

whisper/stt/processing/streaming.py (outdated, resolved)

Jeronymous and others added 2 commits November 27, 2024 18:23

Generalize the function, to format languages in general

c56ad3a

add language option in transcription config offline

eafa601

Signed-off-by: AudranBert <[email protected]>

AudranBert linked an issue Nov 28, 2024 that may be closed by this pull request

Add language selection for offline transcription with whisper models #53

Open

AudranBert added 4 commits November 28, 2024 11:50

update doc

ab501f8

Signed-off-by: AudranBert <[email protected]>

add language through config for celery

3f80d46

Signed-off-by: AudranBert <[email protected]>

fix doc whisper

d2a19cd

Signed-off-by: AudranBert <[email protected]>

refactor tests + add tests for languages

fb68f11

Signed-off-by: AudranBert <[email protected]>

AudranBert changed the title ~~[WIP] Add language selection for streaming with whisper + Improve tests~~ [WIP] Add language selection with whisper + Improve tests Nov 29, 2024

AudranBert changed the title ~~[WIP] Add language selection with whisper + Improve tests~~ [WIP] Add language selection through config with whisper + Improve tests Nov 29, 2024

add auditok version constrait

9cebdf7

Signed-off-by: AudranBert <[email protected]>

AudranBert added 2 commits November 29, 2024 16:06

improve error wrong language + better support for language codes

6478e52

Signed-off-by: AudranBert <[email protected]>

improve whisper readme part about languages

ed63330

Signed-off-by: AudranBert <[email protected]>

Jeronymous reviewed Nov 29, 2024

whisper/stt/processing/utils.py (outdated, resolved)

Jeronymous reviewed Nov 29, 2024

whisper/stt/processing/utils.py (outdated, resolved)

Jeronymous reviewed Nov 29, 2024

whisper/stt/processing/utils.py (resolved)

Merge branch 'next' into whisper_language

e898d46

Jeronymous mentioned this pull request Nov 30, 2024

Add language input option / detected language in output linto-ai/linto-transcription#22

Open

upd doc for tests

4293010

Signed-off-by: AudranBert <[email protected]>

AudranBert added 2 commits December 2, 2024 10:33

upd whisper test + add more languages tests

c00bfa2

Signed-off-by: AudranBert <[email protected]>

rm useless lines

59a4d83

Signed-off-by: AudranBert <[email protected]>

AudranBert changed the title ~~[WIP] Add language selection through config with whisper + Improve tests~~ Add language selection through config with whisper + Improve tests Dec 2, 2024

Jeronymous changed the base branch from master to next December 2, 2024 14:31

Update release notes

98dfcb9

Jeronymous force-pushed the whisper_language branch from d20dc2f to 647d119 Compare December 2, 2024 14:58

Jeronymous approved these changes Dec 2, 2024

Improve doc about languages

b43430f

Jeronymous force-pushed the whisper_language branch from 647d119 to b43430f Compare December 2, 2024 15:07

remove 4 tests

edefded

Signed-off-by: AudranBert <[email protected]>

Jeronymous merged commit ca1a839 into next Dec 2, 2024
1 check passed