Issue with --batched: Sentences Are Displayed All at Once Instead of One by One #1235

izyspania · 2025-01-30T11:06:18Z

I’m encountering an issue when using the --batched option in faster-whisper. Without --batched, the transcription processes one sentence at a time, displaying them sequentially, which works well for my use case. However, when I enable --batched, many sentences are displayed in a single segment, making the output harder to follow and less readable.

Additionally, when I don’t use --batched, my RAM usage spikes to full capacity during the file loading phase (I have 32 GB of RAM), making my computer unresponsive for a few minutes. However, when I enable --batched, it works smoothly, though I still experience the issue with the output being harder to read. This issue doesn’t happen when using normal Whisper.

Thanks

MahmoudAshraf97 · 2025-02-03T10:02:56Z

@Purfview

Purfview · 2025-02-03T10:27:14Z

This is the right place for your issues: https://github.com/Purfview/whisper-standalone-win

I’m encountering an issue when using the --batched option in faster-whisper. Without --batched, the transcription processes one sentence at a time, displaying them sequentially, which works well for my use case. However, when I enable --batched, many sentences are displayed in a single segment, making the output harder to follow and less readable.

I noticed the same behaviour, but what's displayed doesn't matter. Look at the subtitles. Use --sentence if you want output to be split to sentences, you can try --standard too.

Additionally, when I don’t use --batched, my RAM usage spikes to full capacity during the file loading phase (I have 32 GB of RAM), making my computer unresponsive for a few minutes. However, when I enable --batched, it works smoothly, though I still experience the issue with the output being harder to read. This issue doesn’t happen when using normal Whisper.

What's the version, the command, and the file?
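For readers wondering what splitting output into sentences involves, here is a minimal sketch of the general idea: group word-level timestamps into one cue per sentence, closing a cue at sentence-final punctuation. This is an illustration only, not the implementation behind --sentence. The `Word` and `Cue` types and the `split_into_sentences` helper are hypothetical names, and the approach assumes word-level timestamps are available.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Word:
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class Cue:
    start: float
    end: float
    text: str


# Characters treated as sentence-final (an assumption; extend as needed).
SENTENCE_END = (".", "!", "?", "…")


def split_into_sentences(words: Iterable[Word]) -> List[Cue]:
    """Group timestamped words into one cue per sentence."""
    cues: List[Cue] = []
    current: List[Word] = []
    for word in words:
        current.append(word)
        # Close the cue when a word ends with sentence-final punctuation.
        if word.text.rstrip().endswith(SENTENCE_END):
            cues.append(_make_cue(current))
            current = []
    if current:  # trailing words without final punctuation
        cues.append(_make_cue(current))
    return cues


def _make_cue(words: List[Word]) -> Cue:
    # A cue spans from the first word's start to the last word's end.
    text = " ".join(w.text.strip() for w in words)
    return Cue(start=words[0].start, end=words[-1].end, text=text)
```

Feeding it the words "Pista", "2." and "28." with their timestamps would yield two cues, one per sentence. A rule this simple also splits after abbreviations, so a real tool needs more careful handling.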

izyspania · 2025-02-03T11:00:19Z

Hi,

faster-whisper file.mp3 --language Romanian --device cuda --model large-v3-turbo --output_format srt --print_progress

This is what I'm using now, but I tried a few options combined with --batched, including --sentence and different models, and the results are the same. (I have not tried --standard yet.)

Example.
EDIT: the audio (a cut): https://jumpshare.com/s/iIVzYbygkLa2PsueAKEH
The full file is 570 MB. Without --batched it fills all the RAM and starts writing to disk for about 20 seconds, but once processing starts the RAM goes back to normal. This doesn't seem to happen with the original Whisper.

whisper (original) output:
1
00:00:00,000 --> 00:00:02,880
William Deal, 27

2
00:00:02,880 --> 00:00:04,840
Pista 2

3
00:00:04,840 --> 00:00:07,040
28

4
00:00:07,040 --> 00:00:15,700
Jenny părăsi hotelul înainte de ora 8 dimineața și, în drum spre destinație, schimbă trei taxiuri diferite.

5
00:00:16,440 --> 00:00:19,140
Era o șmecherie pe care o învățase de la Avrum.

6
00:00:20,120 --> 00:00:28,780
Plătea din timp taxiul, apoi sărea pe neașteptate din mașină, se srecura printre clădiri, lua un alt taxi, apoi repeta figura încă o dată.

faster-whisper (without --batched): most of the time it behaves like the original Whisper, but sometimes it acts exactly as it does with --batched (in this example it acted as it does with --batched).

faster-whisper (with --batched):
00:00:00,000 --> 00:00:28,780
William Deal, 27. Pista 2. 28. Jenny părăsi hotelul înainte de ora 8 dimineața și, în drum spre destinație, schimbă trei taxiuri diferite. Era o șmecherie pe care o învățase de la Avrum. Plătea din timp taxiul, apoi sărea pe neașteptate din mașină, se srecura printre clădiri, lua un alt taxi, apoi repeta figura încă o dată.

Purfview · 2025-02-03T12:06:41Z

tried a few options combined with --batched including --sentence and different models and the results are the same.

It's not possible that "the results are the same".

Full file has 570 MB

How long is it in duration?

most of the time like original Whisper, but sometimes it acts exactly like when I'm using --batched

That's normal, expected behaviour. Vanilla Whisper can produce the same output too.

izyspania · 2025-02-03T12:34:23Z

That's normal, expected behaviour. Vanilla Whisper can produce the same output too.

True, but 90% of the time vanilla Whisper produces the output I showed you.

It's about 10 hours long.

Is there anything I can do to make the --batched output look like vanilla Whisper's? --sentence doesn't seem to do anything as far as I have tested; you can test the file I linked.

I guess I need more RAM to process files this long? With --batched it works great and is 3x faster, but I don't like the output, as I showed you.

Purfview · 2025-02-03T12:42:09Z

It's about 10 hours long.

Then it's this issue -> #1234

Is there anything I can do to make the --batched output look like vanilla Whisper's? --sentence doesn't seem to do anything as far as I have tested; you can test the file I linked.

I already told you what to do.
Share the srt you got with --batched --sentence.

izyspania · 2025-02-03T12:59:59Z

I just shared the output above, from test.mp3.

--batched --sentence (for this file the output is the same with --standard and without --batched):

[00:00.000 --> 00:28.780] William Deal, 27. Pista 2. 28. Jenny părăsi hotelul înainte de ora 8 dimineața și, în drum spre destinație, schimbă trei taxiuri diferite. Era o șmecherie pe care o învățase de la Avrum. Plătea din timp taxiul, apoi sărea pe neașteptate din mașină, se srecura printre clădiri, lua un alt taxi, apoi repeta figura încă o dată.

Transcription speed: 21.36 audio seconds/s

This is from vanilla:

[00:00.000 --> 00:02.880] William Deal, 27
[00:02.880 --> 00:04.840] Pista 2
[00:04.840 --> 00:07.040] 28
[00:07.040 --> 00:15.700] Jenny părăsi hotelul înainte de ora 8 dimineața și, în drum spre destinație, schimbă trei taxiuri diferite.
[00:16.440 --> 00:19.120] Era o șmecherie pe care o învățase de la Avrum.
[00:20.120 --> 00:28.780] Plătea din timp taxiul, apoi sărea pe neașteptate din mașină, se strecura printre clădiri, lua un alt taxi, apoi repeta figura încă o dată.

Edit: more from the original file

1
00:00:00,000 --> 00:00:28,780
William Deal, 27. Pista 2. 28. Jenny părăsi hotelul înainte de ora 8 dimineața și, în drum spre destinație, schimbă trei taxiuri diferite. Era o șmecherie pe care o învățase de la Avrum. Plătea din timp taxiul, apoi sărea pe neașteptate din mașină, se srecura printre clădiri, lua un alt taxi, apoi repeta figura încă o dată.

2
00:00:28,780 --> 00:00:40,540
Așa cum procedase și când împărțea broșuri și ziarul conștiința Berlinului, ea nu controla să vadă dacă o urmărea cineva, ci prespunea că cineva chiar făcea acest lucru.

3
00:00:41,160 --> 00:00:51,720
În cele din urmă o apucă pe bulevardul Nei, care o colește pe la sud perimetrul orașului, o luă spre Arcul de Triunf, apoi se îndreptă spre Montparnasse.

4
00:00:52,480 --> 00:00:56,560
Mai merse puțin până ajunse la o cafenea de pe strada Long Camps.

5
00:00:56,560 --> 00:01:04,600
Cumpără ziarul de dimineață și se așeză la o masă în fundul sării, de unde putea să observe ușa și ceru o cafea.

And from vanilla:

1
00:00:00,000 --> 00:00:02,880
William Deal, 27

2
00:00:02,880 --> 00:00:04,840
Pista 2

3
00:00:04,840 --> 00:00:07,040
28

4
00:00:07,040 --> 00:00:15,700
Jenny părăsi hotelul înainte de ora 8 dimineața și, în drum spre destinație, schimbă trei taxiuri diferite.

5
00:00:16,440 --> 00:00:19,140
Era o șmecherie pe care o învățase de la Avrum.

6
00:00:20,120 --> 00:00:28,780
Plătea din timp taxiul, apoi sărea pe neașteptate din mașină, se srecura printre clădiri, lua un alt taxi, apoi repeta figura încă o dată.

7
00:00:28,780 --> 00:00:40,560
Așa cum procedase și când împărțea broșuri și ziarul conștiința Berlinului, ea nu controla să vadă dacă o urmărea cineva, ci prespunea că cineva chiar făcea acest lucru.

8
00:00:40,920 --> 00:00:51,740
În cele din urmă o apucă pe bulevardul Nei, care o colește pe la sud perimetrul orașului, o luă spre Arcul de Triunf, apoi se îndreptă spre Montparnasse.

Some more differences:

Vanilla:
28
00:03:19,140 --> 00:03:26,000
Aceste trupe de asalt au fost huliganii care au distrus magazine, au bătut, au asasinat oameni nevinovați

29
00:03:26,000 --> 00:03:35,100
și au fost promotorii antisemitismului lui Hitler, unul dintre principiile partidului nazis și ale celui de-al treilea rai.

Faster:
30
00:03:19,160 --> 00:03:35,140
Aceste trupe de asalt au fost huliganii care au distrus magazine, au bătut, au asasinat oameni nevinovați și au fost promotorii antisemitismului lui Hitler, unul dintre principiile partidului nazis și ale celui de-al treilea rai.
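Since the comparison that matters here is between the written subtitle files rather than the console output, a small script can flag unusually long cues in an .srt file. The following is a minimal sketch using only the standard library. The `parse_srt` and `long_cues` helpers and the 15-second threshold are assumptions made for illustration, not part of any tool discussed in this thread.

```python
import re
from typing import List, Tuple

# Matches an SRT timing line such as "00:00:00,000 --> 00:00:28,780".
TIME_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text: str) -> List[Tuple[float, float, str]]:
    """Return (start, end, text) for every cue in an SRT document."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIME_RE.search(line)
            if match:
                g = match.groups()
                start, end = _seconds(*g[:4]), _seconds(*g[4:])
                # Everything after the timing line is the cue text.
                text = " ".join(l.strip() for l in lines[i + 1:])
                cues.append((start, end, text))
                break
    return cues


def long_cues(srt_text: str, max_seconds: float = 15.0):
    """Cues whose duration exceeds max_seconds (the threshold is arbitrary)."""
    return [c for c in parse_srt(srt_text) if c[1] - c[0] > max_seconds]
```

Reading a file with `open(path, encoding="utf-8").read()` and passing the text to `long_cues` lists the cues worth inspecting, such as the 28.78-second first cue shown above.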

Purfview · 2025-02-03T13:08:45Z

Share the srt file, not the copy/pastes.

This is from vanilla:

I don't need it.

izyspania · 2025-02-03T13:09:38Z

The one from the 10-hour file or from the cut (test.mp3)?

Purfview · 2025-02-03T13:10:46Z

From any file, where you used --batched --sentence.

izyspania · 2025-02-03T13:21:59Z

test2.mp3 --language Romanian --device cuda --model large-v3-turbo --output_format srt --batched --sentence

Hmm, I made a small cut and tried again, and the output looks OK now. I'll run it again on the full 10-hour file and tell you the result. Maybe I only looked at the on-screen output when using --sentence instead of checking the file (it will take about 20 minutes with --batched, I think).
Edit: The things I pasted were from files, not from the output screen.

I had to rename the file to .txt

test2.txt

Purfview · 2025-02-03T13:25:12Z

Hmm, I made a small cut and tried again, and the output looks OK now. I'll run it again on the full 10-hour file...

No need, I told you that it's impossible.
You probably just mixed up some files.

izyspania · 2025-02-03T13:33:54Z

I'm still going to try it now to be sure. If it works with --batched --sentence, it fixes both of my problems. Thank you, and sorry for the post; I was sure it was a bug.

Edit: It seems to work with --batched --sentence: the file looks like vanilla (the output screen does not). You saved me a lot of time, thanks again.

One more quick question, if you don't mind: is large-v3-turbo better than medium? It seems faster, but I'm not sure if it's more accurate.

Purfview · 2025-02-03T13:41:01Z

One more quick question, if you don't mind: is large-v3-turbo better than medium?

For transcription, turbo should be better.
