Issue with --batched: Sentences Are Displayed All at Once Instead of One by One #1235
The right place for your issues is here: https://github.com/Purfview/whisper-standalone-win
I noticed the same behaviour, but what's displayed doesn't matter; look at the subtitles. Use `--sentence` if you want the output split into sentences; you can try `--standard` too.
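As an aside, the general idea behind a sentence-splitting post-process like `--sentence` can be sketched in a few lines of Python. This is only an illustration, not Purfview's actual implementation: `split_segment` is a hypothetical helper that breaks one long segment into sentences and spreads the time span proportionally to sentence length (real subtitle splitters would use word-level timestamps instead of character counts).

```python
import re

def split_segment(text, start, end):
    """Split one long transcription segment into sentences and
    allocate the time span proportionally to sentence length.
    Rough illustration only, not faster-whisper's real logic."""
    # Naive boundary: split after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    total = sum(len(s) for s in sentences)
    result, cursor = [], start
    for s in sentences:
        span = (end - start) * len(s) / total
        result.append((cursor, cursor + span, s))
        cursor += span
    return result

# Example: a 6-second segment containing three sentences.
for seg in split_segment("First sentence. Second one! Third?", 0.0, 6.0):
    print(seg)
```

Each returned tuple is `(start_seconds, end_seconds, sentence)`, so a single merged block becomes several shorter subtitle entries.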
What's the version, the command, and the file?
Hi, this is what I'm using now: `faster-whisper file.mp3 --language Romanian --device cuda --model large-v3-turbo --output_format srt --print_progress`. I tried a few options combined with `--batched`, including `--sentence` and different models, and the results are the same (I have not tried `--standard` yet). Example. whisper (original) output: [screenshot of numbered subtitle lines omitted]. faster-whisper (without `--batched`): most of the time it behaves like original whisper, but sometimes it acts exactly like with `--batched` (in this example it did). faster-whisper (with `--batched`): [screenshot omitted].
It's not possible that "the results are the same".
How long is it?
That's normal, expected behaviour; vanilla Whisper can output the same too.
True, but 90% of the time vanilla Whisper outputs as I showed you. It's about 10 hours long. Is there anything I can do to make the `--batched` output look like vanilla Whisper's? `--sentence` doesn't seem to do anything as far as I tested; you can test the file I linked. I guess I need more RAM to process files this long? `--batched` works great and is 3x faster, but I don't like the output, as I showed you.
Then it's this issue -> #1234
I already wrote to you what to do.
I just shared the output there from `test.mp3 --batched --sentence` (with `--standard`, and without `--batched`, it's the same for this file):

[00:00.000 --> 00:28.780] William Deal, 27. Pista 2. 28. Jenny părăsi hotelul înainte de ora 8 dimineața și, în drum spre destinație, schimbă trei taxiuri diferite. Era o șmecherie pe care o învățase de la Avrum. Plătea din timp taxiul, apoi sărea pe neașteptate din mașină, se srecura printre clădiri, lua un alt taxi, apoi repeta figura încă o dată.

Transcription speed: 21.36 audio seconds/s

This is from vanilla:

[00:00.000 --> 00:02.880] William Deal, 27

Edit: more from the original file: [screenshots omitted]. And from vanilla: [screenshots omitted]. Some more diffs. Vanilla: [screenshot omitted]. Faster: [screenshot omitted].
Share the srt file, not the copy/pastes.
I don't need it.
The one from the 10-hour file, or from the cut (test.mp3)?
From any file where you used `--batched --sentence`.
`test2.mp3 --language Romanian --device cuda --model large-v3-turbo --output_format srt --batched --sentence` Hmm, I did a small cut and tried again, and the output looks OK now. I'll do it again for the full 10-hour file and tell you the result; maybe I only looked at the console output when using `--sentence` instead of checking the file (it will take about 20 minutes with `--batched`, I think). I had to rename the file to .txt.
No need, I told you that it's impossible. |
I'm still going to try it now to be sure; if it works with `--batched --sentence`, it fixes both of my problems. Thank you, and sorry for the post, I thought it was a bug for sure. Edit: It seems that it works with `--batched --sentence`; the file shows the same output as vanilla (the console output doesn't). You saved me a lot of time, thanks again. If you don't mind, one more quick question: is large-v3-turbo better than medium? It seems faster; I'm not sure if it's more accurate.
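As a side note, checking the .srt file itself (rather than the console output) can be automated. Below is a minimal sketch under the assumption of standard, well-formed SRT blocks; `srt_segments` is a hypothetical helper, not part of faster-whisper, that parses the file and flags suspiciously long segments (merged "batched-looking" blocks tend to span tens of seconds):

```python
import re

def srt_segments(srt_text):
    """Parse SRT text into (start_seconds, end_seconds, text) tuples.
    Minimal parser: assumes well-formed blocks separated by blank lines."""
    def to_seconds(ts):
        # "HH:MM:SS,mmm" -> seconds as a float
        h, m, rest = ts.split(":")
        s, ms = rest.split(",")
        return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000

    segments = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip malformed blocks
        start, end = (to_seconds(t.strip()) for t in lines[1].split("-->"))
        segments.append((start, end, " ".join(lines[2:])))
    return segments

sample = """1
00:00:00,000 --> 00:00:02,880
William Deal, 27

2
00:00:02,880 --> 00:00:28,780
...one long merged block of many sentences..."""

# Flag suspiciously long segments (the 10 s threshold is arbitrary).
for start, end, text in srt_segments(sample):
    if end - start > 10:
        print(f"long segment ({end - start:.1f}s): {text[:40]}")
```

Running this over the real .srt output would have shown immediately that the file was fine even when the console display looked merged.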
For transcription, turbo should be better.
I’m encountering an issue when using the --batched option in faster-whisper. Without --batched, the transcription processes one sentence at a time, displaying them sequentially, which works well for my use case. However, when I enable --batched, many sentences are displayed in a single segment, making the output harder to follow and less readable.
Additionally, when I don't use `--batched`, my RAM usage spikes to full capacity during the file-loading phase (I have 32 GB of RAM), making my computer unresponsive for a few minutes. However, when I enable `--batched`, it runs smoothly, though I still have the issue of the output being harder to read. Neither issue happens with normal Whisper.
Thanks