persistent sync problem #21
Comments
Hey @NeroQuill, thanks for the detailed report. I'm going to add a validation step to ensure the number of output segments from GPT-4 always matches the number of input segments.
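For readers following along, the validation step described above could look roughly like the sketch below. It is not the project's actual code: `parse_srt` and `check_segment_count` are hypothetical helper names, and the parser assumes well-formed SRT blocks (index line, timestamp line, then text) separated by blank lines.

```python
import re


def parse_srt(text: str) -> list[dict]:
    """Split SRT text into segments (hypothetical helper, not the project's parser)."""
    normalized = text.replace("\r\n", "\n").strip()
    segments = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        if len(lines) < 2:
            continue  # skip malformed blocks that lack a timestamp line
        segments.append(
            {
                "index": lines[0].strip(),
                "timestamp": lines[1].strip(),
                # Keep line breaks inside a cue instead of flattening them.
                "text": "\n".join(lines[2:]),
            }
        )
    return segments


def check_segment_count(source: list[dict], translated: list[dict]) -> None:
    """Raise if the translated output does not have one segment per input segment."""
    if len(source) != len(translated):
        raise ValueError(
            f"expected {len(source)} segments, got {len(translated)}"
        )
```

A count check like this only detects that a batch drifted; it cannot say which line was dropped, so a caller would presumably retry the batch or fall back to the original text.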
Thanks for your response. I tested the translation on 2 different files, and both show the synchronization error at a similar timestamp every time. I wanted to know whether this error only happens to me or whether it also affects other people, in other languages or in Portuguese. Thank you in advance for your work, which is brilliant and very important for the subtitles niche worldwide. If you can make this work completely, it will be a perfect job.
I have the same issue. It always skips some lines, so the order numbers and sentences are not correct. How can we fix it?
I'm working on a fix and will keep this post updated.
Streamed my attempt here: https://www.youtube.com/live/ScnHkYKvtRE Made some progress by converting the response to JSON, but it still occasionally skips/merges some lines! 🫤
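One way the JSON approach mentioned above might be made more robust is to key every line by an explicit id, so a skipped line shows up as a missing id rather than shifting everything after it. The sketch below is only an illustration under that assumption; `build_payload` and `reconcile` are hypothetical names, and the JSON shape (`id`, `text`) is invented here, not taken from the project.

```python
import json


def build_payload(lines: list[str]) -> str:
    """Serialize subtitle lines as JSON objects with explicit 1-based ids."""
    items = [{"id": i, "text": text} for i, text in enumerate(lines, start=1)]
    return json.dumps(items, ensure_ascii=False)


def reconcile(lines: list[str], response_json: str) -> tuple[list[str], list[int]]:
    """Align a model response to the input by id.

    Returns the aligned lines plus the ids the model dropped. Dropped
    lines fall back to the untranslated original so later lines keep
    their timestamps.
    """
    by_id = {item["id"]: item["text"] for item in json.loads(response_json)}
    missing = [i for i in range(1, len(lines) + 1) if i not in by_id]
    aligned = [by_id.get(i, lines[i - 1]) for i in range(1, len(lines) + 1)]
    return aligned, missing
```

With this shape, a dropped line costs one untranslated cue instead of desynchronizing the rest of the file, and the `missing` list could be used to re-request just those ids. It does not detect the case where the model merges two lines into one id and leaves the other id present but wrong.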
Is there any update on this issue?
Having the same issue. It doesn't seem to be fixed, so it's not reliable for day-to-day use atm :(
I'm getting the exact same issue using the latest from the main branch with the gpt-4o model, on the very first SRT file I tried to translate (I haven't tried other models or SRT files yet). Somewhere in the middle it gets out of sync, which makes the whole translated SRT useless unless I manually go through all of it and fix it, and that is not reasonable. It also eats all the line breaks in the original subtitles, which requires manual fixing too. Apart from that, it seems to work pretty well at the actual translation part. Too bad it isn't really usable, since these issues mean hours of cleanup work, which defeats the purpose (also, I really don't want to have to read all the lines of something I haven't watched yet, since the whole point is to watch with my wife in English with Spanish subtitles). This is sort of the thing with LLMs in general though, isn't it... They're incredible when they work, but they only work like 80% of the time on pretty much any task. So close yet so far...
Good news! I think we finally got this fixed with the latest (experimental) Gemini Flash 2.0 model. Merged and deploying now.
I am reporting that even after updating to GPT-4, the synchronization error in the dialogue remains similar to what happened before (at least when I translate into my language, which is Portuguese).
It seems that ChatGPT tends to eat some lines but keep the timestamps, which leaves the rest of the lines in the entire file badly out of sync. I don't know if this is a fixable problem, since ChatGPT is the one making the mistake, not your code.
The original text:
![image](https://private-user-images.githubusercontent.com/131259653/303461801-f530e9bb-8315-4a2f-9026-16926ae2cc57.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk4OTc4MjYsIm5iZiI6MTczOTg5NzUyNiwicGF0aCI6Ii8xMzEyNTk2NTMvMzAzNDYxODAxLWY1MzBlOWJiLTgzMTUtNGEyZi05MDI2LTE2OTI2YWUyY2M1Ny5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE4JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxOFQxNjUyMDZaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1iY2JjMTAwNzRjNjkzMTZjM2ExZjUxNjdkMDAyNzZjY2E1NjgxYTFiOGNkMDhkZjIxZjJhYmRiNjJjMDhhMGU3JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.-PH8mUcRclVfyj25lVNfHxyAHTnHe_B7XVZARLwXhMw)
The translated text:
![image](https://private-user-images.githubusercontent.com/131259653/303462153-3b49661d-c3d1-494a-ba48-fce3a6fedb7f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk4OTc4MjYsIm5iZiI6MTczOTg5NzUyNiwicGF0aCI6Ii8xMzEyNTk2NTMvMzAzNDYyMTUzLTNiNDk2NjFkLWMzZDEtNDk0YS1iYTQ4LWZjZTNhNmZlZGI3Zi5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE4JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxOFQxNjUyMDZaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1mYWY0NzFkYzQ5NjlkNmUxZWFmYmI5OTUwODMyZmJhNGViZDA0NmUwMzQ5ZDJmYmJlYjE1YWM1M2Q3OGUyOGE0JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.l-PHaemmPNriUex8jp00pgIdmMOEOtdEqQRnwFIdOR4)