Translating SRT subtitle files #23

dgoryeo · 2024-08-28T11:48:59Z

Hi, I just came across Kudasai. It looks promising.
Is there a way to use Kudasai to translate SRT subtitle files efficiently?
For example, in the case below, only the text lines would be extracted and sent for translation, then mapped back onto their timestamps:

1
00:00:01,000 --> 00:00:04,000
これは最初の字幕の例です。

2
00:00:05,000 --> 00:00:08,000
次の字幕はこれです。

Bikatr7 · 2024-08-28T15:34:16Z

Hi @dgoryeo

Not out of the box, no.

But I managed to get it working by editing the custom instructions and putting your text in a TXT file.

I used GPT with the following settings:

{
    "base translation settings": {
        "prompt_assembly_mode": 1,
        "number_of_lines_per_batch": 48,
        "sentence_fragmenter_mode": 2,
        "je_check_mode": 2,
        "number_of_malformed_batch_retries": 1,
        "batch_retry_timeout": 700,
        "number_of_concurrent_batches": 2,
        "gender_context_insertion": false,
        "is_cote": false
    },

    "openai settings": {
        "openai_model": "gpt-4-turbo",
        "openai_system_message": "As a Japanese to English subtitle translator, translate Japanese into English, everything else should remain in its original tense. You will receive text in roughly the format of '1 [newline] 00:00:01,000 --> 00:00:04,000 [newline] こんにちは。', in which you will only translate the Japanese and keep the rest as it was. In that case you would return '1 [newline] 00:00:01,000 --> 00:00:04,000 [newline] Hello.' The real text would have newlines which you would preserve. Keep pre-translated terms and anticipate names not replaced. Match the output's line count to the input's.",
        "openai_temperature": 0.3,
        "openai_top_p": 1.0,
        "openai_n": 1,
        "openai_stream": false,
        "openai_stop": null,
        "openai_logit_bias": null,
        "openai_max_tokens": null,
        "openai_presence_penalty": 0.0,
        "openai_frequency_penalty": 0.0
    },

    "gemini settings": {
        "gemini_model": "gemini-pro",
        "gemini_prompt": "As a Japanese to English translator, translate narration into English simple past, everything else should remain in its original tense. Maintain original formatting, punctuation, and paragraph structure. Keep pre-translated terms and anticipate names not replaced. Preserve terms and markers marked with >>><<< and match the output's line count to the input's. Note: 〇 indicates chapter changes.",
        "gemini_temperature": 0.3,
        "gemini_top_p": null,
        "gemini_top_k": null,
        "gemini_candidate_count": 1,
        "gemini_stream": false,
        "gemini_stop_sequences": null,
        "gemini_max_output_tokens": null
    },

    "deepl settings":{
        "deepl_context": "",
        "deepl_split_sentences": "ALL",
        "deepl_preserve_formatting": true,
        "deepl_formality": "default"
    }
    
}

I gave Kudasai your text as a TXT file and got this as output:

1
00:00:01,000 --> 00:00:04,000
This is an example of the first subtitle.
2
00:00:05,000 --> 00:00:08,000
The next subtitle is this one.
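
If the translated text comes back without the blank line that SRT expects between cues, as the sample above appears to, a small post-processing step can put the separators back. This is only a sketch and not part of Kudasai: `restore_cue_separators` is a hypothetical helper name, and it assumes every cue starts with a numeric index line followed directly by a timing line.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\s*$")


def restore_cue_separators(text: str) -> str:
    """Re-insert one blank line before every cue except the first.

    Hypothetical helper: a cue start is detected as a numeric line
    immediately followed by a timing line.
    """
    # Drop any existing blank lines so the result is uniform.
    lines = [line for line in text.splitlines() if line.strip()]
    out = []
    for i, line in enumerate(lines):
        starts_cue = (
            line.strip().isdigit()
            and i + 1 < len(lines)
            and TIMING.match(lines[i + 1]) is not None
        )
        if starts_cue and out:
            out.append("")  # blank separator between cues
        out.append(line)
    return "\n".join(out) + "\n"
```

Running it on the translated TXT output before renaming the file to `.srt` should give one blank line between the two cues.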

To do this more effectively, I imagine Kudasai would need a fair number of changes. It would have to accept file types other than TXT, which should be easy, and could perhaps gain an SRT mode that pulls the Japanese text out, translates it, and puts it back where it came from.
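
As a rough illustration of what such an SRT mode could look like, here is a minimal sketch that splits an SRT file into cues, pulls out only the text for translation, and rebuilds the file afterwards. None of this exists in Kudasai: `Cue`, `parse_srt`, `extract_text`, and `rebuild_srt` are hypothetical names, and the translation step itself is left out (the extracted lines would go through whatever translator you use, one output line per input line).

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


@dataclass
class Cue:
    index: str       # cue number, kept as text
    timing: str      # "start --> end" line, never sent to the translator
    text: list[str]  # one or more subtitle text lines


def parse_srt(srt: str) -> list[Cue]:
    """Split SRT content into cues (blocks separated by blank lines)."""
    cues = []
    normalized = srt.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.splitlines()
        # Skip anything that does not look like "index, timing, text...".
        if len(lines) >= 2 and TIMING.match(lines[1]):
            cues.append(Cue(lines[0].strip(), lines[1].strip(), lines[2:]))
    return cues


def extract_text(cues: list[Cue]) -> list[str]:
    """Return one line per cue, ready to be written to a TXT file.

    Assumption: multi-line Japanese cue text is joined without a space.
    """
    return ["".join(cue.text) for cue in cues]


def rebuild_srt(cues: list[Cue], translated: list[str]) -> str:
    """Pair each translated line with its original index and timing."""
    if len(translated) != len(cues):
        raise ValueError(
            f"expected {len(cues)} translated lines, got {len(translated)}"
        )
    blocks = [
        f"{cue.index}\n{cue.timing}\n{line}"
        for cue, line in zip(cues, translated)
    ]
    return "\n\n".join(blocks) + "\n"
```

Because only the subtitle text is sent out, the model never sees the indices or timestamps and cannot alter them. The line-count check in `rebuild_srt` is there to catch a batch where the translator merged or dropped lines.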

I unfortunately don't have the time to fully look into it, but I'd welcome anyone else to do it.

Bikatr7 · 2024-08-28T15:50:54Z

I won't personally be revisiting Kudasai for some time, as I'm very busy with other commitments, but I'll try to add direct support when I do, since it's an interesting use case.

Until then, that should be a good workaround.

dgoryeo · 2024-08-29T09:18:01Z

Thanks @Bikatr7. Much appreciated. I'll give it a try. Will let you know.
