Romanizing each single character after tagging and tokenization #64
yaserahmady
If I understand correctly, the issue is that you want to be able to turn sequences like
Since you are working to learn Japanese, however, I would strongly recommend that you don't do this and instead focus on learning kana first, without relying on romaji.
Hello there, great job on cutlet, thank you!
I've made a little tool for myself for learning Japanese and taking notes during class. While learning kana, I like using this tool to quickly romanize some Japanese text and convert it to Ruby characters, with a little markup that I can paste into the Obsidian app (the plugin I use to parse this markup is https://github.com/steven-kraft/obsidian-markdown-furigana).
It mostly works fine, but here is an example of the current output, which I'd love to improve:
にほんご で あいさつ を しましょう
I'd like to have the Ruby text for each single character, like this expected output:
にほんご で あいさつ を しましょう
The issue is that it looks like you get the correct romanization from a set of rules plus unidic, so I cannot simply loop through each character and call
katsu.romaji(single_char)
because the romanized result would often be incorrect, and this approach would also treat しょ as two separate characters.
This is my whole script:
This is what I tried to get single-character romanization:
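For illustration only (this is not the script from the post), here is a minimal sketch of the grouping step mentioned above: splitting kana so that a small kana such as ょ stays attached to the character before it, so しょ comes out as one unit. The function name `split_kana_units` and the `SMALL_KANA` set are hypothetical, not part of cutlet, and the sketch does not romanize anything. It also does not address context-dependent readings such as particles, which per-unit romanization would still get wrong.

```python
# Hypothetical helper, not part of cutlet: group kana into per-unit chunks.
# Assumption: only these small kana attach to the preceding character.
# The sokuon (っ) and the long-vowel mark (ー) are left as separate units.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def split_kana_units(text: str) -> list[str]:
    """Split text into characters, merging small kana into the previous unit."""
    units: list[str] = []
    for ch in text:
        # Attach a small kana to the previous unit, unless there is no
        # previous unit or the previous unit is whitespace.
        if ch in SMALL_KANA and units and not units[-1].isspace():
            units[-1] += ch
        else:
            units.append(ch)
    return units


print(split_kana_units("しましょう"))  # ['し', 'ま', 'しょ', 'う']
```

Each unit could then be paired with its own Ruby text, but where that romanization comes from is left open here.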