This repository has been archived by the owner on Sep 29, 2022. It is now read-only.
I've written a little script that tries to infer the reading for each kanji as it is used in a given compound. The way I'm doing it is by pulling all possible combinations of readings from KanjiDic, then running surface_forms over all possible segments to see if one of them matches the known reading.
However, this is occasionally failing because of onbin variation that occurs for the second segment of a word, e.g. 小学校. The reading for this is 小[しょう]学[がっ]校[こう] - the reading for 学 is an onbin variation on its onyomi: ガク. (Note: I'm using the terminology from the library - I'm not familiar enough with Japanese phonology to know what's actually happening on a linguistic level.)
Is this just irregular phonetic variation, or should surface_forms be modified so all but the last segment can undergo onbin variation?
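To make the question concrete, here is a rough sketch of the behaviour I'm asking about. It does not use the library: `READINGS` is a small hand-written stand-in for KanjiDic lookups, and `variants` and `infer_readings` are hypothetical helpers, not `surface_forms` itself. It assumes the variation in question is replacing a final く, き, つ or ち with っ on any non-final segment, and it ignores which sound follows, so it will over-generate candidates.

```python
from itertools import product

# Hand-written stand-in for KanjiDic lookups (hiragana only; hypothetical subset).
READINGS = {
    "小": ["しょう", "ちいさい", "こ", "お"],
    "学": ["がく", "まなぶ"],
    "校": ["こう"],
}

# Final kana that may be replaced by the small っ before a following segment.
GEMINATING_FINALS = "くきつち"


def variants(reading, is_last):
    """Return the reading plus a geminated form for non-final segments."""
    forms = {reading}
    if not is_last and len(reading) > 1 and reading[-1] in GEMINATING_FINALS:
        forms.add(reading[:-1] + "っ")
    return forms


def infer_readings(word, known_reading):
    """Return every per-kanji segmentation whose concatenation matches known_reading."""
    last = len(word) - 1
    options = [
        sorted({v for r in READINGS[k] for v in variants(r, i == last)})
        for i, k in enumerate(word)
    ]
    return [combo for combo in product(*options) if "".join(combo) == known_reading]


print(infer_readings("小学校", "しょうがっこう"))
```

With the variation restricted to the first segment only, 学 never gets the がっ candidate and the lookup for 小学校 comes back empty, which is the failure described above.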
Here is an MWE:
Which returns: