Support material for Hangul filler #4
Comments
This was really insightful! Thanks so much. I left CJK out because I personally have zero understanding of anything CJK, despite reading the reports, CodePoints.net, and Wikipedia. Awesome-Unicode deserves a dedicated section on CJK in addition to an explanation of why the Hangul Filler is a thing (which you've done quite well). Would you be able to contribute a section on CJK?
@jagracey I think I'm a terrible writer :P (even in my native tongue). I can list some interesting bits about CJK and Unicode, however:
Probably more, but I cannot recall others right now.
@lifthrasiir so, how many methods are there to type Hangul? I only know two: precomposed syllables vs. jamos.
@nexusanphans You cannot type precomposed syllables directly (there are 11,172 modern ones); it takes multiple keystrokes to complete one syllable. Hangul is simple enough not to require the complex dictionary-based IME necessary for Japanese or Chinese, but complex enough to leave room for many clever optimizations. I'm aware of 30+ approaches, and the most popular ones include:
@lifthrasiir Thank you for your detailed answer, although what I meant to ask about was the low-level method (i.e. at the level of Unicode code points), for which I only know two: precomposed syllables vs. jamos (initial, medial, and final). The former uses the Hangul Syllables block (AC00–D7A3), while the latter uses the Hangul Jamo block along with two extended blocks. Precomposed syllables encode whole syllables, while jamos are analogous to Latin letters, but with separate initial and final versions to allow precise syllable breaking. There is another block, however: Hangul Compatibility Jamo (3130–318F). Looking at this Wikipedia page, it seems to be used only for compatibility with another, older encoding system (Unicode is too concerned with backward compatibility, IMO). It is supposed to behave like jamos but with no separation of initial vs. final consonants, which may complicate things since such an arrangement can form ambiguous syllables.
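To make the two code-point-level representations concrete, here is a minimal Python sketch of how one precomposed syllable maps onto conjoining jamos. The constants follow the Hangul syllable arithmetic in the Unicode Standard; the function name `decompose` and the sample syllable are only illustrative choices, not anything from this thread.

```python
import unicodedata

# Block starts and counts used by the Hangul syllable arithmetic.
S_BASE = 0xAC00   # first precomposed syllable
L_BASE = 0x1100   # first initial consonant (choseong)
V_BASE = 0x1161   # first medial vowel (jungseong)
T_BASE = 0x11A7   # one before the first final consonant (jongseong)
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28  # T_COUNT includes "no final"


def decompose(syllable: str) -> str:
    """Split one precomposed syllable into its conjoining jamos."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < L_COUNT * V_COUNT * T_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    l, rest = divmod(index, V_COUNT * T_COUNT)
    v, t = divmod(rest, T_COUNT)
    jamos = chr(L_BASE + l) + chr(V_BASE + v)
    if t:  # t == 0 means the syllable has no final consonant
        jamos += chr(T_BASE + t)
    return jamos


han = "\ud55c"  # the syllable "han", one code point in the Hangul Syllables block
parts = decompose(han)
print([f"U+{ord(c):04X}" for c in parts])

# The arithmetic should agree with canonical decomposition (NFD),
# and NFC should reassemble the jamos into the single syllable.
print(parts == unicodedata.normalize("NFD", han))
print(unicodedata.normalize("NFC", parts) == han)
```

The product 19 × 21 × 28 is where the 11,172 modern syllables come from, and it matches the size of the AC00–D7A3 range.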
That was the intention, but in practice compatibility jamos are used everywhere. In other words, modern Korean IMEs do operate at the combining jamo level before the commit, but the candidate text is materialized as compatibility jamos when they can't combine with each other. The motivating example would be the Korean equivalents of "lol" and "*sob*": "ㅋㅋㅋㅋㅋㅋㅋㅋㅋ" and "ㅠㅠ". They are not intended to compose, so "ㅋㅋㅋㅋ" followed by "ㅠㅠ", once written, should not compose into "ㅋㅋㅋ큐ㅠ" as combining jamos would. You can often see that composed form though, because people often don't signal that fact to the IME (by explicitly committing the text).
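The difference can be seen outside an IME too. The sketch below uses Unicode normalization (NFC) as a stand-in for "composition"; that is an assumption made for illustration, since an IME's own composition logic is separate from normalization. The variable names are hypothetical.

```python
import unicodedata

# "ㅋㅋㅋㅋㅠㅠ" written with Hangul Compatibility Jamo (U+314B, U+3160).
compat = "\u314b" * 4 + "\u3160" * 2

# The same letters written with conjoining jamos:
# U+110F is the initial-consonant form, U+1172 the medial-vowel form.
conjoining = "\u110f" * 4 + "\u1172" * 2


def nfc(text: str) -> str:
    """Apply canonical composition (NFC)."""
    return unicodedata.normalize("NFC", text)


def code_points(text: str) -> list[str]:
    """List the code points of a string for inspection."""
    return [f"U+{ord(c):04X}" for c in text]


compat_nfc = nfc(compat)
conjoining_nfc = nfc(conjoining)

# Compatibility jamos have no canonical composition, so NFC leaves them alone.
print(code_points(compat_nfc))

# Conjoining jamos do compose: the last initial consonant and the first vowel
# merge into one precomposed syllable, leaving three consonants before it
# and one vowel after it.
print(code_points(conjoining_nfc))
```

Under these assumptions the compatibility-jamo string survives unchanged, while the conjoining sequence collapses in the middle, which mirrors the "ㅋㅋㅋ큐ㅠ" form described above.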
This is feedback on the Hangul filler section. It was originally posted on Reddit, but I guess you can alter the wording to incorporate the following:
(Feel free to ask me about Hangul and more generally CJK support in Unicode.)