Option to ignore accents for dictionary lookup #94

Open
1over137 opened this issue Mar 14, 2023 · 0 comments

1over137 commented Mar 14, 2023

Original issue on KOReader repository: koreader/koreader#10202. KOReader developers asked that I propose the feature to this project instead.

In Russian and many other languages, word stress is unpredictable but not normally marked, and dictionary headwords are typically written without accent marks. However, some instructional material does mark the accents, and there are tools that add them to ebooks for language learners who may not be familiar with the correct pronunciation. This poses a problem, as the fuzzy matching does not appear to be able to match an accented word to its unmarked headword. It would be great either to have the current fuzzy matching handle this, or to add a separate, explicit option to ignore accent marks.

For this, please ensure that both the NF(K)C and NF(K)D normalization forms are considered. In NF(K)D form the accent is a separate combining character following the letter, while in NF(K)C form the accent is combined with the letter into a single precomposed character where one exists.
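
To illustrate what I mean, here is a minimal sketch in Python. It is not a proposal for how this project should implement it, and `strip_stress` and `STRESS_MARKS` are hypothetical names of my own. It assumes that only the combining acute and grave accents count as stress marks, so that letters such as й and ё, which also decompose into a base letter plus a combining mark, are left alone. Decomposing to NFD first means precomposed and decomposed input are handled the same way.

```python
import unicodedata

# Assumption: only these two combining marks are used to mark stress.
# Other combining marks (e.g. the breve in "й" or the diaeresis in "ё")
# are part of the letter and must be preserved.
STRESS_MARKS = {"\u0300", "\u0301"}  # combining grave, combining acute


def strip_stress(word: str) -> str:
    """Return `word` without stress marks, in NFC form (hypothetical helper)."""
    # Decompose so that every accent becomes a separate combining character,
    # regardless of whether the input was precomposed or not.
    decomposed = unicodedata.normalize("NFD", word)
    # Drop only the marks we treat as stress marks.
    stripped = "".join(ch for ch in decomposed if ch not in STRESS_MARKS)
    # Recompose so the result can be compared against NFC headwords.
    return unicodedata.normalize("NFC", stripped)
```

With this, `strip_stress("молоко\u0301")` gives `"молоко"`, and a precomposed character such as `"\u00e1"` (á) gives `"a"`. The lookup could then try the stripped form whenever the original word is not found.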
