U+66FF 「替」 has the wrong glyph! #17

Open
Hulenkius opened this issue Aug 4, 2020 · 6 comments

Comments

@Hulenkius

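In your font, 「替」 (U+66FF) is drawn with the 口 (mouth) radical. But 「替」 with a mouth radical is a different character, 「𠾱」 (U+20FB1). 「𠾱」 is a variant form of 「噆」 (U+5646), whose reading is given by the fanqie 子感反 and which means "to hold in the mouth, to bite". It has nothing in common with 「替」, so the glyph is wrong. The root cause is presumably that the machine character recognition (OCR) got muddled and could no longer tell right from wrong.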

I have compared this font, Qiji (齊伋體), with PMingLiU (新細明體) in the image below: Qiji comes first, PMingLiU second. I hope you will take a close look!
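The three characters in this report are easy to confuse by eye, so it can help to list them by code point. A minimal Python sketch, using only the code points quoted in the comment above (the role labels and the `describe` helper are illustrative names, not part of the project):

```python
# Code points quoted in the report; the role labels are only for illustration.
REPORTED = {
    "character the user typed": 0x66FF,          # 替
    "character the glyph actually shows": 0x20FB1,  # 𠾱
    "character that 𠾱 is a variant of": 0x5646,  # 噆
}

def describe(cp: int) -> str:
    """Format a code point as 'U+XXXX' followed by the character itself."""
    return f"U+{cp:04X} {chr(cp)}"

for role, cp in REPORTED.items():
    print(f"{role}: {describe(cp)}")
```

All three are separate Unicode code points, which is why a glyph for one appearing at another counts as an error rather than a stylistic choice.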

@LingDong-
Owner

Hi @Hulenkius, you're totally right. I will fix it soon, along with other OCR errors. Thanks a lot for pointing this out!

@Hulenkius
Author

(Unicode Standard)
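@LingDong- I have one more question. In this font a great many glyph shapes are shared by several code points, all on the grounds that they are variant forms. That may show off their novelty, but it goes against the rules of Unicode. So I would ask for a new, separate font in which code points and shapes match, with standard and variant forms each in their proper place: the standard shape sits at the standard code point, each variant shape sits at its own code point, and anyone who wants a variant simply enters that variant's code point. Wouldn't that be lovely?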
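The "one shape, many code points" situation described here can be found mechanically by inverting a font's character map. The sketch below assumes the map is available as a plain dict from code point to glyph name; the sample entries and glyph names are hypothetical and are not taken from the actual font. For a real font file, a dict of this shape could be obtained with a library such as fontTools (`TTFont.getBestCmap()`).

```python
from collections import defaultdict

# Hypothetical cmap excerpt (code point -> glyph name); glyph names are made up.
cmap = {
    0x66FF: "glyph_A",   # 替
    0x20FB1: "glyph_A",  # 𠾱, sharing a shape with 替 in this made-up example
    0x5646: "glyph_B",   # 噆
}

def shared_glyphs(cmap):
    """Return {glyph name: [code points]} for glyphs used by more than one code point."""
    by_glyph = defaultdict(list)
    for cp, glyph in sorted(cmap.items()):
        by_glyph[glyph].append(cp)
    return {g: cps for g, cps in by_glyph.items() if len(cps) > 1}

for glyph, cps in shared_glyphs(cmap).items():
    print(glyph, [f"U+{cp:04X}" for cp in cps])
```

Each entry in the result is a candidate for the manual review the request implies: either the code points are genuine variants that share a shape on purpose, or one of them is a misassignment.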

@Hulenkius
Author

@LingDong- 「霏」 is wrongly rendered as 「霑」. An OCR error.

@LingDong-
Owner

Thanks @Hulenkius for pointing out the 霏 mistake! Will fix.

The Unicode correspondence also sounds like a good idea, but identifying the characters manually could take a lot of work. Since the source books use 異體 (variant forms) extensively, most glyphs might end up at obscure Unicode code points while the most commonly used ones are missing and have to be fallbacks. I'll do some checks (it's been a while since I last touched the project) and see what I can do!

@Hulenkius
Author

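I have found several more wrong characters. In the table, the left column is the intended character and the right column is the character it was wrongly rendered as.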
刊 | 可
剌 | 刺
勑 | 𤍂
卬 | 卭
啖 | 㗖
啓 | 啟
啟 | 啓
怂 | 怱
斡 | 榦

This image shows Source Han Sans (思源黑體) compared with Qiji.
螢幕快照_2020-08-07_23-43-37

@LingDong-
Owner

Hi @Hulenkius ,

Thanks so much for finding all these errors! Now fixed in the new release: https://github.com/LingDong-/qiji-font/releases/tag/v0.0.2

Some details:

  • 替、卬、勑: These don't appear in the scanned books, so they now use the fallback glyphs generated from Source Han Serif.
  • 卭、𤍂: These are now assigned to the correct Unicode code points.
  • 刊 does not appear in the scanned books, but its 異體 (variant form) 栞 does, so both 栞 and 刊 now map to the same glyph. (This is the strategy the project uses consistently to cover most code points with authentic glyphs.)
  • 怱: Now corrected. 匆, the more common form, is also mapped to the same glyph, as above.
  • 啓啟: Funny error. Now swapped.
  • 啖㗖: I believe the latter is a 異體 of the former (the source character), so they map to the same glyph.
  • 斡榦 (and 𠏉): Very confusing, but I think I've now got it right: 𠏉 and 斡 are equivalent (異體), while 榦 is a different character.
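The mapping strategy described in the list above (use a scanned glyph if the character or one of its variants was found in the books, otherwise fall back to a generated glyph) can be sketched roughly as follows. This is not the project's actual code: `build_cmap`, the glyph name strings, and the sample data are assumptions made for illustration, with the variant pairs taken from the comment.

```python
# Hypothetical inventory of glyphs extracted from the scans, keyed by character.
scanned = {"栞": "scan:栞", "怱": "scan:怱"}

# Variant groups mentioned in the comment above.
variant_groups = [["刊", "栞"], ["怱", "匆"]]

def build_cmap(wanted, scanned, variant_groups):
    """Map each wanted character's code point to a scanned glyph when the
    character or one of its variants was scanned, else to a fallback name."""
    variants_of = {ch: group for group in variant_groups for ch in group}
    cmap = {}
    for ch in wanted:
        # Prefer the character itself, then its variants in listed order.
        candidates = [ch] + [v for v in variants_of.get(ch, []) if v != ch]
        glyph = next((scanned[c] for c in candidates if c in scanned), None)
        cmap[ord(ch)] = glyph if glyph is not None else f"fallback:{ord(ch):04X}"
    return cmap

cmap = build_cmap(["刊", "栞", "怱", "匆", "替"], scanned, variant_groups)
for cp, glyph in cmap.items():
    print(f"U+{cp:04X} {chr(cp)} -> {glyph}")
```

With this sample data, 刊 and 栞 share one scanned glyph, 怱 and 匆 share another, and 替, which has no scanned form or listed variant, receives a fallback name.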

New rendering:

image
