nk2028/yitizi

Yitizi

Given a Chinese character, return all of its variant characters.
輸入一個漢字,輸出它的全部異體字。
输入一个汉字,输出它的全部异体字。

Usage

Python

pip install yitizi
>>> import yitizi
>>> yitizi.get('和')
['咊', '龢']

JavaScript (Node.js)

npm install yitizi
> const Yitizi = require('yitizi');
> Yitizi.get('和');
[ '咊', '龢' ]

JavaScript (browser)

<script src="https://cdn.jsdelivr.net/npm/yitizi@0.0.3"></script>
> Yitizi.get('和');
[ '咊', '龢' ]

Design

To reduce data redundancy, only the orthodox-character-to-variant-characters mapping is stored in yitizi.csv:

The actual dictionary used by the library is yitizi.json, which is generated by the script build/main.py from yitizi.csv:

Rule 1: If yitizi.csv specifies that A is an orthodox character and B is one of its variant characters, then:

  • A is a variant character of B in yitizi.json
  • B is a variant character of A in yitizi.json

Rule 2: If yitizi.csv specifies that A is an orthodox character and B1, B2 are two of its variant characters, then:

  • B1 is a variant character of B2 in yitizi.json
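
The two rules can be sketched in a few lines of Python. This is an illustration only, not the code in build/main.py: the function name expand_variants and the dict-based input are hypothetical, the placeholder names A, B1 and B2 stand in for real characters, and Rule 2 is assumed to apply to every ordered pair of variants (so B2 also becomes a variant of B1). Any handling the real script does across multiple CSV rows is not modelled here.

```python
from collections import defaultdict
from itertools import permutations


def expand_variants(orthodox_to_variants):
    """Expand an orthodox -> variants mapping into a symmetric lookup table.

    Hypothetical helper for illustration; not part of the yitizi package.
    """
    table = defaultdict(set)
    for orthodox, variants in orthodox_to_variants.items():
        for variant in variants:
            # Rule 1: the orthodox character and each variant point at each other.
            table[orthodox].add(variant)
            table[variant].add(orthodox)
        # Rule 2: variants of the same orthodox character point at each other.
        # Assumption: applied to every ordered pair, so the relation is symmetric.
        for b1, b2 in permutations(variants, 2):
            table[b1].add(b2)
    # Sort each set so the result is deterministic and JSON-friendly.
    return {char: sorted(others) for char, others in table.items()}


print(expand_variants({"A": ["B1", "B2"]}))
# {'A': ['B1', 'B2'], 'B1': ['A', 'B2'], 'B2': ['A', 'B1']}
```

Storing only the one-directional mapping and deriving the rest keeps the source file small, at the cost of a build step before the dictionary can be used.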

For example, if we specify the following relationship in yitizi.csv:

The generated yitizi.json would be:

Note for developers

Before publishing a new release, update every occurrence of the version string.