Generate text images for training deep learning OCR models.
The code was only tested on Ubuntu 16.04.
Install dependencies:
pip3 install -r requirements.txt
Run python3 main.py to generate images and labels.txt in output/default/.
Run python3 main.py --help to see optional arguments.
If some characters in the corpus are not supported by your font, you will get bad results.
Run main.py with --strict to make the renderer retry sampling from the corpus until all characters in the sample are supported by a font.
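The retry idea behind --strict can be sketched as follows. This is only a minimal illustration, not the project's actual implementation: the function name pick_supported_sample, the max_retries limit, and the representation of the corpus as a list of strings are all assumptions made for the example.

```python
import random


def pick_supported_sample(corpus_lines, supported_chars, max_retries=100, rng=None):
    """Return a corpus line whose characters are all in supported_chars.

    Hypothetical helper: samples a random line, and retries up to
    max_retries times if the line contains an unsupported character.
    """
    rng = rng or random.Random()
    for _ in range(max_retries):
        text = rng.choice(corpus_lines)
        # Accept the sample only if the font can draw every character.
        if all(ch in supported_chars for ch in text):
            return text
    raise RuntimeError(
        "no fully supported sample found after %d retries" % max_retries
    )
```

A retry cap is used here so that a font covering almost none of the corpus fails loudly instead of looping forever; whether main.py applies such a cap is not stated in this README.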
Check how many characters of a charset your font does not support:
python3 tools/check_font.py
checking font ./data/fonts/eng/Hack-Regular.ttf
chars not supported(4971):
['第', '朱', '广', '沪', '联', '自', '治', '县', '驼', '身', '进', '行', '纳', '税', '防', '火', '墙', '掏', '心', '内', '容', '万', '警','钟', '上', '了', '解'...]
0 fonts support all chars(5071) in ./data/chars/chn.txt:
[]
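A coverage check of this kind can be sketched in a few lines. The example below is an assumption-laden illustration, not the code in tools/check_font.py: it assumes the fontTools package is installed, and the helper names font_codepoints and unsupported_chars are made up for the example.

```python
def unsupported_chars(charset, supported_codepoints):
    """Return the characters of charset whose code point is not supported."""
    return [ch for ch in charset if ord(ch) not in supported_codepoints]


def font_codepoints(font_path):
    """Collect the Unicode code points a font maps to glyphs.

    Assumes the third-party fontTools package; imported lazily so the
    pure-Python helper above works without it.
    """
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    try:
        # getBestCmap() returns {code point: glyph name}, or None if the
        # font has no Unicode cmap subtable.
        best_cmap = font["cmap"].getBestCmap() or {}
        return set(best_cmap.keys())
    finally:
        font.close()
```

For instance, calling unsupported_chars(chars, font_codepoints("./data/fonts/eng/Hack-Regular.ttf")) with chars read from ./data/chars/chn.txt would list the characters that font cannot render. How the charset file should be parsed is left out, since its format is not described here.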
Run python3 main.py --debug to save images with extra information.
You can see how perspectiveTransform works, along with all bounding and rotated boxes.
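To make the debug boxes easier to interpret, here is a pure-Python sketch of the per-point math that a perspective transform such as OpenCV's cv2.perspectiveTransform applies: each point is multiplied by a 3x3 homography and divided by the resulting third coordinate. The function names and the nested-list matrix layout are assumptions for illustration; this is not the renderer's own code.

```python
def perspective_transform(points, H):
    """Apply a 3x3 homography H (row-major nested lists) to (x, y) points."""
    out = []
    for x, y in points:
        # Homogeneous scale factor; dividing by it gives the perspective effect.
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        out.append((
            (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
        ))
    return out


def bounding_box(points):
    """Axis-aligned bounding box (x_min, y_min, x_max, y_max) of the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)
```

Transforming the four corners of a text box and then taking bounding_box of the result gives the axis-aligned box around the warped text, while the transformed corners themselves describe the rotated or skewed box.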
If you want to use the GPU to speed up image generation, first compile OpenCV with CUDA (see "Compiling OpenCV with CUDA support").
Then build the Cython part, and add the --gpu option when running main.py:
cd libs/gpu
python3 setup.py build_ext --inplace