To generate a synthetic data set from Mongolian song lyrics and a dictionary, first install all fonts from fonts. Then run the following commands:
mkdir images
./generate_from_dictionary.py > synthetic.csv
./generate_from_lyrics.py >> synthetic.csv
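The font installation step is not spelled out here. As a minimal sketch, assuming a Linux system with fontconfig and assuming fonts is a local directory of .ttf or .otf files, the fonts could be copied into the per-user font directory before running the generators. The helper names `install_fonts` and `refresh_font_cache` are hypothetical and are not part of this repository.

```python
import shutil
import subprocess
from pathlib import Path


def install_fonts(src_dir, dest_dir=None):
    """Copy .ttf/.otf files from src_dir into a per-user font directory."""
    src = Path(src_dir)
    # Assumption: ~/.local/share/fonts is scanned by fontconfig on this system.
    dest = Path(dest_dir) if dest_dir else Path.home() / ".local" / "share" / "fonts"
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for font in sorted(src.iterdir()):
        if font.suffix.lower() in {".ttf", ".otf"}:
            shutil.copy2(font, dest / font.name)
            copied.append(font.name)
    return copied


def refresh_font_cache():
    """Rebuild the fontconfig cache; returns False if fc-cache is unavailable."""
    if shutil.which("fc-cache") is None:
        return False
    subprocess.run(["fc-cache", "-f"], check=True)
    return True
```

Calling `install_fonts("fonts")` followed by `refresh_font_cache()` would cover the typical case. On other platforms, install the fonts through the operating system's own font manager instead.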
You can also download an already generated synthetic data set from here.
To be released.
Download a pre-trained model from here. To run OCR on an image, execute:
python ocr.py --checkpoint image2bichig-epoch-0157.pth test.jpg
ᠮᠢᠨᠦ ᠨᠤᠲᠠᠭ
ᠬᠡᠨᠲᠡᠢ ᠂ ᠬᠠᠩᠭᠠᠢ᠂ ᠱᠣᠶᠣᠨ ᠤ ᠥᠨᠳᠥᠷ ᠰᠠᠶ᠋ᠢᠬᠠᠨ ᠨᠢᠷᠤᠭᠤᠨ ᠤᠳᠨ
ᠬᠣᠶᠢᠲᠤ ᠵᠦᠭ ᠦᠨ ᠴᠢᠮᠡᠭ ᠪᠣᠯᠤᠭᠰᠠᠨ ᠣᠢ ᠬᠥᠪᠴᠢ ᠶᠢᠨ ᠠᠭᠤᠯᠠᠨ ᠤᠳ
ᠮᠠᠨᠠᠨ ᠮᠠᠷᠭᠠ᠂ ᠨᠣᠮᠢᠨ ᠤ ᠥᠷᠭᠡᠨ ᠶᠡᠬᠡ ᠭᠣᠪᠢ ᠤᠳᠨ
ᠡᠮᠦᠨᠡ ᠵᠦᠭ ᠦᠨ ᠮᠠᠩᠯᠠᠢ ᠪᠣᠯᠤᠭᠰᠠᠨ ᠡᠯᠡᠯᠡᠳ ᠮᠠᠩᠬᠠᠨ ᠳᠠᠯᠠᠢ ᠤᠳ
ᠡᠨᠡ ᠪᠣᠯ ᠮᠢᠨᠦ ᠲᠦᠷᠦᠭᠰᠡᠨ ᠨᠤᠲᠤᠭ ᠮᠣᠩᠭᠣᠯ ᠤᠨ ᠰᠠᠶ᠋ᠢᠬᠠᠨ ᠣᠷᠣᠨ
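To process several images with the same checkpoint, the command above can be wrapped in a small loop. This is a sketch only: it assumes that ocr.py accepts one image per call, as in the example, and that it prints the recognized text to standard output. The function names `build_ocr_command` and `ocr_directory` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_ocr_command(checkpoint, image):
    """Return the argument list for one ocr.py call, mirroring the command above."""
    return [sys.executable, "ocr.py", "--checkpoint", str(checkpoint), str(image)]


def ocr_directory(checkpoint, image_dir, pattern="*.jpg"):
    """Run ocr.py once per matching image and collect its standard output."""
    results = {}
    for image in sorted(Path(image_dir).glob(pattern)):
        # Assumption: ocr.py prints the recognized text to standard output.
        done = subprocess.run(
            build_ocr_command(checkpoint, image),
            capture_output=True,
            text=True,
            encoding="utf-8",
            check=True,
        )
        results[image.name] = done.stdout.strip()
    return results
```

For example, `ocr_directory("image2bichig-epoch-0157.pth", "scans")` would return a dictionary mapping each .jpg file name in a scans directory to its recognized text. Each call reloads the model, so this is convenient rather than fast.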
You can also try it online on Colab here.