Mongolian-text-recognition

The Mongolian alphabet of this dataset is "ᠴ᠊ᠬ ᠲᠵᠺ ᠷᠯᠢ᠎ᠫ ᠾ ᠱ ᠹ ᠽᠸᠧᠣᠪ ᠥᡁᠡᠰ ᠨ ᠤᠠᠼᠳ ᠦᠩ ᠶ ᠭ᠍ᠮ᠋ ᠦ‍". The data comes from the China Mongolian News Network (http://www.mgyxw.net). A Python script automatically segments the Mongolian text pictures into individual words, binarizes them, and saves each word as an image. The dataset contains 30,326 original Mongolian text images, with a vocabulary of 6,538 words. We augment the data with rotation, horizontal wave, distortion, and perspective transformations. After data augmentation, the dataset contains 98,085 Mongolian images. Mongolian-Regular3W.rar contains the original data. Mongolian-Irregular0.65W.rar contains data in which each vocabulary word is augmented once. Mongolian-Irregular6.5W.rar contains the augmented version of Mongolian-Regular3W.rar.
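To give a feel for one of the augmentations named above, here is a minimal sketch of a horizontal wave distortion. It is not the script used to build this dataset: the function name `horizontal_wave`, its parameters (`amplitude`, `period`, `fill`), and the choice of a sinusoidal per-row shift are all assumptions made for illustration.

```python
import math


def horizontal_wave(image, amplitude=2.0, period=16.0, fill=255):
    """Shift each row sideways by a sinusoidal offset (hypothetical helper).

    `image` is a list of rows of grayscale pixel values (0-255).
    Pixels that would be read from outside the image are set to `fill`,
    assumed here to be the white background of a binarized word image.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    out = []
    for y, row in enumerate(image):
        # The offset depends only on the row index, so the word bends
        # left and right as it runs down the image.
        shift = int(round(amplitude * math.sin(2.0 * math.pi * y / period)))
        new_row = [fill] * width
        for x in range(width):
            src = x - shift
            if 0 <= src < width:
                new_row[x] = row[src]
        out.append(new_row)
    return out
```

Because Mongolian words are written top to bottom, a row-wise shift of this kind bends the word along its writing direction. The amplitude and period would need tuning to the actual image sizes in the archives.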

The images are stored in the form label.jpg, and the label.txt file contains the path and label of each image.
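The following is a minimal sketch of how label.txt might be read. The exact layout of the file is not documented here, so the code assumes one entry per line, with the image path first and the label second, separated by spaces or tabs, in UTF-8. The helper names `parse_label_lines` and `read_labels` are hypothetical.

```python
import re


def parse_label_lines(lines):
    """Turn lines of the assumed form '<image path> <label>' into pairs.

    Only the first run of spaces or tabs is treated as the separator,
    so any spaces inside the Mongolian label itself are preserved.
    """
    pairs = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        parts = re.split(r"[ \t]+", line.lstrip(), maxsplit=1)
        if len(parts) != 2:
            continue  # skip lines without a label (assumption: malformed)
        pairs.append((parts[0], parts[1]))
    return pairs


def read_labels(label_file, encoding="utf-8"):
    """Read (path, label) pairs from a label file (encoding is an assumption)."""
    with open(label_file, encoding=encoding) as fh:
        return parse_label_lines(fh)
```

For example, `read_labels("label.txt")` would return a list of `(path, label)` tuples after one of the archives is extracted. Adjust the separator or encoding if the real file differs from these assumptions.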

Repository: ShaoDonCui/Mongolian-text-recognition
