This repository has been archived by the owner on Sep 25, 2024. It is now read-only.
Deep learning model for OCR of document fields
rossumai/OCkRE
OCkRE is a deep neural model for OCR of image crops that contain arbitrary values, without making assumptions about the text. In an OCR pipeline, you first segment the page, detect lines, and then run OCR on a straight segment to transcribe its text. OCkRE does only this last step. You cannot use it on a whole page, and it is unsuitable even for whole lines; it is designed to transcribe only shorter data fields.

OCkRE is originally based on Keras' image_ocr. The architecture is quite similar, but the data is different, synthetic data support is included, and more extensive augmentation is performed.

OCkRE within Rossum's pipeline currently uses a tweaked architecture as well as significantly more advanced augmentation. Our plan is to eventually synchronize our internal version and the GitHub version, but we don't have a timeline for that yet.

Dependencies:

- Python 2.7
- Keras 1.2.2
- Cairocffi
- Matplotlib
- PIL
- TensorFlow (a functional CUDA pipeline with a discrete NVIDIA GPU is highly recommended for training, but is not necessary for classification)
- Further dependencies not met by a default installation may vary depending on the user's operating system or distribution. Testing has only been done on Ubuntu Linux!

Missing typefaces shouldn't lead to crashes, but they may produce visually empty or garbled synthetic samples, which will degrade the training data.

OCkRE comes in the form of three Python modules, plus two auxiliary Python scripts that serve as a quick way of testing the main functionality of OCkRE with no additional user input required.

The three OCkRE system modules are as follows:

- ockre.py, which contains the classifier function, the training function, and some of the training utilities.
- synthset.py, which contains what's left of the dataset handling system: more or less just the skeleton of the dataset iterator, operating in a more or less dummy mode, as all samples are always generated synthetically.
- fakestrings.py, which contains an assortment of functions for creating synthetic text strings to be used for synthetic sample generation.

This release also includes two auxiliary testing scripts and a packaged model. The scripts take no external options; they have to be modified manually for anything other than the default functionality.

- quicktest.py (run with "python2 quicktest.py"), which initialises the OCkRE classifier, loads the packaged model weights, synthesises five images resembling training samples, and performs classification on them. The results are saved as .png raster files that show the synthetic sample, the gold label string used for its generation, and OCkRE's classification of the synthetic raster image.
- traintest.py (run with "python2 traintest.py"), which initialises the OCkRE classifier and attempts to train it from scratch. The training log and the resulting weights files are saved in the img_ocr directory, which traintest.py creates. To actually use these weights for classification, the user has to modify quicktest.py.
- densified_labeltype_best.h5, a packaged model with trained weights, which should allow immediate classification.
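To give a rough idea of the synthetic string generation that fakestrings.py is described as providing, here is a minimal sketch using only the Python standard library. It is not taken from fakestrings.py: the function names (`fake_date`, `fake_amount`, `fake_identifier`, `fake_field_strings`) and the chosen field formats are assumptions made purely for illustration, and the actual module may generate different kinds of strings in a different way.

```python
import random
import string


def fake_date(rng):
    """Return a date-like string, e.g. DD.MM.YYYY (hypothetical format)."""
    day = rng.randint(1, 28)
    month = rng.randint(1, 12)
    year = rng.randint(1990, 2030)
    sep = rng.choice([".", "/", "-"])
    return "%02d%s%02d%s%04d" % (day, sep, month, sep, year)


def fake_amount(rng):
    """Return an amount-like string with thousands separators and two decimals."""
    whole = rng.randint(0, 999999)
    cents = rng.randint(0, 99)
    return "{:,}.{:02d}".format(whole, cents)


def fake_identifier(rng, max_len=12):
    """Return a random uppercase alphanumeric identifier of 4..max_len characters."""
    length = rng.randint(4, max_len)
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


# Hypothetical pool of generators; a real generator set would likely be larger.
GENERATORS = [fake_date, fake_amount, fake_identifier]


def fake_field_strings(count, seed=None):
    """Return `count` synthetic field values, reproducible for a given seed."""
    rng = random.Random(seed)
    return [rng.choice(GENERATORS)(rng) for _ in range(count)]


if __name__ == "__main__":
    for value in fake_field_strings(5, seed=0):
        print(value)
```

Strings produced this way would serve as gold labels: each one would be rendered into a raster image (for example with Cairocffi) and paired with its source string for training or for a quick classification check.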