OCR-Arabic-Scripts

This project aims to design and train a model that reads images of scanned Arabic documents and generates the text written in them.

Objective
Dataset
Run the Project

Objective

This project implements a complete machine learning pipeline, which includes (but is not limited to) the following modules:

preprocessing module
feature extraction/selection module
model selection and training module
performance analysis module
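As a rough illustration of how the four modules listed above fit together, here is a minimal sketch on toy data. It is not the code in this repository: the function names, the fixed threshold, the projection-density features, and the nearest-centroid classifier are all assumptions chosen to keep the example short.

```python
import numpy as np

def preprocess(image, threshold=128):
    """Binarize a grayscale image: dark (ink) pixels become 1, background 0."""
    return (np.asarray(image) < threshold).astype(np.uint8)

def extract_features(binary):
    """Concatenate per-row and per-column ink densities into one vector."""
    return np.concatenate([binary.mean(axis=1), binary.mean(axis=0)])

def train(features, labels):
    """Hypothetical stand-in model: one mean feature vector per label."""
    features = np.asarray(features, dtype=float)
    label_array = np.array(labels)
    return {lab: features[label_array == lab].mean(axis=0)
            for lab in sorted(set(labels))}

def predict(model, feature):
    """Return the label whose centroid is closest to the feature vector."""
    return min(model, key=lambda lab: np.linalg.norm(feature - model[lab]))

def accuracy(predicted, expected):
    """Fraction of predictions that match the expected labels."""
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)

# Toy 4x4 "scans": white background (255) with one dark bar (0).
vertical = np.full((4, 4), 255)
vertical[:, 1] = 0
horizontal = np.full((4, 4), 255)
horizontal[1, :] = 0

# A noisy vertical bar with one extra ink pixel, used as a test sample.
noisy_vertical = vertical.copy()
noisy_vertical[0, 3] = 0

train_features = [extract_features(preprocess(img)) for img in (vertical, horizontal)]
model = train(train_features, ["vertical", "horizontal"])

test_images = [noisy_vertical, horizontal]
test_labels = ["vertical", "horizontal"]
predictions = [predict(model, extract_features(preprocess(img))) for img in test_images]
print("accuracy:", accuracy(predictions, test_labels))
```

Each stage only consumes the output of the previous one, so any stage can be swapped for a stronger technique without touching the others.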

Dataset

The dataset of images and their ground-truth text was obtained from the Watan-2004 Arabic text corpus, compiled by Dr. Mourad Abbas (http://sites.google.com/site/mouradabbas9/corpora).

N.B.: This corpus is for scientific use only. Any use of it to create and release other resources or software requires the authorization of Mourad Abbas.

Run

Dependencies

python3
numpy
opencv
skimage
scipy
matplotlib
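To confirm the dependencies above are available before running the project, a small check such as the following may help. It is a sketch rather than part of the repository; the helper name `missing_modules` is hypothetical. Note that OpenCV is imported as `cv2` (commonly installed from the `opencv-python` package) and skimage is provided by the `scikit-image` package.

```python
from importlib.util import find_spec

# Import names for the dependencies listed above.
# OpenCV is imported as `cv2`; scikit-image is imported as `skimage`.
REQUIRED_MODULES = ["numpy", "cv2", "skimage", "scipy", "matplotlib"]

def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All dependencies are importable.")
```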

Run the Project

$ python3 ./ocr.py
