Line Remover NN 🚀

Introduction

This repository uses PyTorch to remove ruled lines from an image while reconstructing the characters that overlap those lines. The goal of the model is to make word recognition by OCR easier.

Requirements

🐍 Python >3.9 (this lower bound is unverified and still needs testing)

Install Requirements

python -m pip install -r requirements.txt

Install IAM Dataset 🗒️

First, go to the /data/ directory and run python downloadData.py

Preprocess the data (generate pages and split them into blocks)

Generate synthetic pages

Stay in the /data/ directory and run python MakeDataset.py --output [output directory, default: ./] --pages [number of pages to generate, default: 1000] --split [whether to split the pages into blocks directly, default: False]
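To illustrate what a synthetic training pair could look like, here is a minimal sketch that draws horizontal ruled lines over a clean grayscale page and keeps the clean page as the target. It is not taken from MakeDataset.py: the function add_ruled_lines and its parameters (spacing, thickness, ink, seed) are hypothetical, and the real script may generate pages quite differently.

```python
import random


def add_ruled_lines(page, spacing=40, thickness=2, ink=0, seed=None):
    """Return a copy of `page` with horizontal ruled lines drawn on it.

    `page` is a grayscale image stored as a list of rows (0 = black, 255 = white).
    Every parameter here is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    height = len(page)
    ruled = [row[:] for row in page]  # copy, so the clean page stays untouched
    y = rng.randrange(spacing)  # random vertical offset of the first line
    while y < height:
        for dy in range(thickness):
            if y + dy < height:
                # The darker value wins, so existing strokes are kept where they cross a line.
                ruled[y + dy] = [min(px, ink) for px in ruled[y + dy]]
        y += spacing
    return ruled


clean = [[255] * 64 for _ in range(64)]  # blank white 64x64 "page"
ruled = add_ruled_lines(clean, spacing=16, thickness=1, seed=0)
pair = (ruled, clean)  # (model input, reconstruction target)
```

Pairing the ruled page with its clean original is what would let a model learn to restore the strokes hidden under the lines.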

Split pages into 512x512 blocks (not necessary if --split was specified earlier)

Run python processBlock.py --dir [directory that contains the pages and where the blocks will be generated, default: ./]
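As a rough illustration of the splitting step, the sketch below computes the crop boxes that tile a page into 512x512 blocks. The helper block_boxes is hypothetical and is not part of processBlock.py; how the real script treats pages whose size is not a multiple of 512 is not stated here, so the sketch simply lets edge blocks extend past the page border.

```python
def block_boxes(width, height, block=512):
    """Return (left, upper, right, lower) boxes that tile a width x height page.

    Boxes on the right and bottom edges may extend past the page, so a caller
    would need to pad those blocks. This edge policy is an assumption.
    """
    boxes = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            boxes.append((left, top, left + block, top + block))
    return boxes


boxes = block_boxes(1024, 512)
print(boxes)  # [(0, 0, 512, 512), (512, 0, 1024, 512)]
```

Each box uses the (left, upper, right, lower) ordering that Pillow's Image.crop expects, so it could be passed to that method directly.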

Train Model 🧑‍🏫

Run python train.py --epoch [number of epochs to train, default: 50] --dataset [dataset path] --load [optional: whether to load the best saved model]
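To make the shape of this command line concrete, here is a minimal argparse sketch that accepts the same three options. It is not the parser from train.py: treating --load as a boolean switch and --dataset as required are both assumptions, and the path ./data/blocks is only a placeholder.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented train.py options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Train the line-removal model (sketch)")
    parser.add_argument("--epoch", type=int, default=50, help="number of epochs to train")
    # Assumption: no default is documented for the dataset path, so it is required here.
    parser.add_argument("--dataset", type=str, required=True, help="dataset path")
    # Assumption: --load is a switch; the real script may expect a value instead.
    parser.add_argument("--load", action="store_true", help="load the best saved model")
    return parser


# Equivalent to: python train.py --dataset ./data/blocks --load
args = build_parser().parse_args(["--dataset", "./data/blocks", "--load"])
print(args.epoch, args.dataset, args.load)
```

Under these assumptions, omitting --epoch falls back to the documented default of 50.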

Usage

You can use the functions in infer.py, such as processImg, processImgs, and splitAndProcessImg. See the description of each function in the source for documentation.

Inspiration

Model structure: Gold, C., Zesch, T. (2022). CNN-Based Ruled Line Removal in Handwritten Documents. In: Porwal, U., Fornés, A., Shafait, F. (eds) Frontiers in Handwriting Recognition. ICFHR 2022.

Word recognition model for evaluation: MLTU Tutorials

PastaLaPate/LineRemoverNN
