Manually correcting segmentation #323

Open
nederhof opened this issue Jan 27, 2019 · 2 comments

Comments

nederhof commented Jan 27, 2019

I know that the files *.pseg.png store the coordinates of the automatic line segmentation. I have seen mention on the Web of the use of GIMP for manipulating these coordinates, but without further details on how this is done. When I open GIMP on these files, I see nothing that shows the segmentation, nor anything that I can edit manually.

For background: I am trying to do OCR for some documents that are in a poor state, with smudges and faded ink, and no matter how much image preprocessing I do, automatic segmentation fails on at least some parts of the page. I see manual adjustment as the only viable way forward. That is, I would like to manually remove lines, add new lines, change the positions of lines, and ideally also change the order of lines, before the actual OCR is done. Can I insert this manual correction into the usual OCRopus workflow?

Thanks in advance for your time.

Mark-Jan Nederhof

wrznr commented Jan 31, 2019

@nederhof I don't think that this is possible since ocropy works on images rather than metadata (i.e. some files indicating the positions of the segments in terms of coordinates). What you could do, however, is use a third-party tool like Transkribus or Aletheia to do the initial layout/line recognition. Both tools offer options for manual post-correction. When you're done, export the result, extract the lines from your original images, and run ocropy for text recognition.

Come to think of it, you could also use ocropus-hocr to create hOCR files for your initial recognition. Import them into Aletheia, correct the segmentation, export, and rerun ocropy.

Collaborator

zuphilip commented Mar 5, 2019

The *.pseg.png files are normal PNG files in which the color of each pixel encodes the layout information; see https://github.com/tmbdev/ocropy/wiki/OCRopus-File-Formats#physical-layout for more information.
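To make the color encoding a bit more concrete, here is a minimal Python sketch that lists the distinct labels in such a file. It rests on assumptions that should be checked against the wiki page above: that each pixel packs its label as 0xRRGGBB, that white (or black) is background, and that the red channel holds the column number while green and blue hold the line number. The helper names (`rgb_to_label`, `summarize_labels`, `load_pixels`) are made up for this illustration and are not part of ocropy. If the assumptions hold, neighbouring lines differ only by tiny color steps, which may be why nothing looks like a segmentation when the file is opened in GIMP.

```python
from collections import Counter


def rgb_to_label(rgb):
    """Pack an (R, G, B) triple into a single integer 0xRRGGBB."""
    r, g, b = rgb
    return (r << 16) | (g << 8) | b


def summarize_labels(pixels):
    """Count pixels per distinct label, ignoring background.

    Assumption: white (0xFFFFFF) and black (0x000000) are background.
    """
    counts = Counter(rgb_to_label(p) for p in pixels)
    counts.pop(0xFFFFFF, None)
    counts.pop(0x000000, None)
    return dict(sorted(counts.items()))


def load_pixels(path):
    """Read a PNG into a flat list of (R, G, B) tuples (requires Pillow)."""
    from PIL import Image  # imported lazily so the demo below runs without Pillow

    with Image.open(path) as im:
        raw = im.convert("RGB").tobytes()
    return [tuple(raw[i:i + 3]) for i in range(0, len(raw), 3)]


# Synthetic stand-in for a tiny pseg image: 4 background pixels and two "lines".
demo = [(255, 255, 255)] * 4 + [(1, 0, 1)] * 3 + [(1, 0, 2)] * 2

for label, n in summarize_labels(demo).items():
    # Assumed split: red = column, green/blue = line number.
    print(f"0x{label:06x}: column={label >> 16}, line={label & 0xFFFF}, pixels={n}")
```

On a real page, `summarize_labels(load_pixels("0001.pseg.png"))` (the file name is only an example) would show how many labelled lines the segmenter produced and how large each one is.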

For some time now, it has also been possible to use masks to help the layout segmentation. I think this is not yet well documented, but you can have a look at the initial pull request.
