tesseract-recognize creates negative word coordinates #153
Comments
Yes, we should use the
Cannot speak to that. |
The negative x coordinate causes an exception in
|
Should be covered by #152 now, too (sorry for inconvenience). I'll merge as soon as you give approval (again)... |
It now no longer creates negative word coordinates. I still get the |
@bertsky, @kba, the remaining problem with |
It's invalid by the schema, that's why the converter fails. We have already been trying to avoid these circumstances, but in case of
@kba, do you think we could easily solve that for all processors at once with a fix during serialization in core?
The alternative would be adding the following to all segmentation processors (right before
ro = pcgts.get_Page().get_ReadingOrder()
if ro and not ro.get_OrderedGroup() and not ro.get_UnorderedGroup():
    pcgts.get_Page().set_ReadingOrder(None)
|
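For readers who want to try the check quoted above in isolation, here is a minimal sketch. The classes `ReadingOrder`, `Page` and `PcGts` below are hypothetical stand-ins that copy only the accessor names used in the snippet; the real objects come from OCR-D core and are not reproduced here. The helper name `drop_empty_reading_order` is likewise an assumption for illustration, not part of any processor.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-ins: they mimic only the accessor names quoted in the
# thread, not the real PAGE bindings shipped with OCR-D core.
@dataclass
class ReadingOrder:
    ordered_group: Optional[object] = None
    unordered_group: Optional[object] = None

    def get_OrderedGroup(self):
        return self.ordered_group

    def get_UnorderedGroup(self):
        return self.unordered_group


@dataclass
class Page:
    reading_order: Optional[ReadingOrder] = None

    def get_ReadingOrder(self):
        return self.reading_order

    def set_ReadingOrder(self, reading_order):
        self.reading_order = reading_order


@dataclass
class PcGts:
    page: Page

    def get_Page(self):
        return self.page


def drop_empty_reading_order(pcgts) -> bool:
    """Remove a ReadingOrder that has neither an ordered nor an unordered group.

    Returns True if the element was removed, False otherwise.
    """
    page = pcgts.get_Page()
    ro = page.get_ReadingOrder()
    if ro is not None and not ro.get_OrderedGroup() and not ro.get_UnorderedGroup():
        page.set_ReadingOrder(None)
        return True
    return False


if __name__ == "__main__":
    doc = PcGts(Page(ReadingOrder()))      # empty ReadingOrder, as described in the thread
    print(drop_empty_reading_order(doc))   # the empty element is dropped
    print(doc.get_Page().get_ReadingOrder())
```

Wrapping the check in one helper would let each segmentation processor call it right before serialization, which is the per-processor alternative described above; a fix in core would make the call unnecessary.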
Yes. OCR-D/core#602 |
Thanks! Closing this issue – solved by #152 |
In a workflow with PPN1024726142, tesseract-recognize created a negative coordinate for page 11:
ocrd-transform fails to process that page, without an error message, when converting from PAGE to ALTO.
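To check a page for this kind of problem, a small sketch like the following may help. It assumes the usual PAGE `points` notation (`x,y` pairs separated by spaces). The function names, the sample coordinates and the page size are made up for illustration, and the thread does not say whether #152 fixes the problem by clamping, so treat this as one possible approach only.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def parse_points(points: str) -> List[Point]:
    """Parse a PAGE-style points string such as '10,20 30,20 30,40 10,40'."""
    result = []
    for pair in points.split():
        x, y = pair.split(",")
        result.append((int(x), int(y)))
    return result


def has_negative(points: str) -> bool:
    """Return True if any x or y value in the polygon is below zero."""
    return any(x < 0 or y < 0 for x, y in parse_points(points))


def clamp_points(points: str, width: int, height: int) -> str:
    """Clip every point into the rectangle [0, width] x [0, height]."""
    clamped = [
        (min(max(x, 0), width), min(max(y, 0), height))
        for x, y in parse_points(points)
    ]
    return " ".join(f"{x},{y}" for x, y in clamped)


if __name__ == "__main__":
    word = "-3,10 40,10 40,30 -3,30"       # invented sample polygon
    print(has_negative(word))              # True
    print(clamp_points(word, 100, 50))     # 0,10 40,10 40,30 0,30
```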