Missing information on text orientation (hOCR property textangle) with LSTM #2303

Closed
stweil opened this issue Mar 8, 2019 · 4 comments

Member

stweil commented Mar 8, 2019

Tesseract 4 writes the required hOCR property textangle when using the old (legacy) OCR engine, but not when using LSTM. See ocropus/hocr-tools#148 for examples and more details.
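
To check whether a given hOCR output actually carries the property, one can inspect the title attribute of the line elements. The following is only a minimal sketch: it assumes the usual hOCR layout where properties are separated by semicolons and textangle is followed by a single number in degrees. The helper names and the two sample title strings are made up for illustration and are not taken from Tesseract output.

```python
def parse_hocr_title(title):
    """Split an hOCR title attribute into {property_name: [values, ...]}."""
    props = {}
    for chunk in title.split(";"):
        parts = chunk.split()
        if parts:
            props[parts[0]] = parts[1:]
    return props


def textangle(title):
    """Return the textangle in degrees, or None if the property is absent."""
    values = parse_hocr_title(title).get("textangle")
    return float(values[0]) if values else None


# Hypothetical title attributes, invented for this example.
with_angle = "bbox 36 92 96 116; textangle 90; x_size 30"
without_angle = "bbox 36 92 96 116; baseline 0 -5"

print(textangle(with_angle))
print(textangle(without_angle))
```

A line without the property yields None, which is the symptom described in this issue for LSTM output.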

stweil added this to the 4.1.0 milestone Mar 8, 2019
stweil changed the title from "Missing information on text orientation (hOCR property textrotation) with LSTM" to "Missing information on text orientation (hOCR property textangle) with LSTM" Mar 8, 2019
Member Author

stweil commented Mar 8, 2019

The property should be written here.

tesseract-ocr deleted a comment from GSATHYANARAYANA May 5, 2020
Member Author

stweil commented May 5, 2020

@bertsky, the information is also missing from the ALTO XML output (ALTO has a ROTATION attribute for it).

Contributor

bertsky commented May 5, 2020

Interesting. This should already write the orientation angle (i.e. a multiple of 90°) on current master. It uses PageIterator.Orientation(), which does not depend on (legacy) OSD but on the normal page segmentation included in PSM_AUTO, PSM_AUTO_OSD, PSM_AUTO_ONLY and PSM_SPARSE_TEXT_OSD. So it should work with any LSTM model loaded. I wonder why this issue even exists. (I have not tried it myself yet.)

But then, why does the code only use the orientation value and not add the skew angle?
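
To make the question concrete, here is a rough sketch of what combining the two values could look like. It is not the Tesseract implementation. The assumptions: the orientation is an index 0..3 (page up, right, down, left), the coarse angle is taken as 360 minus 90 times that index, and the skew is given in radians with the same rotation sense as the coarse angle. The function name text_angle and both conventions are illustrative and would need checking against the renderer code.

```python
import math


def text_angle(orientation_index, skew_radians=0.0):
    """Combine a 90-degree orientation step with a skew angle.

    Returns degrees in [0, 360). Mapping and sign convention are assumptions.
    """
    if orientation_index not in (0, 1, 2, 3):
        raise ValueError("orientation index must be 0, 1, 2 or 3")
    coarse = (360 - 90 * orientation_index) % 360  # assumed index-to-angle mapping
    fine = math.degrees(skew_radians)              # assumed to share the rotation sense
    return (coarse + fine) % 360


print(text_angle(1))                    # orientation only
print(text_angle(1, math.radians(1.5))) # orientation plus a small skew
```

With a zero skew this reduces to the pure multiple of 90°, so writing the combined value would not change output for pages that are already level.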

@stweil Sure, I'll do the same thing in #2815 as well.

Member Author

stweil commented May 6, 2020

> This should already write the orientation angle (i.e. multiple of 90°) on current master.

Indeed, this is fixed in 4.1.1 and on master, which both write the textangle property, so I am closing this issue.

stweil closed this as completed May 6, 2020
amitdo added the OSD Orientation and Script Detection label May 14, 2020