Decrease reliance on non-Python APIs #2

jczaplew · 2018-08-20T20:10:02Z

The OCR step could be streamlined somewhat by using a Python library such as tesserocr or pyocr instead of shell scripts.
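A rough sketch of what an in-process OCR call might look like with pyocr, assuming the page images already exist on disk. The names `ocr_pages` and `pyocr_recognizer` are hypothetical and not part of this repository; pyocr, Pillow, and a Tesseract install are assumed to be available at call time.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def ocr_pages(image_paths: Iterable[str], recognize: Callable[[Path], str]) -> List[str]:
    """Run `recognize` on each page image and return the text, one entry per page."""
    return [recognize(Path(p)) for p in image_paths]


def pyocr_recognizer(lang: str = "eng") -> Callable[[Path], str]:
    """Build a recognizer backed by pyocr (hypothetical helper).

    Imports are deferred so this module loads even when pyocr is not installed.
    """
    import pyocr
    import pyocr.builders
    from PIL import Image

    tools = pyocr.get_available_tools()
    if not tools:
        raise RuntimeError("No OCR tool found; is Tesseract installed?")
    tool = tools[0]  # first available backend

    def recognize(path: Path) -> str:
        with Image.open(path) as img:
            return tool.image_to_string(
                img, lang=lang, builder=pyocr.builders.TextBuilder()
            )

    return recognize
```

Usage would be something like `ocr_pages(["page_1.png", "page_2.png"], pyocr_recognizer())`. Passing the recognizer in as a function keeps the backend swappable, so tesserocr could be tried the same way.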

Additionally, it would be great if there were a way to extract entities from a PDF without needing to run preprocess.sh to convert each page to an image and run tesseract on it.

Ghostscript - https://stackoverflow.com/a/36113000/1956065
