Use SavedModel instead of HDF5 format, fix dewarping #89

Merged
merged 17 commits into from
Feb 22, 2022

bertsky commented Feb 19, 2022

On Python 3.8, loading the existing HDF5 models for the TensorFlow processors tiseg and layout-analysis fails with errors.

However, TensorFlow offers a more stable alternative: SavedModel directories. I have converted the existing models and adapted the code to make them runnable again.
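
Since a model path may now be either a legacy HDF5 file or a SavedModel directory, code that accepts both has to tell them apart. The following is only a minimal sketch of such a check and not the code from this PR: the function name `detect_model_format` is hypothetical, and it assumes that a SavedModel directory contains `saved_model.pb` (or `saved_model.pbtxt`) and that the HDF5 signature sits at offset 0 of the file.

```python
import os
import tempfile

# First 8 bytes of an HDF5 file whose superblock sits at offset 0
# (assumed here to be the case for Keras .h5 model files).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def detect_model_format(path):
    """Return 'savedmodel', 'hdf5' or 'unknown' for a model path (hypothetical helper)."""
    if os.path.isdir(path):
        # A SavedModel directory carries its graph in saved_model.pb
        # (or saved_model.pbtxt for the text variant).
        for name in ("saved_model.pb", "saved_model.pbtxt"):
            if os.path.isfile(os.path.join(path, name)):
                return "savedmodel"
        return "unknown"
    if os.path.isfile(path):
        with open(path, "rb") as handle:
            if handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return "hdf5"
    return "unknown"


if __name__ == "__main__":
    # Build a fake SavedModel directory just to exercise the check.
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "saved_model.pb"), "wb").close()
        print(detect_model_format(tmp))
```

Checking the file content rather than the extension keeps the sketch independent of how the downloaded resources happen to be named.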

Now, how do we redistribute these? I have uploaded them as tarballs here and here. But really they should go to https://ocr-d-repo.scc.kit.edu/models/dfki as well.

As soon as we get OCR-D/core#800 done, we should then be able to update the resource list in ocrd-tool.json, right?

Another dependency is in the processors using ocrolib.morph, i.e. nlbin and textline: OCR-D/ocropy#2.

@kba, as soon as you have merged and published ocrd-fork-ocropy==1.4.0a4, this is ready to go.

- move model loading into `setup` in constructor context
- allow directories as models (TF SavedModel format), too
- use correct pageId
- simplify and polish
- use custom dataset class for in-memory PIL.Image passing
  instead of file-based repurposed `AlignedDataset`
  (this is faster, and reliable: OCR-D does not guarantee us
  a `.filename` for derived images; also, does not create
  temporary files in the input fileGrp anymore)
- after decoding, convert tensor to array with due respect for
  proper channel and dynamic range coding (instead of ad-hoc
  conversion); then resize while still in RGB and re-binarize
  (instead of ad-hoc binarization followed by resizing in binary)
- rebase on pix2pixHD#293 (CPU-only option, Torch>=1.0,
  less verbose, arg passing)
- pass args to pix2pixHD directly (instead of sys.args
  hijacking)
- no unnecessary verbosity (and only through loggers)
- move model loading into startup context via `setup` fn
- rename params:
  * `imgresize` → `resize_mode`,
  * `resizeHeight` → `resize_height`
  * `resizeWidth` → `resize_width`
- add proper documentation
- fix region-level results
(just BIN is not enough / not as good / not realistic)
@bertsky bertsky changed the title Use SavedModel instead of HDF5 format Use SavedModel instead of HDF5 format, fix dewarping Feb 20, 2022

bertsky commented Feb 20, 2022

Now also depends on NVIDIA/pix2pixHD#293, and contains various other fixes, mostly regarding dewarping.

Fixes #34, #35, #40, #60, #61, #72, #73, #77, #87, #88, and probably #42 (see below – with resize_mode=none).

With better upsampling/re-binarization, the quality of the dewarper has also improved a little. It is obviously not a good idea to downsample in the first place (which is the case with the default resize_mode=resize_and_crop). But one could always increase resize_width/resize_height, or use resize_mode=none to gain full size quality at the cost of higher memory and time demand.
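
To illustrate the ordering described in the commit notes (convert the decoded tensor with proper channel and range handling, resize while still in RGB, then re-binarize), here is a minimal sketch. It is not the PR's implementation: the function names, the fixed threshold of 128 and the bicubic filter are illustrative choices, and the generator output is assumed to be a channels-first float array in [-1, 1] (for a Torch tensor one would first call something like `tensor.cpu().numpy()`).

```python
import numpy as np
from PIL import Image


def tensor_to_rgb_array(chw):
    """Map a CHW float array in [-1, 1] (assumed range) to HWC uint8 in [0, 255]."""
    hwc = np.transpose(np.asarray(chw, dtype=np.float32), (1, 2, 0))
    scaled = (hwc + 1.0) / 2.0 * 255.0
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)


def upsample_then_binarize(rgb, size, threshold=128):
    """Resize in RGB first, then threshold to a 1-bit image.

    size is (width, height); threshold is a hypothetical fixed cut-off.
    """
    image = Image.fromarray(rgb)  # HxWx3 uint8 is interpreted as RGB
    resized = image.resize(size, Image.BICUBIC)
    gray = resized.convert("L")
    # Thresholding after interpolation keeps the smooth edges that
    # resizing an already binary image would lose.
    return gray.point(lambda v: 255 if v >= threshold else 0).convert("1")


if __name__ == "__main__":
    # Toy input: left half black (-1), right half white (+1).
    chw = np.full((3, 2, 4), -1.0)
    chw[:, :, 2:] = 1.0
    result = upsample_then_binarize(tensor_to_rgb_array(chw), (8, 4))
    print(result.mode, result.size)
```

Interpolating on the multi-valued image and thresholding last is what lets larger `resize_width`/`resize_height` values pay off; binarizing first would leave the resampler only two levels to work with.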

Here are some examples based on the dfki-testdata test case (after binarization and cropping):

dewarped with default settings:

| before | after |
| --- | --- |
| dfki-crop-test | dfki-dewarp-test-bin |

dewarped with default settings but on GPU:

| before | after |
| --- | --- |
| dfki-crop-test | dfki-dewarp-test-bin-gpu |

dewarped with larger size (less resampling/interpolation):

| before | after |
| --- | --- |
| dfki-crop-test | dfki-dewarp-test-bin-large |

dewarped with original/full image size:

| before | after |
| --- | --- |
| dfki-crop-test | dfki-dewarp-test-bin-full |

dewarped on cropped but raw RGB (just to show that the models have not been trained on such data):

| before | after |
| --- | --- |
| dfki-crop-test | dfki-dewarp-test-raw |

@kba kba merged commit 01aea45 into OCR-D:master Feb 22, 2022

bertsky commented Feb 22, 2022

> Now, how do we redistribute these? I have uploaded them as tarballs here and here. But really they should go to https://ocr-d-repo.scc.kit.edu/models/dfki as well.

Like I said, we still need to upload the new models, and update the resource URLs. (This is the reason the CI still fails.)


bertsky commented Feb 22, 2022

> Fixes #34, #35, #40, #60, #61, #72, #73, #77, #87, #88, and probably #42

BTW I forgot to link these (and my formulation is not covered by autolinking). Please close them.
