[prototype] compute orientation on segmentation map #1336

Merged: 13 commits, Nov 17, 2023

Conversation

@felixdittrich92 felixdittrich92 commented Oct 2, 2023

This PR:

  • update page orientation detection: the orientation is now computed on the segmentation mask, which is much more robust and will remain the more robust option until all detection models are trained with rotation-augmented samples
  • update builder: each page element now contains the processed page image, which makes it easier for users working with rotated documents to grab the orientation-corrected page
  • soft breaking change: the processed image can now be used directly for visualization, so straightened pages are also displayed correctly
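
To give a rough idea of what "orientation on the segmentation mask" can mean, here is a minimal stand-alone sketch. It is an illustration only, not the code from this PR: the helper name `estimate_mask_angle`, the `threshold` parameter and the moment-based method (principal axis of the foreground pixels) are all assumptions chosen to keep the example dependency-free.

```python
import math
from typing import Sequence


def estimate_mask_angle(mask: Sequence[Sequence[float]], threshold: float = 0.5) -> float:
    """Estimate the dominant angle (in degrees) of the foreground pixels of a 2D mask.

    Hypothetical helper: uses second-order central moments, not docTR's method.
    Positive angles mean the foreground runs upwards to the right when displayed.
    """
    # Collect foreground coordinates; negate the row index so that y points up
    points = [
        (float(col), -float(row))
        for row, line in enumerate(mask)
        for col, value in enumerate(line)
        if value >= threshold
    ]
    if len(points) < 2:
        return 0.0  # not enough evidence to estimate an angle

    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov_xx = sum((x - mean_x) ** 2 for x, _ in points) / n
    cov_yy = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n

    # Angle of the principal axis, in the range (-90, 90]
    return math.degrees(0.5 * math.atan2(2.0 * cov_xy, cov_xx - cov_yy))
```

A real segmentation map contains many text blobs, so a per-blob estimate followed by a median would probably be more stable than a single global moment; the sketch only shows the basic idea.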

Any feedback is super welcome 🤗

open todos:

  • check changed test again
  • profile memory usage if we store the raw image in the Page elements (everything looks fine)
  • improve builder tests

@felixdittrich92

@odulcy-mindee added for a first review (PR is not 100% final)

@felixdittrich92

felixdittrich92 commented Oct 4, 2023

@denivic #1283

@codecov

codecov bot commented Oct 4, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (6d92df5) 95.81% compared to head (727099d) 95.76%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1336      +/-   ##
==========================================
- Coverage   95.81%   95.76%   -0.05%     
==========================================
  Files         155      155              
  Lines        6948     6950       +2     
==========================================
- Hits         6657     6656       -1     
- Misses        291      294       +3     
Flag Coverage Δ
unittests 95.76% <100.00%> (-0.05%) ⬇️

@felixdittrich92 felixdittrich92 added this to the 0.7.1 milestone Oct 5, 2023
@felixdittrich92 felixdittrich92 self-assigned this Oct 5, 2023
@felixdittrich92 felixdittrich92 added labels: type: enhancement, module: models, ext: tests, type: breaking change, framework: pytorch, framework: tensorflow, topic: text detection Oct 5, 2023
@felixdittrich92 felixdittrich92 changed the title from "[prototype/draft] compute orientation on seq map" to "[prototype] compute orientation on segmentation map" Oct 5, 2023
seg_maps = [
    pred.permute(1, 2, 0).detach().cpu().numpy() for batch in predicted_batches for pred in batch["out_map"]
]
if return_maps:

@felixdittrich92 felixdittrich92 Oct 5, 2023

return preds, seg_maps if return_maps else preds raises an issue
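
The likely cause is operator precedence: in Python the conditional expression binds tighter than the comma, so the one-liner always returns a tuple. A minimal stand-alone illustration (the function names and values are placeholders, not the predictor code):

```python
def predict_buggy(return_maps: bool):
    preds, seg_maps = ["pred"], ["map"]
    # Parsed as: return (preds, (seg_maps if return_maps else preds))
    return preds, seg_maps if return_maps else preds


def predict_fixed(return_maps: bool):
    preds, seg_maps = ["pred"], ["map"]
    # Parenthesize the tuple so the conditional selects between both return shapes
    return (preds, seg_maps) if return_maps else preds
```

With `return_maps=False` the first variant returns `(preds, preds)` instead of `preds`, which is why an explicit `if return_maps:` branch (or the parenthesized form) is needed.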

loc_preds = self.det_predictor(pages, **kwargs)
pages = [rotate_image(page, -angle, expand=False) for page, angle in zip(pages, origin_page_orientations)]
# Forward again to get predictions on straight pages
loc_preds = self.det_predictor(pages, **kwargs) # type: ignore[assignment]

A second forward pass is safer at the moment.
Once we also have a solution for upside-down pages and the orientation derived from the seg_maps is correct in 99%+ of cases, we can change this later on and rotate the loc preds instead of running the forward pass again.
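
For reference, the alternative mentioned here (rotating the localization predictions instead of running the detector a second time) would amount to applying a 2D rotation to the box corners. A rough sketch under stated assumptions: absolute pixel coordinates, rotation around a given center, and a hypothetical helper name `rotate_points`. docTR's own geometry utilities and sign conventions may differ.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rotate_points(points: List[Point], angle_deg: float, center: Point) -> List[Point]:
    """Rotate (x, y) points around `center` by `angle_deg` (hypothetical helper).

    Standard 2D rotation matrix; in image coordinates (y pointing down)
    a positive angle appears clockwise on screen.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated = []
    for x, y in points:
        # Shift to the rotation center, rotate, then shift back
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t))
    return rotated
```

This only moves the coordinates. It cannot recover boxes the detector missed on the rotated page, which is one reason the second forward pass is the safer choice for now.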

@felixdittrich92

@charlesmindee @frgfm @odulcy-mindee
Some results:

TF:

orientation: 14.036239624023438, rnd: 14
orientation: -12.094757080078125, rnd: -12
orientation: -87.87889862060547, rnd: -88
orientation: -42.07927703857422, rnd: -42
orientation: 11.944175720214844, rnd: 12
orientation: -80.13418579101562, rnd: -80
orientation: 0, rnd: -2
orientation: 37.82828903198242, rnd: 38
orientation: 39.805572509765625, rnd: 40
orientation: 0, rnd: -2
orientation: 64.04123115539551, rnd: 64
orientation: 80.16643619537354, rnd: 80
orientation: 43.89118576049805, rnd: 44
orientation: -79.91940307617188, rnd: -80
orientation: -29.01908302307129, rnd: -29
orientation: 58.967790603637695, rnd: 59
orientation: -61.8629035949707, rnd: -62
orientation: 49.07487487792969, rnd: 49
orientation: -36.7241096496582, rnd: -37
orientation: -4.911787986755371, rnd: -5

PT:

orientation: -2.0901589393615723, rnd: -2
orientation: -45.0, rnd: -45
orientation: 0, rnd: 1
orientation: -38.1572265625, rnd: -38
orientation: 11.129188537597656, rnd: 11
orientation: -39.889583587646484, rnd: -40
orientation: -65.0372085571289, rnd: -66
orientation: 78.46537971496582, rnd: 78
orientation: 13.150947570800781, rnd: 13
orientation: -6.023993015289307, rnd: -6
orientation: -12.094757080078125, rnd: -12
orientation: -69.94390869140625, rnd: -70
orientation: 46.84761047363281, rnd: 47
orientation: -38.08047103881836, rnd: -38
orientation: 32.927860260009766, rnd: 33
orientation: 29.016590118408203, rnd: 29
orientation: 60.88385200500488, rnd: 61
orientation: 71.96571350097656, rnd: 72
orientation: -50.61758041381836, rnd: -51
orientation: 54.46232223510742, rnd: 54

@felixdittrich92 felixdittrich92 force-pushed the orientation-seq-prototype branch from 04bcc59 to 54dd7b5 Compare October 13, 2023 13:34
@felixdittrich92 felixdittrich92 marked this pull request as ready for review October 14, 2023 10:34
@felixdittrich92

@felixdittrich92 todo: Update doc strings

@felixdittrich92

Changed to round, see #1283 (reply in thread)

@felixdittrich92 felixdittrich92 force-pushed the orientation-seq-prototype branch from 4b30b29 to d5c23b3 Compare November 15, 2023 09:19
@felixdittrich92

Only an example (I have tested it with different kinds of documents)

rot_test

orientation angle: 21

after straighten:

Screenshot from 2023-11-16 08-54-23

@felixdittrich92 felixdittrich92 merged commit e645ead into mindee:main Nov 17, 2023
66 of 67 checks passed
@felixdittrich92 felixdittrich92 deleted the orientation-seq-prototype branch November 17, 2023 10:53
Labels
ext: tests Related to tests folder framework: pytorch Related to PyTorch backend framework: tensorflow Related to TensorFlow backend module: models Related to doctr.models topic: text detection Related to the task of text detection type: breaking change Introducing a breaking change type: enhancement Improvement
3 participants