
Fix tiseg #79

Merged (23 commits) on May 19, 2021

Conversation

@bertsky (Contributor) commented on Feb 1, 2021

This brings this processor up to bare-minimum functionality.

Results are not that good, but that's another story. (Legacy text/non-text segmentation does not detect enough non-text, so recall is bad while precision is good. Deep ML segmentation, however, fails miserably: it does not even preserve most of the foreground, and its precision is as bad as its recall.)

@bertsky (Contributor, Author) commented on Feb 1, 2021

> Results are not that good, but that's another story. (Legacy text/non-text segmentation does not detect enough non-text, so recall is bad while precision is good. Deep ML segmentation, however, fails miserably: it does not even preserve most of the foreground, and its precision is as bad as its recall.)

See #80 for a discussion about that.

- drop unused and dysfunctional code against overlaps
- drop wrong reading order algorithm
- improve mask post-processing (closing instead of dilation)
- make mask-polygon conversion optional
- add optional post-processing to reduce overlaps
  (bbox-only or mask-based):
  - non-maximum suppression across classes (min_iou_drop; see the sketch after this list)
  - non-maximum merging across classes (min_iou_merge)
  - within-other suppression across classes (min_share_drop)
  - within-other merging across classes (min_share_merge)
- implement correct reading order algorithm
  (bbox-only or mask-based):
  - partial order constraints under lr-tb assumption
  - topological sort
- annotate confidence along with coordinate results
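
To illustrate the cross-class suppression above, here is a minimal sketch of greedy bounding-box non-maximum suppression with a min_iou_drop threshold. It is a hypothetical illustration under assumed region/score data structures, not the processor's actual implementation:

def bbox_iou(a, b):
    """Intersection over union of two bounding boxes (x0, y0, x1, y1)."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x1 - x0) * max(0, y1 - y0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def suppress_overlaps(regions, min_iou_drop=0.9):
    """Greedy NMS across classes: keep the higher-scored of any two regions
    (regardless of class) whose boxes overlap by at least min_iou_drop IoU.
    Each region is a dict with 'bbox' (x0, y0, x1, y1) and 'score'."""
    kept = []
    for region in sorted(regions, key=lambda r: r['score'], reverse=True):
        if all(bbox_iou(region['bbox'], other['bbox']) < min_iou_drop
               for other in kept):
            kept.append(region)
    return kept

The merging variant (min_iou_merge) would presumably combine the two regions instead of dropping one, and the within-other variants (min_share_*) would presumably compare the intersection against the smaller region's area rather than the union; the mask-based mode would compute these measures on the pixel masks instead of the boxes.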
@bertsky (Contributor, Author) commented on Feb 4, 2021

Here's more, bringing some sanity to block segmentation. (Changes to the mrcnn part loosely derive from my Mask_RCNN fork.)

Again, this is not about model quality itself, but about making the best of what the model gives us. For concerns about training/model quality, cf. the last comments in #82.

@bertsky (Contributor, Author) commented on Feb 4, 2021

Regarding tiseg again:

> But does any consuming processor actually make use of the alpha channel? I highly doubt it.

Since the model was obviously trained on raw images, we have to apply it to raw images. But we can still take binarized images (from a binarization step in the workflow) and apply our resulting mask to them – by filling with white.

That seems like the better OCR-D interface to me. (Of course, contour-finding and annotation via coordinates would still be better than a clipped derived image.) What do you think, @kba?

I already have a commit lined up for this, i.e. merely comparing the text score with the image score (ignoring the background score), and then white-filling and alpha-masking the image parts in the clipped raw result. Should I add that?
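
A minimal sketch of that idea, assuming per-pixel class scores (background, text, image) from the model and a PIL binarized page image; the function name and array layout are assumptions, not the actual commit:

import numpy as np
from PIL import Image

def whiten_image_regions(bin_image, scores):
    """Compare text vs image score per pixel (ignoring background) and
    white-fill the pixels classified as image in the binarized page."""
    # scores: float array of shape (height, width, 3) = (background, text, image)
    image_mask = scores[..., 2] > scores[..., 1]
    page = np.array(bin_image.convert('L'))
    page[image_mask] = 255  # fill non-text (image) foreground with white
    # the same mask could be used to alpha-mask the clipped raw image
    return Image.fromarray(page)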

@bertsky (Contributor, Author) commented on Feb 4, 2021

BTW, we still have a problem with loggers here. In 5a4d874 I had to remove the initLogging from tensorflow_importer, as it would cause a large and unfriendly error message about calling initLogging twice. But now the TF logger does not adhere to OCR-D conventions, and what's worse, it spoils the -h and -J output again.

So how do we do this properly, @kba? Use our getLogger, but only import tensorflow_importer in process?

@kba (Member) commented on Feb 4, 2021

> I already have a commit lined up for this, i.e. merely comparing the text score with the image score (ignoring the background score), and then white-filling and alpha-masking the image parts in the clipped raw result. Should I add that?

IIUC (and that's a big if) then yes, please.

> BTW, we still have a problem with loggers here. In 5a4d874 I had to remove the initLogging from tensorflow_importer, as it would cause a large and unfriendly error message about calling initLogging twice. But now the TF logger does not adhere to OCR-D conventions, and what's worse, it spoils the -h and -J output again.

> So how do we do this properly, @kba? Use our getLogger, but only import tensorflow_importer in process?

The longer I have to deal with the "not initialized" or "initialized twice" logging errors, the less I am convinced that it was a good idea :( But in this case, dropping the initLogging() from tensorflow_importer and importing it only in process seems reasonable. It might cause runtime errors due to unavailable dependencies, but it also saves time for non-processing calls.
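
A sketch of that deferred-import approach, assuming the usual OCR-D Processor layout; the class name, logger name, and the tensorflow_importer import path are illustrative assumptions:

from ocrd import Processor
from ocrd_utils import getLogger

class OcrdAnybaseocrTiseg(Processor):

    def process(self):
        log = getLogger('processor.AnybaseocrTiseg')
        # Defer the TensorFlow import to processing time, so that
        # -h and -J (--dump-json) never trigger TF's logging setup.
        from ocrd_anybaseocr.tensorflow_importer import tf  # assumed path
        log.info('Using TensorFlow %s', tf.__version__)
        # ... actual text/image segmentation would follow here ...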

@bertsky (Contributor, Author) commented on Feb 4, 2021

> The longer I have to deal with the "not initialized" or "initialized twice" logging errors, the less I am convinced that it was a good idea :( But in this case, dropping the initLogging() from tensorflow_importer and importing it only in process seems reasonable. It might cause runtime errors due to unavailable dependencies, but it also saves time for non-processing calls.

I get the same feeling. Also, regarding processing vs. non-processing contexts and module-level imports: implementing process() almost always drags in the full dependencies, and splitting a (Processor) class across modules is impossible. We are even lucky that -J works at all for now: every Keras processor will invariably print("Using TensorFlow backend"). We cannot ever rule such things out. – No idea how to address this.

@kba (Member) commented on Feb 4, 2021

> Every Keras processor will invariably print("Using TensorFlow backend")

I have been actively defending against that with hacks like tensorflow_importer or

import os
import sys
import warnings

# silence TensorFlow's C++ logging before anything TF-related is imported
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
# temporarily swallow stderr so Keras' import-time chatter disappears
stderr = sys.stderr
sys.stderr = open(os.devnull, "w")
from keras import backend as K
from keras.models import load_model
sys.stderr = stderr
import tensorflow as tf
tf.get_logger().setLevel("ERROR")
warnings.filterwarnings("ignore")

It's quite a mouthful to stop those import-time print statements but I see no way around it for --dump-json etc.

@kba merged commit 0fde740 into OCR-D:master on May 19, 2021