From your three-step processing, I assume you ran the block segmentation on the binarized images? We have noticed ourselves that performance becomes considerably worse on binarized images.
We will take a look at your particular example.
However, since I don't remember seeing very similar samples in the training data, it is also possible that the model cannot generalize to this type of layout/data and that some additional fine-tuning might be necessary.
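As a rough illustration of the suggestion above, the sketch below builds the three processor calls so that the block segmentation reads from the cropped file group rather than the binarized one. The `-I`/`-O` options are the standard OCR-D input and output file group flags; the file group names (`OCR-D-IMG`, `OCR-D-CROP`, `OCR-D-BIN`, `OCR-D-BLOCK`) and the helper `build_workflow` are hypothetical, and whether the processor then really works on the non-binarized image is an assumption that should be checked against the workspace.

```python
import shlex


def build_workflow(input_grp="OCR-D-IMG", segment_on_binarized=False):
    """Return the processor calls as argv lists; nothing is executed here.

    All file group names are illustrative assumptions, not taken from the issue.
    """
    steps = [
        ["ocrd-anybaseocr-crop", "-I", input_grp, "-O", "OCR-D-CROP"],
        ["ocrd-anybaseocr-binarize", "-I", "OCR-D-CROP", "-O", "OCR-D-BIN"],
    ]
    # Choose which file group the block segmentation reads from.
    seg_input = "OCR-D-BIN" if segment_on_binarized else "OCR-D-CROP"
    steps.append(
        ["ocrd-anybaseocr-block-segmentation", "-I", seg_input, "-O", "OCR-D-BLOCK"]
    )
    return steps


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them in a workspace.
    for argv in build_workflow():
        print(shlex.join(argv))
```

Running both variants (`segment_on_binarized=True` and `False`) on a few pages would show whether binarization is what degrades the result here.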
I am running the following workflow on https://digital.slub-dresden.de/werkansicht/dlf/87237/1/ (with https://digital.slub-dresden.de/data/kitodo/adrefudio_20253082Z_1907/adrefudio_20253082Z_1907_mets.xml):
ocrd-anybaseocr-crop
ocrd-anybaseocr-binarize
ocrd-anybaseocr-block-segmentation
For most pages, the block segmentation finds only a few blocks, and very often none at all. The blocks that are found do not correspond to a comprehensible segmentation. Often the result is only the page number or some region that is not a block at all. Consider for example
The only “block” which is found by the block segmentation is:
I would be very grateful if you could give me some hints on how to improve this result. Maybe you could even try to process this book in your own environment to make sure that nothing is amiss with my setup.