The --help of ocrd-anybaseocr-tiseg states a default wiring of ['OCR-D-IMG-CROP'] -> ['OCR-D-SEG-TISEG'].
root@38fa7aad0b43:/data/ocrd_workspace# ocrd-anybaseocr-tiseg --help
Using TensorFlow backend.
Usage: ocrd-anybaseocr-tiseg [OPTIONS]

  separate text and non-text part with anyBaseOCR

Options:
  -V, --version                   Show version
  -l, --log-level [OFF|ERROR|WARN|INFO|DEBUG|TRACE]
                                  Log level
  -J, --dump-json                 Dump tool description as JSON and exit
  -p, --parameter TEXT            Parameters, either JSON string or path
                                  JSON file
  -g, --page-id TEXT              ID(s) of the pages to process
  -O, --output-file-grp TEXT      File group(s) used as output.
  -I, --input-file-grp TEXT       File group(s) used as input.
  -w, --working-dir TEXT          Working Directory
  -m, --mets TEXT                 METS to process
  -h, --help                      This help message

Parameters:
  "operation_level" [string - page] PAGE XML hierarchy level to operate
      on Possible values: ["page", "region", "line"]

Default Wiring:
  ['OCR-D-IMG-CROP'] -> ['OCR-D-SEG-TISEG']
The workspace contains a file group named OCR-D-IMG-CROP, and a corresponding folder exists.
I would expect that running ocrd-anybaseocr-tiseg without explicit file group arguments would default to using OCR-D-IMG-CROP as input and OCR-D-SEG-TISEG as output. However, the program fails with the following error, because it's using the non-existing INPUT as input and OUTPUT as output file group.
root@38fa7aad0b43:/data/ocrd_workspace# ocrd-anybaseocr-tiseg -m mets.xml
Using TensorFlow backend.
09:22:34.382 INFO ocrd.workspace_validator - input_file_grp=['INPUT'] output_file_grp=['OUTPUT']
Traceback (most recent call last):
  File "/usr/bin/ocrd-anybaseocr-tiseg", line 8, in <module>
    sys.exit(ocrd_anybaseocr_tiseg())
  File "/usr/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/ocrd_anybaseocr/cli/cli.py", line 37, in ocrd_anybaseocr_tiseg
    return ocrd_cli_wrap_processor(OcrdAnybaseocrTiseg, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/ocrd/decorators.py", line 53, in ocrd_cli_wrap_processor
    raise Exception("Invalid input/output file grps:\n\t%s" % '\n\t'.join(report.errors))
Exception: Invalid input/output file grps:
        Input fileGrp[@USE='INPUT'] not in METS!
You are right, this should work as you expect. (At least as long as we keep describing it as default wiring.) But this has not been implemented yet in ocrd (the base package), cf. OCR-D/core#274.
You have to call with explicit input and output file groups for now.
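For now that means passing the groups with the -I and -O options listed in the help output, e.g. ocrd-anybaseocr-tiseg -m mets.xml -I OCR-D-IMG-CROP -O OCR-D-SEG-TISEG. To illustrate the fallback being asked for, here is a rough sketch in plain Python. It is not the ocrd implementation: resolve_file_grps and TOOL are hypothetical names, the dictionary only mirrors the default wiring printed by --help, and treating the -I/-O values as comma-separated lists is an assumption.

```python
# Hypothetical sketch, NOT the actual ocrd code: fall back to the tool's
# default wiring when no file groups are given on the command line.

# Assumed shape of the tool description; the values are copied from the
# "Default Wiring" section of the --help output.
TOOL = {
    "input_file_grp": ["OCR-D-IMG-CROP"],
    "output_file_grp": ["OCR-D-SEG-TISEG"],
}


def resolve_file_grps(tool, input_file_grp=None, output_file_grp=None):
    """Return (input, output) file group lists, preferring explicit values.

    Explicit values are strings as they would be passed to -I / -O
    (assumed to be comma-separated); a missing or empty value falls back
    to the tool's default wiring.
    """
    def pick(explicit, default):
        if explicit:
            return [grp.strip() for grp in explicit.split(",")]
        return list(default)

    return (
        pick(input_file_grp, tool["input_file_grp"]),
        pick(output_file_grp, tool["output_file_grp"]),
    )


# No -I / -O given: the default wiring is used.
print(resolve_file_grps(TOOL))
# Explicit groups always win over the defaults.
print(resolve_file_grps(TOOL, "OCR-D-IMG", "OCR-D-SEG"))
```

The point of the sketch is only that the placeholder names INPUT and OUTPUT would never reach workspace validation; whether the real fix belongs in the base package or in each processor is the open question tracked in OCR-D/core#274.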
From what I can tell, this is due to class OcrdAnybaseocrTiseg(Processor) not overriding input_file_grp and output_file_grp in __init__, along the lines of: