Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend the network client #1269

Merged
merged 35 commits into from
Aug 23, 2024
Merged
Changes from 1 commit
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
2927a28
add integration test for client
MehmedGIT Aug 6, 2024
8e7cd3e
fix the test dir path in docker
MehmedGIT Aug 6, 2024
bd16dd7
update network client
MehmedGIT Aug 6, 2024
b2c0675
integration test for client
MehmedGIT Aug 6, 2024
db6e566
Fix flag typo
MehmedGIT Aug 6, 2024
bec81ba
try docker host ip
MehmedGIT Aug 6, 2024
4815896
remove the client server
MehmedGIT Aug 9, 2024
cb3460f
refactor status checks
MehmedGIT Aug 9, 2024
920c1a9
fix test
MehmedGIT Aug 9, 2024
2a843a8
fix: client processing request
MehmedGIT Aug 9, 2024
3a238a7
add: client workflow run
MehmedGIT Aug 9, 2024
50794f9
add timeout and wait to configs
MehmedGIT Aug 9, 2024
cc06fc3
Update src/ocrd_network/client_utils.py
MehmedGIT Aug 12, 2024
4115937
refine status check methods
MehmedGIT Aug 12, 2024
0136db0
add help for new env
MehmedGIT Aug 12, 2024
734bbf0
add cli job status check
MehmedGIT Aug 13, 2024
f86bc23
add: help section to the cli
MehmedGIT Aug 13, 2024
4194f9f
fix: required job id
MehmedGIT Aug 13, 2024
97b3eea
add docstring to cli commands
MehmedGIT Aug 13, 2024
8e7ba26
Fix: rename to block
MehmedGIT Aug 13, 2024
69808b6
Fix: server_utils.py > 404 to 400
MehmedGIT Aug 13, 2024
4de1e83
fix: set ps address if None in constructor
MehmedGIT Aug 13, 2024
d1af85b
fix: check report validation outside try block
MehmedGIT Aug 13, 2024
50f73c5
fix: the annoying string dict
MehmedGIT Aug 13, 2024
8f2861c
add: parameter_override
MehmedGIT Aug 13, 2024
06a371c
add sort to network agents
MehmedGIT Aug 13, 2024
4d85970
add: discovery cli, processors and processor
MehmedGIT Aug 13, 2024
bb3007d
add: check processing job log file
MehmedGIT Aug 13, 2024
ff4243f
fix: exception handling
MehmedGIT Aug 14, 2024
5f746c1
ocrd network client: parse parameters and overrides
kba Aug 14, 2024
8fc8bff
fix parameter parsing again
kba Aug 14, 2024
d73cfaa
Merge pull request #1270 from OCR-D/fix-parsing
MehmedGIT Aug 20, 2024
15cea57
:memo: changelog
kba Aug 22, 2024
18d743a
Merge branch 'master' into extend-network-client
kba Aug 22, 2024
6608539
refactor client cli: process -> run
MehmedGIT Aug 23, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 6 additions & 4 deletions src/ocrd_network/server_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -193,7 +193,9 @@ def parse_workflow_tasks(logger: Logger, workflow_content: str) -> List[Processo


def raise_http_exception(logger: Logger, status_code: int, message: str, error: Exception = None) -> None:
logger.exception(f"{message} {error}")
if error:
message = f"{message} {error}"
logger.exception(f"{message}")
raise HTTPException(status_code=status_code, detail=message)


Expand All @@ -210,12 +212,12 @@ def validate_job_input(logger: Logger, processor_name: str, ocrd_tool: dict, job
raise_http_exception(logger, status.HTTP_404_NOT_FOUND, message)
try:
report = ParameterValidator(ocrd_tool).validate(dict(job_input.parameters))
if not report.is_valid:
message = f"Failed to validate processing job input against the tool json of processor: {processor_name}\n"
raise_http_exception(logger, status.HTTP_400_BAD_REQUEST, message + report.errors)
except Exception as error:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would this exception be raised? Validators should not raise errors but return a report. If you don't have a specific use case, it might be better to just not do try/except so that potential errors are actually raised and fixed.

Copy link
Contributor Author

@MehmedGIT MehmedGIT Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to be extra secure against unexpected errors. The workers or the server (network agents as a service) should ideally never crash due to some error. Especially with an HTTP 500 error on the client side and leaving the network agent in an unpredictable state. The exceptions from problematic requests are still logged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, but errors in

report = ParameterValidator(ocrd_tool).validate(dict(job_input.parameters))

would mean that something is broken in our implementation or in jsonschema. So this would likely not be a fluke that happens once, we would need to stop processing and fix the bug at the source. The only variable factor is the job_input.parameters dict and if user-provided input breaks the validator - which it must absolutely never do - we should fix that in the validator.

Copy link
Contributor Author

@MehmedGIT MehmedGIT Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only variable factor is the job_input.parameters dict and if user-provided input breaks the validator - which it must absolutely never do - we should fix that in the validator.

A failing validator would also probably fail most of the processing/workflow jobs. Input from the user breaking down the entire infrastructure is conceptually wrong to me. Even if all the processing jobs fail, the Processing Server should still respond to other requests such as checking logs of old jobs. We would rather need a mechanism to prevent further submission of processing jobs in case X amount of jobs fail in a row - potentially preventing further requests only to the processing endpoint till it is fixed.

There is also currently no graceful shutdown for the processing server. I.e., once the server dies, anything inside the internal processing cache of the server (not the RabbitMQ) will be lost.

message = f"Failed to validate processing job input against the ocrd tool json of processor: {processor_name}"
raise_http_exception(logger, status.HTTP_400_BAD_REQUEST, message, error)
if report and not report.is_valid:
message = f"Failed to validate processing job input against the tool json of processor: {processor_name}\n"
raise_http_exception(logger, status.HTTP_400_BAD_REQUEST, f"{message}{report.errors}")


def validate_workflow(logger: Logger, workflow: str) -> None:
Expand Down