Test for ocrd-network #1184

Merged: 27 commits, May 3, 2024

Commits
eb0da74
Add a test for workflow run in ocrd_all
joschrew Feb 7, 2024
79c5b79
remove duplicates
MehmedGIT Feb 12, 2024
3d501b5
Make make assets in Dockerfile skipable
joschrew Feb 12, 2024
7f77b57
Add a test for workflow run in ocrd_all
joschrew Mar 13, 2024
9cdb222
remove duplicates
MehmedGIT Feb 12, 2024
419a535
Make make assets in Dockerfile skipable
joschrew Mar 13, 2024
a55d961
Merge branch 'test-workflow' of github.com:OCR-D/core into test-workflow
MehmedGIT Apr 11, 2024
cb8cde7
merge master
MehmedGIT Apr 11, 2024
dfd78d5
make ocrd all tests callable from Makefile
MehmedGIT Apr 11, 2024
14576cf
update actions and add python 3.12
Apr 10, 2024
34459e0
update actions and add python 3.12
Apr 10, 2024
7d119aa
update actions
Apr 10, 2024
2a7ef7b
Remove ocrd_all-tests from core makefile
joschrew Apr 16, 2024
3effd63
ci: disable scrutinizer build
kba Apr 16, 2024
8dae53d
bashlib input-files: apply download_file on each input_file
bertsky Apr 25, 2024
0195099
bashlib input-files: let None pass through
bertsky Apr 25, 2024
feee374
scrutinizer: try to fix py version
bertsky Apr 25, 2024
48d52e3
:memo: changelog
kba May 3, 2024
71ec3a2
Merge branch 'master' into update/workflows
kba May 3, 2024
c8f41a5
drop distutils, support python 3.12
kba May 3, 2024
b788b59
:memo: changelog
kba May 3, 2024
df77ace
Merge branch 'master' into update/workflows
kba May 3, 2024
cf4664a
disable ocrd all test in core
MehmedGIT May 3, 2024
e88d646
:memo: changelog
kba May 3, 2024
6ecbaa8
make network-integration-test: disable ocrd_all test
kba May 3, 2024
f714742
Merge branch 'master' into test-workflow
kba May 3, 2024
1bd8fc4
ci: fix integration test
kba May 3, 2024
8 changes: 4 additions & 4 deletions .github/workflows/docker-image.yml
@@ -19,10 +19,10 @@ jobs:
contents: read

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- # Activate cache export feature to reduce build time of images
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
uses: docker/setup-buildx-action@v3
- name: Build the Docker image
# default tag uses docker.io, so override on command-line
run: make docker DOCKER_TAG=${{ env.GHCRIO_DOCKER_TAG }}
@@ -34,13 +34,13 @@ jobs:
docker run --rm ${{ env.GHCRIO_DOCKER_TAG }} ocrd --version
docker run --rm ${{ env.GHCRIO_DOCKER_TAG }}-cuda ocrd --version
- name: Login to GitHub Container Registry
uses: docker/login-action@v2
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Log in to Docker Hub
uses: docker/login-action@f4ef78c080cd8ba55a85445d5b36e214a81df20a
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERIO_USERNAME }}
password: ${{ secrets.DOCKERIO_PASSWORD }}
5 changes: 3 additions & 2 deletions .github/workflows/network-testing.yml
@@ -20,17 +20,18 @@ jobs:
- '3.9'
- '3.10'
- '3.11'
- '3.12'
os:
- ubuntu-22.04
# - macos-latest

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Homebrew
id: set-up-homebrew
uses: Homebrew/actions/setup-homebrew@master
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v3
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
7 changes: 4 additions & 3 deletions .github/workflows/unit-test.yml
@@ -22,18 +22,19 @@ jobs:
- '3.9'
- '3.10'
- '3.11'
- '3.12'
os:
- ubuntu-22.04
- ubuntu-20.04
# - macos-latest
- macos-latest

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Homebrew
id: set-up-homebrew
uses: Homebrew/actions/setup-homebrew@master
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v3
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
4 changes: 4 additions & 0 deletions .scrutinizer.yml
@@ -3,6 +3,10 @@ checks:

build:
image: default-bionic
environment:
python:
version: 3.8.2
virtualenv: true
nodes:
analysis:
dependencies:
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,16 @@ Versioned according to [Semantic Versioning](http://semver.org/).

## Unreleased

Fixed:

- bashlib processors will download on-demand, like pythonic processors do, #1216, #1217

Changed:

- Replace `distutils` with equivalents from `shutil` for compatibility with Python 3.12+, #1219
- CI: Updated GitHub actions, #1206
- CI: Fixed scrutinizer, #1217

## [2.64.1] - 2024-04-22

Fixed:
4 changes: 3 additions & 1 deletion Dockerfile
@@ -44,9 +44,11 @@ WORKDIR /data
CMD ["/usr/local/bin/ocrd", "--help"]

FROM ocrd_core_base as ocrd_core_test
# Optionally skip make assets with this arg
ARG SKIP_ASSETS
WORKDIR /build-ocrd
COPY Makefile .
RUN make assets
RUN if test -z "$SKIP_ASSETS" || test $SKIP_ASSETS -eq 0 ; then make assets ; fi
Member

I would expect SKIP_ASSETS=0 to disable the behavior. Did you mean test $SKIP_ASSETS -eq 1?

Contributor

I think SKIP_ASSETS should be renamed to MAKE_ASSETS to reverse the logic.

SKIP_ASSETS itself has a negative meaning, so a false value triggering a positive action (building the assets) reads like a double negative.

SKIP_ASSETS=1 (true) -> do not make assets
SKIP_ASSETS=0 (false) -> make assets

Contributor Author

For me, 0 is false, so if SKIP_ASSETS is 0 the assets are not skipped, i.e. `make assets` runs.

COPY tests ./tests
COPY .gitmodules .
COPY requirements_test.txt .
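The SKIP_ASSETS semantics debated above can be checked directly in the shell. This sketch mirrors the Dockerfile's `test -z "$SKIP_ASSETS" || test $SKIP_ASSETS -eq 0` condition; the `skip_check` helper is hypothetical, for illustration only:

```shell
# Mirrors the Dockerfile condition: build assets when SKIP_ASSETS is
# unset/empty or 0; skip for any other value.
skip_check() {
    SKIP_ASSETS="$1"
    if test -z "$SKIP_ASSETS" || test "$SKIP_ASSETS" -eq 0; then
        echo "make assets"
    else
        echo "skip assets"
    fi
}

skip_check ""    # unset/empty -> assets are built
skip_check 0     # 0 -> assets are also built (the behavior the review debates)
skip_check 1     # 1 -> assets are skipped
```

To skip at build time one would pass something like `docker build --build-arg SKIP_ASSETS=1 …`.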
10 changes: 5 additions & 5 deletions Makefile
@@ -140,7 +140,7 @@ install: #build
$(PIP) config set global.no-binary shapely

# Install with pip install -e
install-dev: PIP_INSTALL = $(PIP) install -e
install-dev: PIP_INSTALL = $(PIP) install -e
install-dev: PIP_INSTALL_CONFIG_OPTION = --config-settings editable_mode=strict
install-dev: uninstall
$(MAKE) install
@@ -240,12 +240,12 @@ network-module-test: assets
INTEGRATION_TEST_IN_DOCKER = docker exec core_test
network-integration-test:
$(DOCKER_COMPOSE) --file tests/network/docker-compose.yml up -d
-$(INTEGRATION_TEST_IN_DOCKER) pytest -k 'test_integration_' -v
-$(INTEGRATION_TEST_IN_DOCKER) pytest -k 'test_integration_' -v --ignore-glob="$(TESTDIR)/network/*ocrd_all*.py"
$(DOCKER_COMPOSE) --file tests/network/docker-compose.yml down --remove-orphans

network-integration-test-cicd:
$(DOCKER_COMPOSE) --file tests/network/docker-compose.yml up -d
$(INTEGRATION_TEST_IN_DOCKER) pytest -k 'test_integration_' -v
$(INTEGRATION_TEST_IN_DOCKER) pytest -k 'test_integration_' -v --ignore-glob="tests/network/*ocrd_all*.py"
$(DOCKER_COMPOSE) --file tests/network/docker-compose.yml down --remove-orphans

benchmark:
@@ -315,7 +315,7 @@ pyclean:
.PHONY: docker docker-cuda

# Additional arguments to docker build. Default: '$(DOCKER_ARGS)'
DOCKER_ARGS =
DOCKER_ARGS =

# Build docker image
docker: DOCKER_BASE_IMAGE = ubuntu:20.04
@@ -328,7 +328,7 @@ docker-cuda: DOCKER_FILE = Dockerfile.cuda

docker-cuda: docker

docker docker-cuda:
docker docker-cuda:
docker build --progress=plain -f $(DOCKER_FILE) -t $(DOCKER_TAG) --target ocrd_core_base --build-arg BASE_IMAGE=$(DOCKER_BASE_IMAGE) $(DOCKER_ARGS) .

# Build wheels and source dist and twine upload them
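The only difference between the two integration-test targets in the Makefile hunk above, besides the `--ignore-glob`, is the leading `-` on the local target's pytest line: in make, a `-` prefix means a failing command does not abort the recipe, so the compose `down` cleanup still runs, whereas the CI/CD target lets a failure propagate. A minimal shell sketch of that ignore-the-error pattern (`run_tests` is a stand-in, not the real test step):

```shell
# Stand-in for the pytest step; always fails here for demonstration.
run_tests() { return 1; }

# make's '-' prefix is roughly '|| …' in shell: record the failure
# but keep going so cleanup can still run afterwards.
rc=0
run_tests || rc=$?
echo "cleanup (docker compose down) still runs; test exit code was $rc"
```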
4 changes: 4 additions & 0 deletions src/ocrd/cli/bashlib.py
@@ -114,6 +114,10 @@ def bashlib_input_files(**kwargs):
input_file_grp=kwargs['input_file_grp'],
output_file_grp=kwargs['output_file_grp'])
for input_files in processor.zip_input_files(mimetype=None, on_error='abort'):
# ensure all input files exist locally (without persisting them in the METS)
# - this mimics the default behaviour of all Pythonic processors
input_files = [workspace.download_file(input_file) if input_file else None
for input_file in input_files]
for field in ['url', 'local_filename', 'ID', 'mimetype', 'pageId']:
# make this bash-friendly (show initialization for associative array)
if len(input_files) > 1:
2 changes: 1 addition & 1 deletion src/ocrd/cli/resmgr.py
@@ -7,7 +7,7 @@
"""
import sys
from pathlib import Path
from distutils.spawn import find_executable as which
from shutil import which
from yaml import safe_load, safe_dump

import requests
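The `find_executable` → `shutil.which` swap seen here (and repeated in `task_sequence.py`, `ocrd_exif.py`, and `ocrd_utils/os.py` below) is a drop-in replacement for the common case: both return the absolute path of an executable found on PATH, or None. A quick sketch:

```python
from shutil import which

# shutil.which is the stdlib replacement for the removed
# distutils.spawn.find_executable: path string if found, else None.
print(which("sh"))                      # e.g. /bin/sh on POSIX systems
print(which("no-such-binary-123456"))   # None
```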
3 changes: 1 addition & 2 deletions src/ocrd/task_sequence.py
@@ -1,7 +1,6 @@
import json
from shlex import split as shlex_split
from distutils.spawn import find_executable as which # pylint: disable=import-error,no-name-in-module
from subprocess import run, PIPE
from shutil import which

from ocrd_utils import getLogger, parse_json_string_or_file, set_json_key_value_overrides, get_ocrd_tool_json
# from collections import Counter
5 changes: 2 additions & 3 deletions src/ocrd/workspace_bagger.py
@@ -2,13 +2,12 @@
from os import makedirs, chdir, walk
from os.path import join, isdir, basename as os_path_basename, exists, relpath
from pathlib import Path
from shutil import make_archive, rmtree, copyfile, move
from shutil import make_archive, rmtree, copyfile, move, copytree
from tempfile import mkdtemp, TemporaryDirectory
import re
import tempfile
import sys
from bagit import Bag, make_manifests, _load_tag_file, _make_tag_file, _make_tagmanifest_file # pylint: disable=no-name-in-module
from distutils.dir_util import copy_tree

from ocrd_utils import (
pushd_popd,
@@ -298,7 +297,7 @@ def recreate_checksums(self, src, dest=None, overwrite=False):
raise FileNotFoundError(f"data directory of bag not found at {src}")
if not overwrite:
path_to_bag.mkdir(parents=True, exist_ok=True)
copy_tree(src, dest)
copytree(src, dest, dirs_exist_ok=True)

with pushd_popd(path_to_bag):
n_bytes, n_files = make_manifests("data", 1, ["sha512"])
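`distutils.dir_util.copy_tree` tolerated an existing destination directory; plain `shutil.copytree` does not, which is why the replacement in `recreate_checksums` above needs `dirs_exist_ok=True` (available since Python 3.8). A self-contained sketch of the difference:

```python
from pathlib import Path
from shutil import copytree
from tempfile import TemporaryDirectory

with TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    dest = Path(tmp) / "dest"
    (src / "data").mkdir(parents=True)
    (src / "data" / "mets.xml").write_text("<mets/>")
    dest.mkdir()  # destination already exists, as in recreate_checksums

    # copy_tree(src, dest) would have accepted this; copytree raises
    # FileExistsError unless dirs_exist_ok=True is passed.
    copytree(src, dest, dirs_exist_ok=True)
    copied_ok = (dest / "data" / "mets.xml").read_text() == "<mets/>"
```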
2 changes: 1 addition & 1 deletion src/ocrd_models/ocrd_exif.py
@@ -5,7 +5,7 @@
from math import sqrt
from io import BytesIO
from subprocess import run, PIPE
from distutils.spawn import find_executable as which
from shutil import which
from ocrd_utils import getLogger

class OcrdExif():
4 changes: 2 additions & 2 deletions src/ocrd_utils/os.py
@@ -18,12 +18,12 @@
from tempfile import TemporaryDirectory, gettempdir
from functools import lru_cache
from contextlib import contextmanager, redirect_stderr, redirect_stdout
from distutils.spawn import find_executable as which
from shutil import which
from json import loads
from json.decoder import JSONDecodeError
from os import getcwd, chdir, stat, chmod, umask, environ
from pathlib import Path
from os.path import exists, abspath as abspath_, join, isdir
from os.path import abspath as abspath_, join
from zipfile import ZipFile
from subprocess import run, PIPE
from mimetypes import guess_type as mimetypes_guess
2 changes: 1 addition & 1 deletion tests/data/wf_testcase.py
@@ -61,7 +61,7 @@ def setUp(self):
p.chmod(0o777)

os.environ['PATH'] = os.pathsep.join([self.tempdir, os.environ['PATH']])
# from distutils.spawn import find_executable as which # pylint: disable=import-error,no-name-in-module
# from shutil import which # pylint: disable=import-error,no-name-in-module
# self.assertTrue(which('ocrd-sample-processor'))


45 changes: 4 additions & 41 deletions tests/network/test_integration_5_processing_server.py
@@ -1,27 +1,14 @@
from pathlib import Path
from requests import get as request_get, post as request_post
from time import sleep
from requests import get as request_get
from src.ocrd_network.constants import AgentType, JobState
from src.ocrd_network.logging_utils import get_processing_job_logging_file_path
from tests.base import assets
from tests.network.config import test_config
from tests.network.utils import poll_till_timeout_fail_or_success, post_ps_processing_request, post_ps_workflow_request

PROCESSING_SERVER_URL = test_config.PROCESSING_SERVER_URL


def poll_till_timeout_fail_or_success(test_url: str, tries: int, wait: int) -> JobState:
job_state = JobState.unset
while tries > 0:
sleep(wait)
response = request_get(url=test_url)
assert response.status_code == 200, f"Processing server: {test_url}, {response.status_code}"
job_state = response.json()["state"]
if job_state == JobState.success or job_state == JobState.failed:
break
tries -= 1
return job_state


def test_processing_server_connectivity():
test_url = f"{PROCESSING_SERVER_URL}/"
response = request_get(test_url)
@@ -53,18 +40,7 @@ def test_processing_server_processing_request():
"parameters": {}
}
test_processor = "ocrd-dummy"
test_url = f"{PROCESSING_SERVER_URL}/processor/run/{test_processor}"
response = request_post(
url=test_url,
headers={"accept": "application/json"},
json=test_processing_job_input
)
print(response.json())
print(response.__dict__)
assert response.status_code == 200, f"Processing server: {test_url}, {response.status_code}"
processing_job_id = response.json()["job_id"]
assert processing_job_id

processing_job_id = post_ps_processing_request(PROCESSING_SERVER_URL, test_processor, test_processing_job_input)
job_state = poll_till_timeout_fail_or_success(
test_url=f"{PROCESSING_SERVER_URL}/processor/job/{processing_job_id}", tries=10, wait=10
)
@@ -81,20 +57,7 @@ def test_processing_server_workflow_request():
path_to_dummy_wf = "/ocrd-data/assets/dummy-workflow.txt"
workspace_root = "kant_aufklaerung_1784/data"
path_to_mets = assets.path_to(f"{workspace_root}/mets.xml")

# submit the workflow job
test_url = f"{PROCESSING_SERVER_URL}/workflow/run?mets_path={path_to_mets}&page_wise=True"
response = request_post(
url=test_url,
headers={"accept": "application/json"},
files={"workflow": open(path_to_dummy_wf, 'rb')}
)
# print(response.json())
# print(response.__dict__)
assert response.status_code == 200, f"Processing server: {test_url}, {response.status_code}"
wf_job_id = response.json()["job_id"]
assert wf_job_id

wf_job_id = post_ps_workflow_request(PROCESSING_SERVER_URL, path_to_dummy_wf, path_to_mets)
job_state = poll_till_timeout_fail_or_success(
test_url=f"{PROCESSING_SERVER_URL}/workflow/job-simple/{wf_job_id}", tries=30, wait=10
)
19 changes: 19 additions & 0 deletions tests/network/test_integration_ocrd_all.py
@@ -0,0 +1,19 @@
from src.ocrd_network.constants import JobState
from tests.network.config import test_config
from tests.network.utils import poll_till_timeout_fail_or_success, post_ps_workflow_request

PROCESSING_SERVER_URL = test_config.PROCESSING_SERVER_URL


def test_ocrd_all_workflow():
# This test is supposed to run with ocrd_all not with just core on its own
# Note: the used workflow path is volume mapped
path_to_wf = "/ocrd-data/assets/ocrd_all-test-workflow.txt"
path_to_mets = "/data/mets.xml"
wf_job_id = post_ps_workflow_request(PROCESSING_SERVER_URL, path_to_wf, path_to_mets)
job_state = poll_till_timeout_fail_or_success(
test_url=f"{PROCESSING_SERVER_URL}/workflow/job-simple/{wf_job_id}",
tries=30,
wait=10
)
assert job_state == JobState.success
47 changes: 47 additions & 0 deletions tests/network/utils.py
@@ -0,0 +1,47 @@
from requests import get as request_get, post as request_post
from time import sleep
from src.ocrd_network.constants import JobState


def poll_till_timeout_fail_or_success(test_url: str, tries: int, wait: int) -> JobState:
job_state = JobState.unset
while tries > 0:
sleep(wait)
response = request_get(url=test_url)
assert response.status_code == 200, f"Processing server: {test_url}, {response.status_code}"
job_state = response.json()["state"]
if job_state == JobState.success or job_state == JobState.failed:
break
tries -= 1
return job_state


def post_ps_processing_request(ps_server_host: str, test_processor: str, test_job_input: dict) -> str:
test_url = f"{ps_server_host}/processor/run/{test_processor}"
response = request_post(
url=test_url,
headers={"accept": "application/json"},
json=test_job_input
)
# print(response.json())
# print(response.__dict__)
assert response.status_code == 200, f"Processing server: {test_url}, {response.status_code}"
processing_job_id = response.json()["job_id"]
assert processing_job_id
return processing_job_id


# TODO: Can be extended to include other parameters such as page_wise
def post_ps_workflow_request(ps_server_host: str, path_to_test_wf: str, path_to_test_mets: str) -> str:
test_url = f"{ps_server_host}/workflow/run?mets_path={path_to_test_mets}&page_wise=True"
response = request_post(
url=test_url,
headers={"accept": "application/json"},
files={"workflow": open(path_to_test_wf, "rb")}
)
# print(response.json())
# print(response.__dict__)
assert response.status_code == 200, f"Processing server: {test_url}, {response.status_code}"
wf_job_id = response.json()["job_id"]
assert wf_job_id
return wf_job_id