APEx benchmarks: add regression test #5

JeroenVerstraelen · 2024-06-17T13:25:57Z

Add regression tests to the initial benchmarks.

- To generate reference data, run the benchmark once.
- Upload the reference data to S3 object storage.
  - What endpoint/bucket?
- From then on, every benchmark run should also compare its output to the reference data.
- Alerting when the output does not match?
  - For the UDP owner, openEO team?
  - How to configure?
- Save the comparison result (success, failure) to S3 object storage.
  - So we can see a time series for each benchmark.
  - Which endpoint/bucket?

Requirements:

- Should be easy to update reference data.
- Component that we can reuse in our own integration tests.
- Mostly via configuration; try to avoid having to write custom code for every scenario.
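
A minimal sketch of how the "mostly via configuration" requirement and the per-benchmark time series could fit together. Everything here is an assumption for illustration: the `Scenario` fields, the JSON config layout, the `record_result` helper and the JSON-lines log format are hypothetical and not taken from the apex_algorithms repository.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class Scenario:
    """Hypothetical benchmark scenario, defined purely by configuration."""

    id: str
    process_graph: dict
    # Maps an output file name to the location of its reference copy.
    reference_data: dict = field(default_factory=dict)


def load_scenarios(path):
    """Load scenario definitions from a JSON file containing a list of objects."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    return [Scenario(**item) for item in raw]


def record_result(log_path, scenario_id, passed, when=None):
    """Append one comparison outcome as a JSON line, building up a time series."""
    when = when or datetime.now(timezone.utc)
    entry = {
        "scenario": scenario_id,
        "timestamp": when.isoformat(),
        "outcome": "success" if passed else "failure",
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

With this shape, updating reference data only means changing the `reference_data` entries in the config file; no per-scenario test code is involved.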

soxofaan · 2024-07-15T07:34:36Z

> Component that we can reuse in our own integration tests.

I started on reusable job result comparison utilities in the Python client at Open-EO/openeo-python-client#587

soxofaan · 2024-07-15T17:36:08Z

merged Open-EO/openeo-python-client#587 now

JeroenVerstraelen · 2024-07-16T09:05:55Z

Sync with Thomas for s3 buckets.

soxofaan · 2024-07-16T17:23:58Z

Got the S3 bucket info and set up an initial PoC for comparing job results with reference data: bf11993

soxofaan · 2024-07-17T07:34:33Z

First (intentionally) failing benchmark, using reference data on S3 and the brand new (unreleased) `assert_job_results_allclose` functionality of the Python client:
https://github.com/ESA-APEx/apex_algorithms/actions/runs/9961442215/job/27522945120

E           AssertionError: Issues for file 'openEO.tif':
E           Coordinates mismatch for dimension 'x': [295355. 295365. 295375. 295385. 295395. 295405. 295415. 295425. 295435.
E            295445. 295455. 295465. 295475. 295485. 295495. 295505. 295515. 295525.
...
E            296705. 296715. 296725. 296735. 296745. 296755. 296765. 296775. 296785.
E            296795. 296805. 296815. 296825. 296835.] != [644525. 644535. 644545. 644555. 644565. 644575. 644585. 644595. 644605.
E            644615. 644625. 644635. 644645. 644655. 644665. 644675. 644685. 644695.
...
E            645785. 645795. 645805. 645815. 645825. 645835. 645845. 645855. 645865.
E            645875. 645885. 645895. 645905. 645915. 645925. 645935. 645945. 645955.
E            645965. 645975. 645985.]
E           Coordinates mismatch for dimension 'y': [5679485. 5679475. 5679465. 5679455. 5679445. 5679435. 5679425. 5679415.
E            5679405. 5679395. 5679385. 5679375. 5679365. 5679355. 5679345. 5679335.
...
E            5677405. 5677395. 5677385. 5677375. 5677365. 5677355. 5677345. 5677335.
E            5677325. 5677315. 5677305. 5677295. 5677285. 5677275. 5677265. 5677255.
E            5677245. 5677235. 5677225. 5677215. 5677205.] != [5677475. 5677465. 5677455. 5677445. 5677435. 5677425. 5677415. 5677405.
E            5677395. 5677385. 5677375. 5677365. 5677355. 5677345. 5677335. 5677325.
E            5677315. 5677305. 5677295. 5677285. 5677275. 5677265. 5677255. 5677245.
...
E            5675395. 5675385. 5675375. 5675365. 5675355. 5675345. 5675335. 5675325.
E            5675315. 5675305. 5675295. 5675285. 5675275. 5675265. 5675255. 5675245.
E            5675235. 5675225. 5675215.]
E           Shape mismatch: (1, 229, 149) != (1, 227, 147)
E           Issues for metadata file 'job-results.json':
E           Differing 'derived_from' links (0 common, 24 only in actual, 24 only in expected):
E             only in actual: {'/eodata/Sentinel-2/MSI/L2A/2023/08/20/S2A_MSIL2A_20230820T1...
E             only in expected: {'/eodata/Sentinel-2/MSI/L2A/2023/09/17/S2B_MSIL2A_20230917T1....
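
To make the kinds of checks behind this report concrete, here is a small standard-library sketch that produces similar issue messages for coordinate and `derived_from` differences. It is not the implementation of `assert_job_results_allclose` in the Python client; the function names, the tolerance defaults and the plain-list inputs are assumptions for illustration only.

```python
import math


def compare_coordinates(actual, expected, rtol=1e-6, atol=1e-6):
    """Return a list of issue strings; an empty list means the coordinates match.

    `actual` and `expected` map a dimension name to a list of coordinate values.
    """
    issues = []
    for dim in sorted(set(actual) | set(expected)):
        a, e = actual.get(dim), expected.get(dim)
        if a is None or e is None:
            side = "actual" if a is None else "expected"
            issues.append(f"Dimension {dim!r} missing in {side}")
        elif len(a) != len(e):
            issues.append(f"Length mismatch for dimension {dim!r}: {len(a)} != {len(e)}")
        elif not all(math.isclose(x, y, rel_tol=rtol, abs_tol=atol) for x, y in zip(a, e)):
            issues.append(f"Coordinates mismatch for dimension {dim!r}")
    return issues


def compare_derived_from(actual, expected):
    """Compare two collections of 'derived_from' links as sets."""
    actual, expected = set(actual), set(expected)
    if actual == expected:
        return []
    return [
        "Differing 'derived_from' links "
        f"({len(actual & expected)} common, "
        f"{len(actual - expected)} only in actual, "
        f"{len(expected - actual)} only in expected)"
    ]
```

Collecting all issues in a list before failing, rather than stopping at the first mismatch, is what allows a single assertion error to report coordinates, shape and metadata differences together.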

soxofaan · 2024-07-26T15:59:24Z

merged #20 which added the functionality (pytest plugin) to automatically upload generated assets on a failed test
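
A rough sketch of the upload-on-failure idea, kept independent of pytest and of any S3 client so it can be run as is. This is not the plugin from #20: the `collect_assets` and `upload_on_failure` helpers, the `run_id/test_name/relative_path` key layout and the injected `upload` callable are all hypothetical.

```python
from pathlib import Path


def collect_assets(output_dir, run_id, test_name):
    """Map hypothetical object-storage keys to the local files produced by one test."""
    output_dir = Path(output_dir)
    return {
        f"{run_id}/{test_name}/{path.relative_to(output_dir).as_posix()}": path
        for path in sorted(output_dir.rglob("*"))
        if path.is_file()
    }


def upload_on_failure(test_failed, output_dir, run_id, test_name, upload):
    """Call `upload(key, path)` for every asset, but only when the test failed."""
    if not test_failed:
        return []
    assets = collect_assets(output_dir, run_id, test_name)
    for key, path in assets.items():
        upload(key, path)
    return list(assets)
```

Passing the upload function in as an argument keeps the selection logic testable without object storage; in a real setup it would wrap whatever S3 client is in use.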

soxofaan · 2024-08-21T16:34:37Z

added some docs too at https://github.com/ESA-APEx/apex_algorithms/blob/main/docs/benchmarking.md

I think that's enough scope for this ticket

JeroenVerstraelen assigned soxofaan Jun 17, 2024

soxofaan changed the title ~~add regression test~~ APEx benchmarks: add regression test Jun 21, 2024

soxofaan mentioned this issue Jul 1, 2024

Set up initial benchmark for worldcereal #4

Closed

soxofaan mentioned this issue Jul 15, 2024

Add testing utilities Open-EO/openeo-python-client#587

Closed

soxofaan added a commit that referenced this issue Jul 16, 2024

Issue #5: benchmarks: initial implementation of comparing with reference data

bf11993

soxofaan added a commit that referenced this issue Jul 16, 2024

fixup! Issue #5: benchmarks: initial implementation of comparing with reference data

bee4346

soxofaan added a commit that referenced this issue Jul 17, 2024

Issue #5 finetune max_ndvi_fail_intentionally benchmark a bit

54c9080

soxofaan mentioned this issue Jul 19, 2024

introduce openeo.testing.results with reusable result comparison utilities for test suites Open-EO/openeo-python-client#594

Closed

soxofaan added a commit that referenced this issue Jul 19, 2024

Issue #5 create test report and use basic GA upload

0717fee

soxofaan added a commit that referenced this issue Jul 19, 2024

fixup! Issue #5 create test report and use basic GA upload

229a3fd

soxofaan added a commit that referenced this issue Jul 19, 2024

fixup! fixup! Issue #5 create test report and use basic GA upload

0155344

soxofaan added a commit that referenced this issue Jul 19, 2024

fixup! fixup! fixup! Issue #5 create test report and use basic GA upload

eec7446

soxofaan added a commit that referenced this issue Jul 19, 2024

Issue #5/#7 initial pytest plugin to collect metrics and dump to JSON

5b27784

soxofaan mentioned this issue Jul 22, 2024

HTML+JSON reporting #19

Merged

soxofaan added a commit that referenced this issue Jul 22, 2024

fixup! Issue #5/#7 initial pytest plugin to collect metrics and dump to JSON

97ae285

soxofaan added a commit that referenced this issue Jul 22, 2024

Issue #5 create test report and use basic GA upload

56dbad6

soxofaan added a commit that referenced this issue Jul 22, 2024

Issue #5/#7 initial pytest plugin to collect metrics and dump to JSON

10d431a

soxofaan added a commit that referenced this issue Jul 22, 2024

Issue #5/#7 refactor metrics plugin from conftest to onw module

d9e38e6

soxofaan added a commit that referenced this issue Jul 22, 2024

fixup! Issue #5/#7 refactor metrics plugin from conftest to onw module

b7bead3

soxofaan added a commit that referenced this issue Jul 22, 2024

fixup! fixup! Issue #5/#7 refactor metrics plugin from conftest to onw module

0d3b47e

soxofaan added a commit that referenced this issue Jul 22, 2024

Issue #5/#7 refactor metrics plugin from conftest to onw module

ed6e218

soxofaan added a commit that referenced this issue Jul 22, 2024

Issue #5/#7 rename plugin to "track_metrics"

c734d8b

soxofaan added a commit that referenced this issue Jul 23, 2024

Issue #5/#7 add basic test for track_metrics plugin

6cddd4f

soxofaan added a commit that referenced this issue Jul 23, 2024

Issue #5/#7 remove test dummy again

42de3ac

soxofaan added a commit that referenced this issue Jul 24, 2024

Issue #5 use --basetemp to have controlled output folder

5f8ace3

soxofaan added a commit that referenced this issue Jul 25, 2024

Finetune/synchronize configuration of track_metrics/upload_assets (#5/#7)

2440c1c

soxofaan added a commit that referenced this issue Jul 25, 2024

upload_assets refactor to something simpler and add initial tests (#5)

df4fd8f

soxofaan added a commit that referenced this issue Jul 25, 2024

upload_assets update TODO note about always-upload feature #5 #22

586bc8d

soxofaan added a commit that referenced this issue Jul 25, 2024

upload_assets refactor to something simpler and add initial tests (#5)

c263a0d

soxofaan added a commit that referenced this issue Jul 25, 2024

upload_assets update TODO note about always-upload feature #5 #22

481b07a

soxofaan added a commit that referenced this issue Jul 25, 2024

Run benchmarks with log level INFO #5 #7

0922337

soxofaan added a commit that referenced this issue Jul 25, 2024

upload_assets report uploads with terminalreported for now #5

f3e8f38

soxofaan added a commit that referenced this issue Jul 26, 2024

Benchmarks: re-add dummy test module (#5)

de1564b

But skipped by default

soxofaan added a commit that referenced this issue Jul 26, 2024

Benchmarks: develop in dummy mode (to be reverted) #5

2d39be7

soxofaan added a commit that referenced this issue Jul 26, 2024

Issue #5 use --basetemp to have controlled output folder

b1bb730

soxofaan added a commit that referenced this issue Jul 26, 2024

Issue #5 initial pytest plugin impl to upload results on failure to s3

101d4ce

soxofaan added a commit that referenced this issue Jul 26, 2024

Issue #5 apply upload_assets on real benchmark

14c7444

soxofaan added a commit that referenced this issue Jul 26, 2024

upload_assets: add option for run id (#5)

f20e9b7

soxofaan added a commit that referenced this issue Jul 26, 2024

upload_assets: upload with "public-read" ACL (#5)

c013efa

soxofaan added a commit that referenced this issue Jul 26, 2024

Issue #5/#7 simplify pytest_addoption handling a bit

b8e5273

Let defaults do their thing

soxofaan added a commit that referenced this issue Jul 26, 2024

Finetune/synchronize configuration of track_metrics/upload_assets (#5/#7)

4990361

soxofaan added a commit that referenced this issue Jul 26, 2024

upload_assets: refactor to something simpler and add initial tests (#5)

e3160f2

soxofaan added a commit that referenced this issue Jul 26, 2024

upload_assets: update TODO note about always-upload feature #5 #22

2fb14cd

soxofaan added a commit that referenced this issue Jul 26, 2024

Run benchmarks with log level INFO #5 #7

89de8d0

soxofaan added a commit that referenced this issue Jul 26, 2024

upload_assets: report uploads with terminalreporter for now #5

c573a9b

soxofaan added a commit that referenced this issue Jul 26, 2024

upload_assets tests: automatically pick free port for ThreadedMotoServer #5

80f6e27

soxofaan added a commit that referenced this issue Jul 26, 2024

upload_assets: run random-subset filtering as late as possible #5

5754952

Other selection/filter mechanisms (e.g. standard `-k`) should come before final sampling (typically just 1 sample)
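
The ordering described in this commit message can be illustrated with a small standalone sketch: apply the keyword filter first, then draw the random sample from what remains. The `select_benchmarks` function is hypothetical, and its substring match is a simplification of pytest's `-k` expression matching.

```python
import random


def select_benchmarks(names, keyword=None, sample_size=None, rng=None):
    """Filter by keyword first, then take a random subset of the survivors."""
    rng = rng or random.Random()
    # Step 1: deterministic selection (stand-in for mechanisms such as `-k`).
    selected = [n for n in names if keyword is None or keyword in n]
    # Step 2: random sampling as the very last step.
    if sample_size is not None and sample_size < len(selected):
        selected = rng.sample(selected, sample_size)
    return selected
```

Sampling before filtering could pick a single benchmark that the keyword filter then discards, leaving nothing to run; sampling last avoids that.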

soxofaan added a commit that referenced this issue Jul 26, 2024

upload_assets: remove debug config from benchmarks.yaml #5

f673cc9

soxofaan added a commit that referenced this issue Aug 21, 2024

Issue #5/#7 some initial docs

9ab9b4d

soxofaan added a commit that referenced this issue Aug 21, 2024

Issue #5/#7 some initial docs

107364e

soxofaan added a commit that referenced this issue Aug 21, 2024

Issue #5/#7 some initial docs

59b56a6

soxofaan closed this as completed Aug 21, 2024

soxofaan added a commit that referenced this issue Nov 8, 2024

Update benchmark scenario schema to recent changes

f80f6ec

related to #5, #60
