APEx benchmarks: add regression test #5

Closed
JeroenVerstraelen opened this issue Jun 17, 2024 · 8 comments

JeroenVerstraelen commented Jun 17, 2024

Add regression tests to the initial benchmarks.

  • To generate reference data, run the benchmark once
  • Upload the reference data to S3 object storage
    • What endpoint/bucket?
  • From then on every benchmark run should also compare its output to the reference data
  • Alerting when output does not match?
    • For the UDP owner, openEO team?
    • How to configure?
  • Save the comparison result (success, failure) to S3 object storage
    • So we can see a timeseries for each benchmark
    • Which endpoint/bucket?

Requirements:

  • Should be easy to update reference data.
  • Component that we can reuse in our own integration tests.
  • Mostly via configuration, try to avoid having to write custom code for every scenario.
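
The workflow above (compare each run against stored reference data, then record success or failure per benchmark) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the directories are local stand-ins for the S3 bucket, and the byte-level hash comparison is a much stricter placeholder for whatever tolerance-based comparison is eventually chosen.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def compare_to_reference(actual_dir: Path, reference_dir: Path) -> list[str]:
    """Return human-readable issues; an empty list means the output matches."""
    issues = []
    actual = {p.name for p in actual_dir.iterdir() if p.is_file()}
    expected = {p.name for p in reference_dir.iterdir() if p.is_file()}
    for name in sorted(expected - actual):
        issues.append(f"Missing file: {name}")
    for name in sorted(actual - expected):
        issues.append(f"Unexpected file: {name}")
    for name in sorted(actual & expected):
        # Hypothetical strictness: exact byte equality, no numeric tolerance.
        if file_digest(actual_dir / name) != file_digest(reference_dir / name):
            issues.append(f"Content differs: {name}")
    return issues


def comparison_record(benchmark: str, issues: list[str]) -> str:
    """Serialize one outcome as a JSON line, suitable for a per-benchmark timeseries."""
    return json.dumps({
        "benchmark": benchmark,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "status": "success" if not issues else "failure",
        "issues": issues,
    })
```

Appending one such record per run to an object in the bucket would give the success/failure timeseries mentioned above. Updating the reference data then amounts to replacing the contents of the reference location.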

soxofaan changed the title from "add regression test" to "APEx benchmarks: add regression test" on Jun 21, 2024

soxofaan commented

> Component that we can reuse in our own integration tests.

I started on reusable job result comparison utilities in the Python client: Open-EO/openeo-python-client#587

soxofaan commented Jul 15, 2024

Merged Open-EO/openeo-python-client#587 now.

JeroenVerstraelen commented

Sync with Thomas about the S3 buckets.

soxofaan commented

Got the S3 bucket info and set up an initial proof of concept for comparing job results with reference data: bf11993

soxofaan added a commit that referenced this issue Jul 16, 2024
soxofaan commented

First (intentionally) failing benchmark, using reference data on S3 and the brand-new (unreleased) `assert_job_results_allclose` functionality of the Python client:
https://github.com/ESA-APEx/apex_algorithms/actions/runs/9961442215/job/27522945120

E           AssertionError: Issues for file 'openEO.tif':
E           Coordinates mismatch for dimension 'x': [295355. 295365. 295375. 295385. 295395. 295405. 295415. 295425. 295435.
E            295445. 295455. 295465. 295475. 295485. 295495. 295505. 295515. 295525.
...
E            296705. 296715. 296725. 296735. 296745. 296755. 296765. 296775. 296785.
E            296795. 296805. 296815. 296825. 296835.] != [644525. 644535. 644545. 644555. 644565. 644575. 644585. 644595. 644605.
E            644615. 644625. 644635. 644645. 644655. 644665. 644675. 644685. 644695.
...
E            645785. 645795. 645805. 645815. 645825. 645835. 645845. 645855. 645865.
E            645875. 645885. 645895. 645905. 645915. 645925. 645935. 645945. 645955.
E            645965. 645975. 645985.]
E           Coordinates mismatch for dimension 'y': [5679485. 5679475. 5679465. 5679455. 5679445. 5679435. 5679425. 5679415.
E            5679405. 5679395. 5679385. 5679375. 5679365. 5679355. 5679345. 5679335.
...
E            5677405. 5677395. 5677385. 5677375. 5677365. 5677355. 5677345. 5677335.
E            5677325. 5677315. 5677305. 5677295. 5677285. 5677275. 5677265. 5677255.
E            5677245. 5677235. 5677225. 5677215. 5677205.] != [5677475. 5677465. 5677455. 5677445. 5677435. 5677425. 5677415. 5677405.
E            5677395. 5677385. 5677375. 5677365. 5677355. 5677345. 5677335. 5677325.
E            5677315. 5677305. 5677295. 5677285. 5677275. 5677265. 5677255. 5677245.
...
E            5675395. 5675385. 5675375. 5675365. 5675355. 5675345. 5675335. 5675325.
E            5675315. 5675305. 5675295. 5675285. 5675275. 5675265. 5675255. 5675245.
E            5675235. 5675225. 5675215.]
E           Shape mismatch: (1, 229, 149) != (1, 227, 147)
E           Issues for metadata file 'job-results.json':
E           Differing 'derived_from' links (0 common, 24 only in actual, 24 only in expected):
E             only in actual: {'/eodata/Sentinel-2/MSI/L2A/2023/08/20/S2A_MSIL2A_20230820T1...
E             only in expected: {'/eodata/Sentinel-2/MSI/L2A/2023/09/17/S2B_MSIL2A_20230917T1....
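
To make the structure of that report easier to read, here is a small pure-Python sketch of the three kinds of checks it contains: coordinate comparison per dimension, shape comparison, and set comparison of `derived_from` links. It is not the Python client's implementation. The function names, the default tolerances, and the use of `math.isclose` are all assumptions made for illustration.

```python
import math


def compare_coordinates(dim, actual, expected, rtol=1e-6, atol=1e-6):
    """Return an issue string if the coordinate lists differ, else None."""
    same_length = len(actual) == len(expected)
    if same_length and all(
        math.isclose(a, e, rel_tol=rtol, abs_tol=atol)
        for a, e in zip(actual, expected)
    ):
        return None
    return f"Coordinates mismatch for dimension {dim!r}: {actual} != {expected}"


def compare_shapes(actual, expected):
    """Return an issue string if the array shapes differ, else None."""
    if tuple(actual) == tuple(expected):
        return None
    return f"Shape mismatch: {tuple(actual)} != {tuple(expected)}"


def compare_derived_from(actual_links, expected_links):
    """Compare link collections as sets and summarize the overlap."""
    actual, expected = set(actual_links), set(expected_links)
    if actual == expected:
        return None
    return (
        f"Differing 'derived_from' links ({len(actual & expected)} common, "
        f"{len(actual - expected)} only in actual, "
        f"{len(expected - actual)} only in expected)"
    )


# Collect all issues instead of stopping at the first one,
# so a single failing run reports everything that differs.
issues = [
    issue
    for issue in (
        compare_coordinates("x", [295355.0, 295365.0], [644525.0, 644535.0]),
        compare_shapes((1, 229, 149), (1, 227, 147)),
    )
    if issue is not None
]
```

In the log above, the x and y ranges and the source products differ completely (0 common `derived_from` links), which fits a run that was made to fail on purpose rather than one with a small numerical drift.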

soxofaan added a commit that referenced this issue Jul 23, 2024
soxofaan added a commit that referenced this issue Jul 25, 2024
soxofaan added a commit that referenced this issue Jul 26, 2024
soxofaan added a commit that referenced this issue Jul 26, 2024
soxofaan added a commit that referenced this issue Jul 26, 2024
Let defaults do their thing
soxofaan added a commit that referenced this issue Jul 26, 2024
soxofaan added a commit that referenced this issue Jul 26, 2024
Other selection/filter mechanisms (e.g. standard `-k`) should come before final sampling (typically just 1 sample)
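
The ordering described in that commit note (filter first, sample last) can be illustrated with a small stand-alone sketch. The function and the test ids are hypothetical, and the substring match is only a simplified stand-in for pytest's `-k` expressions. In a real pytest plugin, one way to get this ordering would be to do the sampling in a `pytest_collection_modifyitems` hook registered with `@pytest.hookimpl(trylast=True)`, so that it runs after the other deselection logic. Whether the repository does exactly that is an assumption.

```python
import random


def select_tests(test_ids, keyword=None, sample_size=1, seed=None):
    """Apply the keyword filter first, then sample from what is left."""
    # Step 1: filtering (simplified: plain substring match on the test id).
    candidates = [t for t in test_ids if keyword is None or keyword in t]
    # Step 2: random sampling only over the remaining candidates.
    if sample_size >= len(candidates):
        return candidates
    rng = random.Random(seed)
    picked = set(rng.sample(candidates, sample_size))
    # Preserve the original collection order (assumes unique test ids).
    return [t for t in candidates if t in picked]


# Hypothetical test ids, for illustration only.
ids = [
    "test_run_benchmark[scenario_a]",
    "test_run_benchmark[scenario_b]",
    "test_other[x]",
]
selected = select_tests(ids, keyword="benchmark", sample_size=1, seed=0)
```

If the sampling ran before the filter instead, the single sampled test could be one that the filter then removes, leaving nothing to run.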
soxofaan commented

Merged #20, which adds a pytest plugin that automatically uploads generated assets when a test fails.
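
As a rough idea of what "upload generated assets on a failed test" involves, the sketch below separates the decision logic from the actual transfer. It is not the code merged in #20. All names are hypothetical, and the `upload` callable is an assumption standing in for a real S3 call (for example a thin wrapper around boto3's `upload_file`).

```python
from pathlib import Path
from typing import Callable


def collect_assets(asset_dir: Path) -> list[Path]:
    """All regular files below asset_dir, in a stable order."""
    return sorted(p for p in asset_dir.rglob("*") if p.is_file())


def upload_assets_on_failure(
    test_failed: bool,
    asset_dir: Path,
    upload: Callable[[Path, str], None],
    key_prefix: str,
) -> list[str]:
    """Upload every generated asset if the test failed; return the object keys used."""
    if not test_failed:
        # Passing tests leave nothing behind in object storage.
        return []
    keys = []
    for path in collect_assets(asset_dir):
        # Hypothetical key layout: <prefix>/<path relative to the asset directory>.
        key = f"{key_prefix}/{path.relative_to(asset_dir).as_posix()}"
        upload(path, key)
        keys.append(key)
    return keys
```

In a pytest plugin, a function like this would typically be called from a reporting hook once the outcome of the test's call phase is known. Keeping the upload behind a callable makes the logic testable without network access.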

soxofaan added a commit that referenced this issue Aug 21, 2024
soxofaan added a commit that referenced this issue Aug 21, 2024
soxofaan added a commit that referenced this issue Aug 21, 2024
soxofaan commented

Also added some docs at https://github.com/ESA-APEx/apex_algorithms/blob/main/docs/benchmarking.md

I think that's enough scope for this ticket

soxofaan added a commit that referenced this issue Nov 8, 2024