
timeseries.parquet: Unsupported job result #1031

Open
bossie opened this issue Jan 31, 2025 · 6 comments · Fixed by #1035
bossie commented Jan 31, 2025

Output asset timeseries.parquet exists but cannot be downloaded:

openeo.rest.OpenEoApiError: [500] Internal: Server error: Unsupported job result (ref: r-2501281524114129aba58c986bb5b40e)

@bossie bossie changed the title trouble downloading timeseries.parquet timeseries.parquet: Unsupported job result Jan 31, 2025
@bossie bossie self-assigned this Jan 31, 2025
bossie commented Jan 31, 2025

Request:

Handling GET https://openeo.dataspace.copernicus.eu/openeo/1.2/jobs/j-2501281514504e28abdec3be400cc269/results/assets/.../timeseries.parquet?expires=1738682651 with data b''

Error:

Traceback (most recent call last):
  File "/opt/openeo/lib/python3.8/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/openeo/lib/python3.8/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/views.py", line 1510, in download_job_result_signed
    return _download_job_result(job_id=job_id, filename=filename, user_id=user_id)
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/views.py", line 1282, in _download_job_result
    raise InternalException("Unsupported job result")
openeo_driver.errors.InternalException: Server error: Unsupported job result

The asset's href in job_metadata.json is /batch_jobs/j-2501281514504e28abdec3be400cc269/timeseries.parquet, which is indeed not something that _download_job_result expects: it is missing the s3://OpenEO-data prefix.
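As a rough illustration of the mismatch (not the actual driver code): a defensive fix could normalize bare /batch_jobs/... paths back to their s3:// form before dispatching. The bucket name OpenEO-data is taken from this thread; the helper name and logic below are hypothetical.

```python
# Hypothetical sketch only: prefix bare batch-job paths with the S3 bucket
# so a download handler would recognize them; leave already-qualified
# hrefs untouched. "OpenEO-data" is the bucket named in this issue.

S3_BUCKET = "OpenEO-data"

def normalize_asset_href(href: str) -> str:
    """Return an s3:// href for bare /batch_jobs/... paths, else the href as-is."""
    if href.startswith("/batch_jobs/"):
        return f"s3://{S3_BUCKET}{href}"
    return href
```

This would paper over the symptom; the open question in this issue is why the prefix is dropped in job_metadata.json in the first place.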


bossie commented Jan 31, 2025

Another occurrence:

{"code": "Internal","id": "r-2501311206424f85b4770715141ae811","message": "Server error: Unsupported job result"}

Request:

Handling GET https://openeo-staging.dataspace.copernicus.eu/openeo/1.2/jobs/j-2501311126514e0fb011afe929ce6f70/results/assets/.../timeseries.csv?expires=1738929997 with data b''

The asset exists; job_metadata.json points to this: /batch_jobs/j-2501311126514e0fb011afe929ce6f70/timeseries.csv


bossie commented Jan 31, 2025

Replayed j-2501311126514e0fb011afe929ce6f70 as:

j-2501311307594c428c9bf6606e2a7517 (dev);
j-2501311323564a4eb5c2c7906411a3d9 (prod);
j-2501311338194788a750b1a0321b9ec1 (staging);

but all of their assets download just fine. 🤔 As expected, their hrefs look like this: s3://OpenEO-data/batch_jobs/j-2501311307594c428c9bf6606e2a7517/timeseries.csv


bossie commented Jan 31, 2025

openeo-geopyspark-driver version on prod: 0.59.0a1.dev20250123+2419 (corresponds to commit d019b4ee3b087d79baacc1672cad96459839c369).


bossie commented Feb 3, 2025

@kristofvt @jdries this problem does not happen for every job, right?

@bossie bossie transferred this issue from Open-EO/openeo-python-driver Feb 3, 2025
bossie added a commit that referenced this issue Feb 3, 2025
bossie added a commit that referenced this issue Feb 3, 2025
@jdries jdries assigned EmileSonneveld and unassigned bossie Feb 4, 2025
@bossie bossie linked a pull request Feb 4, 2025 that will close this issue

bossie commented Feb 4, 2025

@EmileSonneveld I've got a PR set up to avoid some unnecessary I/O on the mounted batch job directory, but it's still unclear why the original problem happens; it does seem to be related to the cluster being unstable.

bossie added a commit that referenced this issue Feb 4, 2025
* uploading temporary STAC files for export_workspace is unnecessary #1031

* avoid unnecessary writes of job_metadata.json #1031
@bossie bossie reopened this Feb 4, 2025