Download and/or processing time increases a lot with --dask-distributed-local-core-fraction #52

Open
sadamov opened this issue Jan 15, 2025 · 0 comments

sadamov commented Jan 15, 2025

When creating Zarr archives with MDP, I became suspicious that something odd happens when --dask-distributed-local-core-fraction is set: downloading and preprocessing the data took an unreasonably long time. Here is an MRE that demonstrates the issue:

The attached zip archive contains a few files for comparing two approaches: downloading ERA5 data manually and then preprocessing it locally with MDP, versus end-to-end download and preprocessing with MDP. To reproduce the results, unzip the archive, cd into the extracted directory, make test_runtime.sh executable, and run ./test_runtime.sh.
archive.zip

  • era5_local.datastore.yaml: Configuration file for direct download and processing of ERA5 weather data from the local source.
  • era5.datastore.yaml: Configuration file for direct download and processing of ERA5 weather data from the remote source.
  • retrieve.py: Python script that handles downloading ERA5 data to local storage manually with xarray.
  • runtime_results.txt: Results file containing benchmark measurements comparing different processing methods and core fractions.
  • test_runtime.sh: Shell script that runs performance tests comparing direct download versus local processing with different CPU core utilization settings.
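
For readers who do not want to open the archive, the timing approach can be sketched in Python roughly as follows. This is not the contents of test_runtime.sh. The `python -m mllam_data_prep <config>` invocation is an assumption about how MDP is launched, and the helper names `build_command`, `time_command` and `benchmark` are hypothetical. Only the flag name and the two config file names come from this issue.

```python
import subprocess
import sys
import time


def build_command(config_path, core_fraction=None):
    """Build an MDP command line.

    Assumption: MDP is started as `python -m mllam_data_prep <config>`.
    The flag is only appended when a core fraction is requested.
    """
    cmd = [sys.executable, "-m", "mllam_data_prep", config_path]
    if core_fraction is not None:
        cmd += ["--dask-distributed-local-core-fraction", str(core_fraction)]
    return cmd


def time_command(cmd):
    """Run cmd to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def benchmark(configs, fractions):
    """Time every config/fraction combination and return the results."""
    results = {}
    for fraction in fractions:
        for config in configs:
            results[(config, fraction)] = time_command(
                build_command(config, fraction)
            )
    return results
```

Calling `benchmark(["era5.datastore.yaml", "era5_local.datastore.yaml"], [None, 0.1, 0.25, 0.5])` would cover the same configs and core fractions as the results reported here, provided MDP is installed and the local data has already been downloaded.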

Here is a snapshot of the results, showing how differently the local and remote approaches scale:

Runtime Benchmark Results
========================
Test Results - 2025-01-15 12:34:19
Core Fraction: none
Direct download and processing: 33 seconds
Manual download: 30 seconds
Local data processing: 5 seconds
----------------------------------------
Test Results - 2025-01-15 12:36:31
Core Fraction: 0.1
Direct download and processing: 91 seconds
Manual download: 32 seconds
Local data processing: 9 seconds
----------------------------------------
Test Results - 2025-01-15 12:39:00
Core Fraction: 0.25
Direct download and processing: 102 seconds
Manual download: 32 seconds
Local data processing: 15 seconds
----------------------------------------
Test Results - 2025-01-15 12:41:27
Core Fraction: 0.5
Direct download and processing: 96 seconds
Manual download: 32 seconds
Local data processing: 19 seconds
----------------------------------------
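
To make the comparison easier to read, here is a small sketch that parses the numbers above and relates the direct run to the two-step route. It assumes that "Manual download" plus "Local data processing" is the fair counterpart of "Direct download and processing". The helper names `parse_results` and `slowdown` are made up for illustration.

```python
# Timings copied from the results above (date lines and separators omitted).
RESULTS = """\
Core Fraction: none
Direct download and processing: 33 seconds
Manual download: 30 seconds
Local data processing: 5 seconds
Core Fraction: 0.1
Direct download and processing: 91 seconds
Manual download: 32 seconds
Local data processing: 9 seconds
Core Fraction: 0.25
Direct download and processing: 102 seconds
Manual download: 32 seconds
Local data processing: 15 seconds
Core Fraction: 0.5
Direct download and processing: 96 seconds
Manual download: 32 seconds
Local data processing: 19 seconds
"""


def parse_results(text):
    """Turn the 'key: value' lines into one dict per core fraction."""
    runs = []
    for line in text.splitlines():
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "Core Fraction":
            runs.append({"fraction": value})
        elif value.endswith("seconds") and runs:
            runs[-1][key] = int(value.split()[0])
    return runs


def slowdown(run):
    """Direct time divided by manual download plus local processing."""
    two_step = run["Manual download"] + run["Local data processing"]
    return run["Direct download and processing"] / two_step


for run in parse_results(RESULTS):
    print(f"core fraction {run['fraction']}: direct / two-step = {slowdown(run):.2f}")
```

With these numbers the ratio is about 0.94 without the flag and about 2.22, 2.17 and 1.88 for core fractions 0.1, 0.25 and 0.5. Local processing alone also gets slower as the core fraction increases (5, 9, 15, 19 seconds).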

Some information about the system I was using:

System: Linux
Kernel: 5.14.21-150500.55.65_13.0.74-cray_shasta_c_64k
Memory: 854Gi
OS: SUSE Linux Enterprise Server 15 SP5
Conda Env: mllam/* installed with pdm
sadamov added the "bug" label (Something isn't working) on Jan 15, 2025
sadamov changed the title from "Download and/or processing time increases a lot with higher --dask-distributed-local-core-fraction" to "Download and/or processing time increases a lot with --dask-distributed-local-core-fraction" on Jan 15, 2025