Selection of input data along time coordinate fails #68
Comments
@observingClouds I'll have a look at this next week.
@observingClouds Could you help me reproduce the first error? I tried adding datetime as an accepted datatype of Range's start and end points, and then I wrote a test that the range should instantiate using datetime objects, and it passed. But it looks like it doesn't check the type. If I can reproduce the error I'll write a test that I can work from.
Sure, you can also write the time without quotes.
datetime type:
coord_ranges:
  time:
    start: 2022-04-10T00:00:00
    end: 2022-04-11T00:00:00
vs. string type:
coord_ranges:
  time:
    start: "2022-04-10T00:00:00"
    end: "2022-04-11T00:00:00"
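For context, here is a short sketch of why the two spellings behave differently downstream. YAML 1.1 loaders such as PyYAML's `safe_load` resolve an unquoted ISO 8601 timestamp to a `datetime` object, while the quoted form stays a `str`. The values below are written out by hand to stand in for what the loader would return, so the snippet needs only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Stand-ins for what a YAML 1.1 loader hands back for the two configs above
# (written by hand here, not loaded from a file).
start_unquoted = datetime(2022, 4, 10, 0, 0, 0)  # start: 2022-04-10T00:00:00
start_quoted = "2022-04-10T00:00:00"             # start: "2022-04-10T00:00:00"

# The two values never compare equal, although they describe the same instant.
same_instant = start_unquoted == start_quoted

# Ordering comparisons between the two types are not defined at all.
try:
    start_unquoted < start_quoted
    ordering_supported = True
except TypeError:
    ordering_supported = False

print(type(start_unquoted).__name__, type(start_quoted).__name__)
print(same_instant, ordering_supported)
```

Any check that compares a configured endpoint directly against dataset values therefore has to convert one side first.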
@observingClouds
Where I have
in one and
in the other, but I can't seem to reproduce your error; both tests pass.
Alright, here is the script that produced the error for me:

import mllam_data_prep as mdp
import pytest
import yaml

with open("example.danra.yaml", "r") as file:
    BASE_CONFIG = file.read()

HEIGHT_LEVEL_TEST_SECTION = """\
inputs:
  danra_height_levels:
    path: https://object-store.os-api.cci1.ecmwf.int/mllam-testdata/danra_cropped/v0.2.0/height_levels.zarr
    dims: [time, x, y, altitude]
    variables:
      u:
        altitude:
          values: [100, 50,]
          units: m
      v:
        altitude:
          values: [100, 50, ]
          units: m
    dim_mapping:
      time:
        method: rename
        dim: time
      state_feature:
        method: stack_variables_by_var_name
        dims: [altitude]
        name_format: "{var_name}{altitude}m"
      grid_index:
        method: stack
        dims: [x, y]
    coord_ranges:
      time:
        start: 2022-04-01T00:00:00
        end: 2022-04-01T03:00:00
    target_output_variable: state
"""


def update_config(config: str, update: str):
    """
    Update provided config.

    Parameters
    ----------
    config: str
        String with config in yaml format
    update: str
        String with the update in yaml format

    Returns
    -------
    config: Config
        Updated config
    """
    original_config = mdp.Config.from_yaml(config)
    update = yaml.safe_load(update)
    modified_config = original_config.to_dict()
    # dict.update replaces top-level keys, so the whole "inputs" section is swapped out
    modified_config.update(update)
    modified_config = mdp.Config.from_dict(modified_config)
    return modified_config


config = update_config(BASE_CONFIG, HEIGHT_LEVEL_TEST_SECTION)
ds = mdp.create_dataset(config=config)
print(any(ds.isnull().any().compute().to_array()))
# nan_in_ds = any(ds.isnull().any().to_array())
Thanks @matschreiner for your great work in #55. I just tried this and ran into an issue when doing a selection along the time dimension.
What I did
First, I provided a start and end datetime in my config file, which resulted in:
Second, I tried providing the time as a string, but this resulted in
The second issue stems from check_point_in_dataset(), which does not do time conversions, e.g. between str (provided in config) and datetime (in the dataset), and therefore fails even if the time is available:
Also, is there a reason why we call check_point_in_dataset() only when a coordinate is named time? Do we need this test at all? Doesn't xarray already raise a good error message?
What I expected
I expected both of my trials to be working.
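A minimal sketch of how both forms could be accepted: normalize every value to a `datetime` before comparing. `to_datetime` and `point_in_range` are hypothetical helpers written for illustration; they are not part of mllam_data_prep and do not reproduce what `check_point_in_dataset()` actually does.

```python
from datetime import datetime
from typing import Union

TimeLike = Union[str, datetime]


def to_datetime(value: TimeLike) -> datetime:
    """Return value as a datetime, parsing ISO 8601 strings (hypothetical helper)."""
    if isinstance(value, datetime):
        return value
    return datetime.fromisoformat(value)


def point_in_range(point: TimeLike, first: TimeLike, last: TimeLike) -> bool:
    """Check first <= point <= last after normalizing all three values."""
    return to_datetime(first) <= to_datetime(point) <= to_datetime(last)


# Assumed bounds of the time coordinate, chosen to match the script's coord_ranges.
first, last = datetime(2022, 4, 1, 0), datetime(2022, 4, 1, 3)

# Both spellings of the same start time pass the same check.
print(point_in_range("2022-04-01T00:00:00", first, last))
print(point_in_range(datetime(2022, 4, 1, 0), first, last))
# A time outside the bounds is still rejected.
print(point_in_range("2022-04-02T00:00:00", first, last))
```

In the real dataset the time coordinate is likely stored as numpy datetime64 values, so an actual fix would probably rely on the conversions xarray or pandas provide rather than on `datetime.fromisoformat`.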