Multi file support #31

Open
aasdelat opened this issue Apr 12, 2024 · 5 comments
Comments

@aasdelat

aasdelat commented Apr 12, 2024

It seems that NCDatasets has multi-file support, aggregating different nc files. But this feature seems to be missing in GRIBDatasets, and it would be very nice to have it. Are you planning to incorporate this feature?
Thanks a lot in advance.

@tcarion
Member

tcarion commented Jul 3, 2024

Hi!

Indeed, this feature is not available in GRIBDatasets, and it is not trivial to implement. Unfortunately, I don't have much time to work on this right now, but I can provide support if someone wants to take it on!

@Alexander-Barth
Member

With JuliaGeo/CommonDataModel.jl@e630054 and GRIBDatasets (4f13871), I got the following to work:

using CommonDataModel: MFDataset
using GRIBDatasets
using Test

fnames = ["/mnt/data1/abarth/.julia/packages/GRIB/6rlik/test/samples/regular_latlon_surface.grib2",
          "/mnt/data1/abarth/.julia/packages/GRIB/6rlik/test/samples/regular_latlon_surface.grib2"]

# open all grib data files
ds = GRIBDataset.(fnames)
# concatenate all files along the dimension named "valid_time"
mfds = MFDataset(ds,aggdim = "valid_time")

@test length(mfds["valid_time"]) == 2

It might also work with previous versions of GRIBDatasets, but I only tested for the current main version.

Note that I used the same file twice.
@aasdelat Does this work for you as well, maybe with a more interesting test case?

@aasdelat
Author

aasdelat commented Oct 11, 2024

Sorry for the delay.
I tested it and got an error:

julia> ds = GRIBDataset.(input_files);
julia> mfds = MFDataset(ds,aggdim = "valid_time")
ERROR: MethodError: no method matching MFDataset(::Vector{GRIBDataset{Float64, 2, Missing}}; aggdim::String)

Closest candidates are:
  MFDataset(::Any, ::Any, ::Any, ::Any) got unsupported keyword argument "aggdim"
   @ CommonDataModel ~/.julia/packages/CommonDataModel/G3moc/src/multifile.jl:77
  MFDataset(::Array{T, N}, ::S, ::Bool, ::Vector{Symbol}, ::Dict{String, String}) where {T, N, S<:AbstractString} got unsupported keyword argument "aggdim"
   @ CommonDataModel ~/.julia/packages/CommonDataModel/G3moc/src/types.jl:125
  MFDataset(::Any, ::AbstractArray{<:AbstractString, N}; ...) where N
   @ CommonDataModel ~/.julia/packages/CommonDataModel/G3moc/src/multifile.jl:87
  ...

Stacktrace:
 [1] top-level scope
   @ REPL[13]:1

I have updated the libraries:

(@CDM) pkg> update NCDatasets
    Updating registry at `~/.julia/registries/General.toml`
  No Changes to `~/.julia/environments/CDM/Project.toml`
  No Changes to `~/.julia/environments/CDM/Manifest.toml`

(@CDM) pkg> update GRIBDatasets
    Updating registry at `~/.julia/registries/General.toml`
  No Changes to `~/.julia/environments/CDM/Project.toml`
  No Changes to `~/.julia/environments/CDM/Manifest.toml`
(@CDM) pkg> update CommonDataModel
    Updating registry at `~/.julia/registries/General.toml`
  No Changes to `~/.julia/environments/CDM/Project.toml`
  No Changes to `~/.julia/environments/CDM/Manifest.toml`

But with no luck.

@Alexander-Barth
Member

The change in CommonDataModel is not yet in a released version. Can you try again after installing the current main version of CommonDataModel from GitHub?

]add https://github.com/JuliaGeo/CommonDataModel.jl#main

Sorry if this was not clear.
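
For reference, the same install can also be done from a script rather than the Pkg REPL. This is a minimal sketch, assuming the repository URL and the `main` branch named above; it only uses the standard `Pkg` API:

```julia
using Pkg

# Track the unreleased `main` branch of CommonDataModel.jl directly from GitHub.
Pkg.add(url = "https://github.com/JuliaGeo/CommonDataModel.jl", rev = "main")

# Show which version/revision of the package is now installed.
Pkg.status("CommonDataModel")
```

Once a release containing the change is available, `Pkg.free("CommonDataModel")` should stop tracking the branch and return to the registered version.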

@aasdelat
Author

aasdelat commented Oct 15, 2024

It works!

I have tested it with GRIB files and also with NetCDF files (in that case, I used "NCDataset()" instead of "GRIBDataset()"), and it works for both.

I think this feature should be integrated into the Rasters and YAXArrays libraries.

As a suggestion, it might make sense for GRIBDataset() and NCDataset() to behave similarly: just as NCDataset() can load a list of files, GRIBDataset() could do so too. It could also be useful to have a single function in the CommonDataModel library that can load either GRIB or NetCDF files. Its name could be CDataset(), COMMONDataset(), or something like that.

Thanks a lot!

3 participants