Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

how to save regularly tiled MeshArrays data to files as a collection of compressed chunks "a la zarr"? #166

Open
gaelforget opened this issue Jan 7, 2025 · 5 comments

Comments

@gaelforget
Copy link
Member

gaelforget commented Jan 7, 2025

@gaelforget gaelforget changed the title how to treat MITgcm tiled output as a collection "a la zarr"? how to save regularly tiled MeshArrays data to files as a collection of compressed chunks "a la zarr"? Jan 7, 2025
@gaelforget
Copy link
Member Author

using NCDatasets, Glob, Climatology
lst_E=glob("ETAN*nc",ScratchSpaces.ECCO*"/ETAN")
lst_M=glob("MXLDEPTH*nc",ScratchSpaces.ECCO*"/MXLDEPTH")
ds_E=NCDataset(lst_E,aggdim="tile",isnewdim=true)
ds_M=NCDataset(lst_M,aggdim="tile",isnewdim=true)
ds=merge(ds_E,ds_M)

@gaelforget
Copy link
Member Author

gaelforget commented Jan 8, 2025

Not sure how to convert #166 (comment) to Rasters.jl

The following returns an error, but it's unclear how to resolve it based on the error message or the docs of Rasters.jl

This works :

using Rasters, Glob, Climatology, NCDatasets
get_ecco_variable_if_needed("ETAN")
lst_E=glob("ETAN*nc",ScratchSpaces.ECCO*"/ETAN")
RasterStack(lst_E[1])

and returns :

┌ 90×90×12 RasterStack ┐
├──────────────────────┴────────────────────────────────────────────────────────────────────── dims ┐
  ↓ i3 Sampled{Float64} [1.0, 2.0, …, 89.0, 90.0] ForwardOrdered Regular Points,
  → i2 Sampled{Float64} [1.0, 2.0, …, 89.0, 90.0] ForwardOrdered Regular Points,
  ↗ i1 Sampled{Float64} [1.0, 2.0, …, 11.0, 12.0] ForwardOrdered Regular Points
├─────────────────────────────────────────────────────────────────────────────────────────── layers ┤
  :area    eltype: Float64 dims: i3, i2 size: 90×90
  :land    eltype: Float64 dims: i3, i2 size: 90×90
  :lat     eltype: Float64 dims: i3, i2 size: 90×90
  :lon     eltype: Float64 dims: i3, i2 size: 90×90
  :tim     eltype: Dates.DateTime dims: i1 size: 12
  :timstep eltype: Float64 dims: i1 size: 12
  :ETAN    eltype: Float32 dims: i3, i2, i1 size: 90×90×12
├───────────────────────────────────────────────────────────────────────────────────────── metadata ┤
  Metadata{Rasters.NCDsource} of Dict{String, Any} with 11 entries:
  "history"       => " files revision history :\n    2015/04/29 : convert main variable type to sing…
  "NCO"           => "4.3.7"
  "Format"        => "native grid (nctiles w. 13 tiles)"
  "references"    => "Forget, G., J.-M. Campin, P. Heimbach, C. N. Hill, R. M. Ponte, \n and C. Wuns…
  "date"          => "01-Apr-2016"
  "_FillValue"    => NaN
  "Conventions"   => "CF-1.6"
  "missing_value" => NaN
  "source"        => "ECCO consortium (http://ecco-group.org/)"
  "title"         => "ETAN -- ECCO v4 ocean state estimate, release 2 -- climatology"
  "institution"   => "MIT"
├─────────────────────────────────────────────────────────────────────────────────────────── raster ┤
  extent: Extent(i3 = (1.0, 90.0), i2 = (1.0, 90.0), i1 = (1.0, 12.0))
  missingval: missing
└───────────────────────────────────────────────────────────────────────────────────────────────────┘

but none of these work :


RasterStack(dirname(lst_E[1]))

returns the error :

ERROR: NetCDF error: Variable 'ETAN.0001' not found in file /Users/gaelforget/.julia/scratchspaces/9e9a4d37-2d2e-41e3-8b85-f7978328d9c7/ECCO/ETAN/ETAN.0001.nc (NetCDF error code: -49)
Stacktrace:
  [1] nc_inq_varid(ncid::Int32, name::Symbol)
    @ NCDatasets ~/.julia/packages/NCDatasets/LjXBn/src/netcdf_c.jl:1558
  [2] _variable
    @ ~/.julia/packages/NCDatasets/LjXBn/src/variable.jl:72 [inlined]
  [3] variable
    @ ~/.julia/packages/NCDatasets/LjXBn/src/variable.jl:84 [inlined]
  [4] cfvariable(ds::NCDataset{Nothing, Missing}, varname::Symbol)
    @ CommonDataModel ~/.julia/packages/CommonDataModel/vTYnt/src/cfvariable.jl:86
  [5] getindex
    @ ~/.julia/packages/CommonDataModel/vTYnt/src/dataset.jl:170 [inlined]
  [6] _open(f::Rasters.var"#61#62"{…}, ::Rasters.NCDsource, ds::NCDataset{…}; name::Symbol, group::Rasters.NoKW, kw::@Kwargs{})
    @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/sources/commondatamodel.jl:128
  [7] _open
    @ ~/.julia/packages/Rasters/C1Hvd/src/sources/commondatamodel.jl:126 [inlined]
  [8] Raster(ds::NCDataset{…}, filename::String; dims::Rasters.NoKW, refdims::Tuple{}, name::Symbol, group::Rasters.NoKW, metadata::Rasters.NoKW, missingval::Rasters.NoKW, crs::Rasters.NoKW, mappedcrs::Rasters.NoKW, source::Rasters.NCDsource, replace_missing::Bool, write::Bool, lazy::Bool, dropband::Bool, checkmem::Bool)
    @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/array.jl:324
  [9] Raster
    @ ~/.julia/packages/Rasters/C1Hvd/src/array.jl:306 [inlined]
 [10] #58
    @ ~/.julia/packages/Rasters/C1Hvd/src/array.jl:303 [inlined]
 [11] _open(f::Rasters.var"#58#59"{…}, ::Rasters.NCDsource, ds::NCDataset{…}; name::Rasters.NoKW, group::Nothing, kw::@Kwargs{})
    @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/sources/commondatamodel.jl:129
 [12] _open
    @ ~/.julia/packages/Rasters/C1Hvd/src/sources/commondatamodel.jl:126 [inlined]
 [13] #8
    @ ~/.julia/packages/Rasters/C1Hvd/ext/RastersNCDatasetsExt/ncdatasets_source.jl:67 [inlined]
 [14] NCDataset(::RastersNCDatasetsExt.var"#8#9"{…}, ::String, ::Vararg{…}; kwargs::@Kwargs{})
    @ NCDatasets ~/.julia/packages/NCDatasets/LjXBn/src/dataset.jl:252
 [15] NCDataset(::Function, ::String, ::Vararg{String})
    @ NCDatasets ~/.julia/packages/NCDatasets/LjXBn/src/dataset.jl:249
 [16] _open(f::Function, ::Rasters.NCDsource, filename::String; write::Bool, kw::@Kwargs{})
    @ RastersNCDatasetsExt ~/.julia/packages/Rasters/C1Hvd/ext/RastersNCDatasetsExt/ncdatasets_source.jl:66
 [17] _open
    @ ~/.julia/packages/Rasters/C1Hvd/ext/RastersNCDatasetsExt/ncdatasets_source.jl:63 [inlined]
 [18] #_open#21
    @ ~/.julia/packages/Rasters/C1Hvd/src/sources/sources.jl:86 [inlined]
 [19] _open
    @ ~/.julia/packages/Rasters/C1Hvd/src/sources/sources.jl:85 [inlined]
 [20] #Raster#57
    @ ~/.julia/packages/Rasters/C1Hvd/src/array.jl:302 [inlined]
 [21] (::Rasters.var"#96#99"{Rasters.NoKW, @Kwargs{…}})(name::Symbol, fn::String, mv::Rasters.NoKW, md::Rasters.NoKW)
    @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/stack.jl:365
 [22] map(::Function, ::NTuple{13, Symbol}, ::NTuple{13, String}, ::NTuple{13, Rasters.NoKW}, ::Vararg{NTuple{…}})
    @ Base ./tuple.jl:406
 [23] RasterStack(filenames::@NamedTuple{…}; source::Rasters.NoKW, missingval::Rasters.NoKW, metadata::Rasters.NoKW, resize::Rasters.NoKW, layermetadata::Rasters.NoKW, layerdims::Rasters.NoKW, kw::@Kwargs{…})
    @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/stack.jl:364
 [24] RasterStack(filenames::Vector{…}; name::Vector{…}, kw::@Kwargs{…})
    @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/stack.jl:351
 [25] RasterStack
    @ ~/.julia/packages/Rasters/C1Hvd/src/stack.jl:346 [inlined]
 [26] RasterStack(filename::String; lazy::Bool, dropband::Bool, replace_missing::Bool, source::Rasters.NoKW, name::Rasters.NoKW, group::Rasters.NoKW, kw::@Kwargs{})
    @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/stack.jl:392
 [27] RasterStack(filename::String)
    @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/stack.jl:370
 [28] top-level scope
    @ REPL[12]:1
Some type information was truncated. Use `show(err)` to see complete types.

And I get the same error with RasterStack(lst_E).


Then I tried the following

aggregate([Raster(f) for f in lst_E])
ERROR: MethodError: no method matching aggregate(::Vector{Raster{…}})
The function `aggregate` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  aggregate(::Any, ::DimensionalData.Dimensions.Lookups.Regular, ::Any)
   @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/methods/aggregate.jl:100
  aggregate(::Any, ::DimensionalData.Dimensions.Lookups.Span, ::Any)
   @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/methods/aggregate.jl:99
  aggregate(::Any, ::DimensionalData.Dimensions.Lookups.Lookup, ::Any)
   @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/methods/aggregate.jl:87

and

RasterSeries(lst_E)
ERROR: MethodError: no method matching RasterSeries(::Vector{String})
The type `RasterSeries` exists, but no method is defined for this combination of argument types when trying to construct it.

Closest candidates are:
  RasterSeries(::A, ::D, ::R) where {T, N, A<:AbstractArray{T, N}, D<:Tuple, R<:Tuple}
   @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/series.jl:127
  RasterSeries(::AbstractArray{<:Union{AbstractString, NamedTuple}}, ::Any; refdims, lazy, duplicate_first, child, resize, kw...)
   @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/series.jl:142
  RasterSeries(::DimensionalData.AbstractBasicDimArray; kw...)
   @ Rasters ~/.julia/packages/Rasters/C1Hvd/src/series.jl:132
  ...

ping @rafaqz

@rafaqz
Copy link

rafaqz commented Jan 8, 2025

Hard to say not knowing what the errors are or what is in the vector

@gaelforget
Copy link
Member Author

gaelforget commented Jan 8, 2025

Hard to say not knowing what the errors are or what is in the vector

I have added the error messages above. The ERROR: NetCDF error: Variable 'ETAN.0001' not found in file
is the one I want to fix really. The relevant variable isETANin all files notETAN.0001etc. The goal is to concatenate the various 90x90 tiles along a new dimension (which we calledtile` in #166 (comment)).

@rafaqz
Copy link

rafaqz commented Jan 8, 2025

For RasterSeries you need to pass some dimensions that those files are over as the second argument

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants