From 8f92a2b8205559b9f33e2bc80ea729cfeffd5342 Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Thu, 24 Oct 2024 17:04:08 +0000 Subject: [PATCH] build based on e7ad295 --- dev/.documenter-siteinfo.json | 2 +- dev/attributes/index.html | 2 +- dev/dataset/index.html | 16 ++++++++-------- dev/dimensions/index.html | 8 ++++---- dev/index.html | 2 +- dev/issues/index.html | 2 +- dev/other/index.html | 2 +- dev/performance/index.html | 2 +- dev/tutorials/index.html | 2 +- dev/variables/index.html | 22 +++++++++++----------- 10 files changed, 30 insertions(+), 30 deletions(-) diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index 8deb0ea5..da74ae0f 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-10-24T16:50:21","documenter_version":"1.7.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-10-24T17:04:02","documenter_version":"1.7.0"}} \ No newline at end of file diff --git a/dev/attributes/index.html b/dev/attributes/index.html index b8f3a9c8..74a8abc4 100644 --- a/dev/attributes/index.html +++ b/dev/attributes/index.html @@ -28,4 +28,4 @@ ))

Note that Dict does not preserve the order of the attributes. Therefore, an OrderedDict from the package DataStructures is preferable.

Alternatively, one can simply use the fillvalue parameter of defVar.

ncv1 = defVar(ds,"v1", UInt8, ("longitude", "latitude", "time"), fillvalue = UInt8(255), attrib = [
     "add_offset"                => -1.0,
     "scale_factor"              => 5.0,
-])
+]) diff --git a/dev/dataset/index.html b/dev/dataset/index.html index e1159500..ced9f85a 100644 --- a/dev/dataset/index.html +++ b/dev/dataset/index.html @@ -14,7 +14,7 @@ resp = HTTP.get("https://www.unidata.ucar.edu/software/netcdf/examples/ECMWF_ERA-40_subset.nc") ds = NCDataset("some_string","r",memory = resp.body) total_precipitation = ds["tp"][:,:,:] -close(ds)

Dataset is an alias of NCDataset.

source
mfds = NCDataset(fnames, mode = "r"; aggdim = nothing, deferopen = true,
+close(ds)

Dataset is an alias of NCDataset.

source
mfds = NCDataset(fnames, mode = "r"; aggdim = nothing, deferopen = true,
               isnewdim = false,
               constvars = [])

Opens a multi-file dataset in read-only "r" or append mode "a". fnames is a vector of file names.

Variables are aggregated over the first unlimited dimension or over the dimension aggdim if specified. All variables containing the dimension aggdim are aggregated. Variables that do not contain the dimension aggdim are not aggregated and are assumed to be constant.

If variables should be aggregated over a new dimension (not present in the NetCDF file), one should set isnewdim to true. All NetCDF files should have the same variables, attributes and groups. By default, all variables will have an additional dimension unless they are marked as constant using the constvars parameter.

The append mode is only implemented when deferopen is false. If deferopen is false, all files are opened at the same time. However, the operating system might limit the number of open files. On Linux, the limit can be controlled with the command ulimit.

All metadata (attributes and dimension lengths) are assumed to be the same for all NetCDF files. Otherwise, reading the attribute of a multi-file dataset would be ambiguous. An exception to this rule is the length of the dimension over which the data is aggregated. This aggregation dimension can vary from file to file.

Setting the experimental flag _aggdimconstant to true means that the length of the aggregation dimension is constant. This speeds up the creation of a multi-file dataset as only the metadata of the first file has to be loaded.

Examples:

You can use Glob.jl to make fnames from a file pattern, e.g.

using NCDatasets, Glob
 ds = NCDataset(glob("ERA5_monthly3D_reanalysis_*.nc"))

Aggregation over a new dimension:

using NCDatasets
@@ -27,7 +27,7 @@
 ds = NCDataset(["foo$i.nc" for i = 1:3],aggdim = "sample", isnewdim = true)
 size(ds["data"])
 # output
-# (4, 3)
source

Useful functions that operate on datasets are:

Base.keysMethod
keys(ds::NCDataset)

Return a list of all variable names in the NCDataset ds.

source
Base.haskeyFunction
haskey(ds::NCDataset,name)
+# (4, 3)
source

Useful functions that operate on datasets are:

Base.keysMethod
keys(ds::NCDataset)

Return a list of all variable names in the NCDataset ds.

source
Base.haskeyFunction
haskey(ds::NCDataset,name)
 haskey(d::Dimensions,name)
 haskey(ds::Attributes,name)

Return true if the NCDataset ds (or dimension/attribute list) has a variable (dimension/attribute) with the name name. For example:

ds = NCDataset("/tmp/test.nc","r")
 if haskey(ds,"temperature")
@@ -36,7 +36,7 @@
 
 if haskey(ds.dim,"lon")
     println("The file has a dimension 'lon'")
-end

This example checks if the file /tmp/test.nc has a variable with the name temperature and a dimension with the name lon.

source
Base.haskey(a::Attributes,name::SymbolOrString)

Check if name is an attribute

source
Base.getindexMethod
v = getindex(ds::AbstractDataset, varname::SymbolOrString)

Return the variable varname in the dataset ds as a CFVariable. The following CF conventions are honored when the variable is indexed:

  • _FillValue or missing_value (which can be a list) will be returned as missing.
  • scale_factor and add_offset are applied (output = scale_factor * data_in_file + add_offset)
  • time variables (recognized by the units attribute and possibly the calendar attribute) are usually returned as DateTime objects. Note that CFTime.DateTimeAllLeap, CFTime.DateTimeNoLeap and CFTime.DateTime360Day cannot be converted to the proleptic Gregorian calendar used in Julia and are returned as such. (See CFTime.jl for more information about those date types.) If a calendar is defined but not among the ones specified in the CF conventions, then the data in the file is not converted into a date structure.

A call getindex(ds, varname) is usually written as ds[varname].

If the variable represents a cell boundary, the calendar and units attributes of the related variable are used when they are not specified on the boundary variable itself. For example:

dimensions:
+end

This example checks if the file /tmp/test.nc has a variable with the name temperature and a dimension with the name lon.

source
Base.haskey(a::Attributes,name::SymbolOrString)

Check if name is an attribute

source
Base.getindexMethod
v = getindex(ds::AbstractDataset, varname::SymbolOrString)

Return the variable varname in the dataset ds as a CFVariable. The following CF conventions are honored when the variable is indexed:

  • _FillValue or missing_value (which can be a list) will be returned as missing.
  • scale_factor and add_offset are applied (output = scale_factor * data_in_file + add_offset)
  • time variables (recognized by the units attribute and possibly the calendar attribute) are usually returned as DateTime objects. Note that CFTime.DateTimeAllLeap, CFTime.DateTimeNoLeap and CFTime.DateTime360Day cannot be converted to the proleptic Gregorian calendar used in Julia and are returned as such. (See CFTime.jl for more information about those date types.) If a calendar is defined but not among the ones specified in the CF conventions, then the data in the file is not converted into a date structure.
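As a rough illustration of the first two conventions, the unpacking can be sketched in plain Julia without opening a file. The raw values and the helper name unpack are assumptions made for this example (the fill value, scale and offset mirror the defVar example in the attributes page); the actual implementation in the package may differ.

```julia
# Hypothetical helper, not part of NCDatasets: mimics the CF unpacking rule
# output = scale_factor * data_in_file + add_offset,
# with values equal to the fill value mapped to missing.
function unpack(raw; fillvalue, scale_factor, add_offset)
    return [x == fillvalue ? missing : scale_factor * x + add_offset for x in raw]
end

# Assumed on-disk values; 255 plays the role of _FillValue
raw = UInt8[0, 1, 255]
data = unpack(raw; fillvalue = UInt8(255), scale_factor = 5.0, add_offset = -1.0)
```

Here data has the element type Union{Missing, Float64}, which is also what one would expect when indexing a CFVariable that defines a fill value.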

A call getindex(ds, varname) is usually written as ds[varname].

If the variable represents a cell boundary, the calendar and units attributes of the related variable are used when they are not specified on the boundary variable itself. For example:

dimensions:
   time = UNLIMITED; // (5 currently)
   nv = 2;
 variables:
@@ -44,7 +44,7 @@
     time:long_name = "time";
     time:units = "hours since 1998-04-19 06:00:00";
     time:bounds = "time_bnds";
-  double time_bnds(time,nv);

In this case, the variable time_bnds uses the units and calendar of time because both variables are related through the bounds attribute following the CF conventions.

See also cfvariable(ds, varname).

source
CommonDataModel.variableFunction
v = variable(ds::NCDataset,varname::String)

Return the NetCDF variable varname in the dataset ds as a NCDataset.Variable. No scaling or other transformations are applied when the variable v is indexed.

source
CommonDataModel.variable(ds::AbstractDataset,variablename::SymbolOrString)

Return the variable with the name variablename from the data set ds.

source
CommonDataModel.cfvariableFunction
v = cfvariable(ds::NCDataset,varname::SymbolOrString; <attrib> = <value>)

Return the variable varname in the dataset ds as a NCDataset.CFVariable. The keyword arguments <attrib> are the attributes (fillvalue, missing_value, scale_factor, add_offset, units and calendar) relevant to the CF conventions. By specifying the value of these attributes, one can override the value specified in the data set. If the attribute is set to nothing, then the attribute is not loaded and the corresponding transformation is ignored. This function is similar to ds[varname] with the additional flexibility that some variable attributes can be overridden.

Example:

NCDataset("foo.nc","c") do ds
+  double time_bnds(time,nv);

In this case, the variable time_bnds uses the units and calendar of time because both variables are related through the bounds attribute following the CF conventions.

See also cfvariable(ds, varname).

source
CommonDataModel.variableFunction
v = variable(ds::NCDataset,varname::String)

Return the NetCDF variable varname in the dataset ds as a NCDataset.Variable. No scaling or other transformations are applied when the variable v is indexed.

source
CommonDataModel.variable(ds::AbstractDataset,variablename::SymbolOrString)

Return the variable with the name variablename from the data set ds.

source
CommonDataModel.cfvariableFunction
v = cfvariable(ds::NCDataset,varname::SymbolOrString; <attrib> = <value>)

Return the variable varname in the dataset ds as a NCDataset.CFVariable. The keyword arguments <attrib> are the attributes (fillvalue, missing_value, scale_factor, add_offset, units and calendar) relevant to the CF conventions. By specifying the value of these attributes, one can override the value specified in the data set. If the attribute is set to nothing, then the attribute is not loaded and the corresponding transformation is ignored. This function is similar to ds[varname] with the additional flexibility that some variable attributes can be overridden.

Example:

NCDataset("foo.nc","c") do ds
   defVar(ds,"data",[10., 11., 12., 13.], ("time",), attrib = Dict(
       "add_offset" => 10.,
       "scale_factor" => 0.2))
@@ -77,8 +77,8 @@
 @show cfvariable(ds,"data", units = "days since 2000-01-01")[:]
 # returns [DateTime(2000,1,11), DateTime(2000,1,12), DateTime(2000,1,13), DateTime(2000,1,14)]
 
-close(ds)
source
CommonDataModel.syncFunction
sync(ds::NCDataset)

Write all changes in NCDataset ds to the disk.

source
Base.closeFunction
close(ds::NCDataset)

Close the NCDataset ds. All pending changes will be written to the disk.

source
CommonDataModel.pathFunction
path(ds::NCDataset)

Return the file path (or the OPeNDAP URL) of the NCDataset ds.

source
CommonDatamodel.path(ds::AbstractDataset)

File path of the data set ds.

source
NCDatasets.ncgenFunction
ncgen(fname; ...)
-ncgen(fname,jlname; ...)

Generate the Julia code that would produce a NetCDF file with the same metadata as the NetCDF file fname. The code is placed in the file jlname or printed to the standard output. By default the new NetCDF file is called filename.nc. This can be changed with the optional parameter newfname.

source
CommonDataModel.varbyattribFunction
varbyattrib(ds, attname = attval)

Return a list of the variables in the dataset ds whose attribute attname matches the value attval. The list is empty if none of the variables match. The output is a list of CFVariables.

Examples

Load all the data of the first variable with standard name "longitude" from the NetCDF file results.nc.

julia> ds = NCDataset("results.nc", "r");
+close(ds)
source
CommonDataModel.syncFunction
sync(ds::NCDataset)

Write all changes in NCDataset ds to the disk.

source
Base.closeFunction
close(ds::NCDataset)

Close the NCDataset ds. All pending changes will be written to the disk.

source
CommonDataModel.pathFunction
path(ds::NCDataset)

Return the file path (or the OPeNDAP URL) of the NCDataset ds.

source
CommonDatamodel.path(ds::AbstractDataset)

File path of the data set ds.

source
NCDatasets.ncgenFunction
ncgen(fname; ...)
+ncgen(fname,jlname; ...)

Generate the Julia code that would produce a NetCDF file with the same metadata as the NetCDF file fname. The code is placed in the file jlname or printed to the standard output. By default the new NetCDF file is called filename.nc. This can be changed with the optional parameter newfname.

source
CommonDataModel.varbyattribFunction
varbyattrib(ds, attname = attval)

Return a list of the variables in the dataset ds whose attribute attname matches the value attval. The list is empty if none of the variables match. The output is a list of CFVariables.

Examples

Load all the data of the first variable with standard name "longitude" from the NetCDF file results.nc.

julia> ds = NCDataset("results.nc", "r");
 julia> data = varbyattrib(ds, standard_name = "longitude")[1][:]
source
Base.writeFunction
write(dest::AbstractDataset, src::AbstractDataset; include = keys(src), exclude = [])

Write the variables of the src dataset into an empty dest dataset (which must be opened in mode "a" or "c"). The keywords include and exclude configure which variables of src should be included (by default all), or which should be excluded (by default none).

If the first argument is a file name, then the dataset is opened in create mode ("c").

This function is useful when you want to save the dataset from a multi-file dataset.

To save a subset, one can use the function view to virtually slice a dataset:

Example

NCDataset(fname_src) do ds
     write(fname_slice,view(ds, lon = 2:3))
 end

All variables in the source file fname_src with a dimension lon will be sliced along the indices 2:3 for the lon dimension. All attributes (and variables without a dimension lon) will be copied over unmodified.

source

Notice that DateTime-structures from CFTime are used to represent time for non-standard calendars. Otherwise, we attempt to use standard structures from the Julia standard library Dates.

Groups

A NetCDF group is a dataset (with variables, attributes, dimensions and sub-groups) and can be arbitrarily nested. A group is created with defGroup and accessed via the group property of a NCDataset.

# create the variable "temperature" inside the group "forecast"
@@ -88,7 +88,7 @@
 
 # load the variable "temperature" inside the group "forecast"
 forecast_temp = ds.group["forecast"]["temperature"][:,:,:]
-close(ds)
CommonDataModel.defGroupFunction
defGroup(ds::NCDataset,groupname; attrib = [])

Create the group with the name groupname in the dataset ds. attrib is a list of attribute name and attribute value pairs (see NCDataset).

source
group = CommonDatamodel.defGroup(ds::AbstractDataset,name::SymbolOrString)

Create an empty sub-group with the name name in the data set ds. The group is a sub-type of AbstractDataset.

source
Base.getindexMethod
group = getindex(g::Groups,groupname::AbstractString)

Return the NetCDF group with the name groupname from the parent group g.

For example:

ds = NCDataset("results.nc", "r");
+close(ds)
CommonDataModel.defGroupFunction
defGroup(ds::NCDataset,groupname; attrib = [])

Create the group with the name groupname in the dataset ds. attrib is a list of attribute name and attribute value pairs (see NCDataset).

source
group = CommonDatamodel.defGroup(ds::AbstractDataset,name::SymbolOrString)

Create an empty sub-group with the name name in the data set ds. The group is a sub-type of AbstractDataset.

source
Base.getindexMethod
group = getindex(g::Groups,groupname::AbstractString)

Return the NetCDF group with the name groupname from the parent group g.

For example:

ds = NCDataset("results.nc", "r");
 forecast_group = ds.group["forecast"]
 forecast_temp = forecast_group["temperature"]
source
Base.keysMethod
names = keys(g::Groups)

Return the names of all subgroups of the group g.

source

Common methods

One can iterate over a dataset, attribute list, dimensions and NetCDF groups.

for (varname,var) in ds
     # all variables
@@ -103,4 +103,4 @@
 for (groupname,group) in ds.groups
     # all groups
     @show (groupname,group)
-end
+end diff --git a/dev/dimensions/index.html b/dev/dimensions/index.html index a19c3ff4..d0fb0319 100644 --- a/dev/dimensions/index.html +++ b/dev/dimensions/index.html @@ -9,7 +9,7 @@ if haskey(ds.dim,"lon") println("The file has a dimension 'lon'") -end

This example checks if the file /tmp/test.nc has a variable with the name temperature and a dimension with the name lon.

source
CommonDataModel.defDimFunction
defDim(ds::NCDataset,name,len)

Define a dimension in the data set ds with the given name and length len. If len is the special value Inf, then the dimension is considered as unlimited, i.e. it will grow as data is added to the NetCDF file.

For example:

using NCDatasets
+end

This example checks if the file /tmp/test.nc has a variable with the name temperature and a dimension with the name lon.

source
CommonDataModel.defDimFunction
defDim(ds::NCDataset,name,len)

Define a dimension in the data set ds with the given name and length len. If len is the special value Inf, then the dimension is considered as unlimited, i.e. it will grow as data is added to the NetCDF file.

For example:

using NCDatasets
 ds = NCDataset("/tmp/test.nc","c")
 defDim(ds,"lon",100)
 # [...]
@@ -24,8 +24,8 @@
 ds["unlimited_variable"][:,:,1:4] = randn(10,10,4)
 @show ds.dim["time"]
 # returns now 4 as 4 time slice have been added
-close(ds)
source
CommonDatamodel.defDim(ds::AbstractDataset,name::SymbolOrString,len)

Create dimension with the name name in the data set ds with the length len. len can be Inf for unlimited dimensions.

source
CommonDataModel.unlimitedMethod
unlimited(d::Dimensions)

Return the names of all unlimited dimensions.

source
Base.setindex!Method
setindex!(d::Dimensions,len,name::AbstractString)

Define the dimension called name with the length len, for example:

ds = NCDataset("file.nc","c")
-ds.dim["longitude"] = 100

If len is the special value Inf, then the dimension is considered as unlimited, i.e. it will grow as data is added to the NetCDF file.

source
NCDatasets.renameDimMethod
renameDim(ds::NCDataset,oldname::SymbolOrString,newname::SymbolOrString)

Rename the dimension oldname in the dataset ds to newname.

source

One can iterate over a list of dimensions as follows:

for (dimname,dim) in ds.dim
+close(ds)
source
CommonDatamodel.defDim(ds::AbstractDataset,name::SymbolOrString,len)

Create dimension with the name name in the data set ds with the length len. len can be Inf for unlimited dimensions.

source
CommonDataModel.unlimitedMethod
unlimited(d::Dimensions)

Return the names of all unlimited dimensions.

source
Base.setindex!Method
setindex!(d::Dimensions,len,name::AbstractString)

Define the dimension called name with the length len, for example:

ds = NCDataset("file.nc","c")
+ds.dim["longitude"] = 100

If len is the special value Inf, then the dimension is considered as unlimited, i.e. it will grow as data is added to the NetCDF file.

source
NCDatasets.renameDimMethod
renameDim(ds::NCDataset,oldname::SymbolOrString,newname::SymbolOrString)

Rename the dimension oldname in the dataset ds to newname.

source

One can iterate over a list of dimensions as follows:

for (dimname,dim) in ds.dim
     # all dimensions
     @show (dimname,dim)
-end
+end diff --git a/dev/index.html b/dev/index.html index 0a7bbed4..4940f1bb 100644 --- a/dev/index.html +++ b/dev/index.html @@ -135,4 +135,4 @@ # if the attribute does not exists units = get(v,"units","adimensional") -close(ds)

API and semantic versioning

The package aims to follow semantic versioning. As in Julia, the public API covered by semantic versioning is whatever is documented and not marked as experimental or internal.

+close(ds)

API and semantic versioning

The package aims to follow semantic versioning. As in Julia, the public API covered by semantic versioning is whatever is documented and not marked as experimental or internal.

diff --git a/dev/issues/index.html b/dev/issues/index.html index c9280348..73ce9aab 100644 --- a/dev/issues/index.html +++ b/dev/issues/index.html @@ -36,4 +36,4 @@ @ stdin:1 during initialization of module NetCDF_jll

You will likely have similar issues with Julia installed from other package managers (like Debian/Ubuntu apt, Homebrew...). The only supported solution is to install the official Julia builds.

Even with the official build of Julia, this error can occur on Linux if an incompatible library is loaded because the user has set LD_LIBRARY_PATH or LD_PRELOAD:

ERROR: LoadError: InitError: could not load library "/home/user/.julia/artifacts/461703969206dd426cc6b4d99f69f6ffab2a9779/lib/libnetcdf.so"
 /usr/lib/x86_64-linux-gnu/libcurl.so: version `CURL_4' not found (required by /home/user/.julia/artifacts/461703969206dd426cc6b4d99f69f6ffab2a9779/lib/libnetcdf.so)

Please make sure that your LD_LIBRARY_PATH and LD_PRELOAD are empty, or verify that all loaded libraries are binary compatible. You can check these environment variables by running the following commands in a terminal:

echo $LD_PRELOAD
-echo $LD_LIBRARY_PATH

If you must set $LD_LIBRARY_PATH for some application, consider using a wrapper script for this application or recompiling the application with the -rpath linker option rather than setting this variable globally.

Corner cases

+echo $LD_LIBRARY_PATH

If you must set $LD_LIBRARY_PATH for some application, consider using a wrapper script for this application or recompiling the application with the -rpath linker option rather than setting this variable globally.

Corner cases

diff --git a/dev/other/index.html b/dev/other/index.html index 99184158..5e3f82fe 100644 --- a/dev/other/index.html +++ b/dev/other/index.html @@ -197,4 +197,4 @@ # 1.0 2.0 3.0 # NaN 20.0 30.0

Promoting an integer to a floating-point number can lead to loss of precision. These are the smallest positive integers that cannot be represented exactly as 32-bit and 64-bit floating-point numbers:

Float32(16_777_217) == 16_777_217 # false
 Float64(9_007_199_254_740_993) == 9_007_199_254_740_993 # false

NaN should not be used for an array of dates, characters or strings, as it will result in an array with the element type Any following Julia's promotion rules. The use of missing as fill value is thus preferable in the general case.
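The effect of the promotion rules can be checked with plain arrays, independently of any NetCDF file. The following is a minimal sketch using only Base and the Dates standard library; the variable names are chosen for illustration.

```julia
using Dates

# Mixing NaN (a Float64) with a DateTime has no common concrete type
with_nan = [NaN, DateTime(2000, 1, 1)]

# Mixing missing with a DateTime yields a small Union type instead
with_missing = [missing, DateTime(2000, 1, 1)]

eltype(with_nan)      # Any
eltype(with_missing)  # Union{Missing, DateTime}
```

Arrays with the element type Any are generally slower to work with, because the compiler cannot specialize on the element type, while Union{Missing, DateTime} is handled efficiently.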

Experimental functions

CommonDataModel.ancillaryvariablesFunction
ncvar = CommonDataModel.ancillaryvariables(ncv::CFVariable,modifier)

Return the first ancillary variable from the NetCDF (or other format) variable ncv with the standard name modifier modifier. It can be used, for example, to access related variables like status flags.

source
Base.filterFunction
data = CommonDataModel.filter(ncv, indices...; accepted_status_flags = nothing)

Load and filter observations by replacing every observation without an accepted status flag with missing. The attribute ancillary_variables is used to identify the status flag.

# ds["data"] is a 2D matrix
-good_data = NCDatasets.filter(ds["data"],:,:, accepted_status_flags = ["good_data","probably_good_data"])
source
+good_data = NCDatasets.filter(ds["data"],:,:, accepted_status_flags = ["good_data","probably_good_data"])source diff --git a/dev/performance/index.html b/dev/performance/index.html index 146f8e11..44e63f3a 100644 --- a/dev/performance/index.html +++ b/dev/performance/index.html @@ -28,4 +28,4 @@ v = ds["v1"][:,1:3,:]; # fast v = ds["v1"][:,:,CartesianIndex(1)] # slow v = ds["v1"][:,:,1] # fast -close(ds) +close(ds) diff --git a/dev/tutorials/index.html b/dev/tutorials/index.html index c8ad58e1..3c0f9e11 100644 --- a/dev/tutorials/index.html +++ b/dev/tutorials/index.html @@ -138,4 +138,4 @@ Dimensions: lon × lat Attributes: long_name = 4um Sea Surface Temperature -[...]

The example requires NCDatasets 0.12.5 which allows one to read a NetCDF dataset directly from a vector of bytes in memory.

To debug, it is useful to run the aws shell command to list all keys in the buckets (it requires the AWS_* environment variables to be set):

aws s3 ls s3://podaac-ops-cumulus-protected/MODIS_TERRA_L3_SST_THERMAL_DAILY_4KM_NIGHTTIME_V2019.0/
+[...]

The example requires NCDatasets 0.12.5 which allows one to read a NetCDF dataset directly from a vector of bytes in memory.

To debug, it is useful to run the aws shell command to list all keys in the buckets (it requires the AWS_* environment variables to be set):

aws s3 ls s3://podaac-ops-cumulus-protected/MODIS_TERRA_L3_SST_THERMAL_DAILY_4KM_NIGHTTIME_V2019.0/
diff --git a/dev/variables/index.html b/dev/variables/index.html index 415c34b9..dc2f6ae0 100644 --- a/dev/variables/index.html +++ b/dev/variables/index.html @@ -24,8 +24,8 @@ ncvar = ds["time"].var # or ncvar = variable(ds,"time") -data = ncvar[:] # here [0., 1.]

The variable ncvar can be indexed in the same way as ncvar_cf explained above.

Note

NCDatasets.Variable and NCDatasets.CFVariable implement the interface of AbstractArray. It is thus possible to call any function that accepts an AbstractArray. But functions like mean, sum (and many more) would load every element individually which is very inefficient for large fields read from disk. You should instead convert such a variable to a standard Julia Array and then do computations with it. See also the performance tips for more information.

The following functions are convenient for working with variables:

Base.sizeMethod
sz = size(var::CFVariable)

Return a tuple of integers with the size of the variable var.

Note

Note that the size of a variable can change, e.g. for a variable with an unlimited dimension.

source
CommonDataModel.dimnamesFunction
names = dimnames(ds::AbstractNCDataset; parents = false)

Return all dimension names defined in ds. When parents is true, the dimension names of the parent groups are also returned (default is false).

source
dimnames(v::Variable)

Return a tuple of strings with the dimension names of the variable v.

source
CommonDataModel.dimnames(v::AbstractVariable)

Return an iterable of the dimension names of the variable v.

source
dimnames(v::CFVariable)

Return a tuple of strings with the dimension names of the variable v.

source
CommonDatamodel.dimnames(ds::AbstractDataset)

Return an iterable of all dimension names in ds. This information can also be accessed using the property ds.dim:

Examples

ds = NCDataset("results.nc", "r");
-dimnames = keys(ds.dim)
source
NCDatasets.dimsizeFunction
dimsize(v::CFVariable)

Get the size of a CFVariable as a named tuple of dimension → length.

source
CommonDataModel.nameFunction
name(ds::NCDataset)

Return the group name of the NCDataset ds

source
name(v::Variable)

Return the name of the NetCDF variable v.

source
CommonDatamodel.name(ds::AbstractDataset)

Name of the group of the data set ds. For a data set containing only a single group, this will always be the root group "/".

source
CommonDataModel.name(v::AbstractVariable)

Return the name of the variable v as a string.

source
NCDatasets.renameVarFunction
renameVar(ds::NCDataset,oldname,newname)

Rename the variable called oldname to newname.

source
NCDatasets.NCDatasetMethod
mfds = NCDataset(fnames, mode = "r"; aggdim = nothing, deferopen = true,
+data = ncvar[:] # here [0., 1.]

The variable ncvar can be indexed in the same way as ncvar_cf explained above.

Note

NCDatasets.Variable and NCDatasets.CFVariable implement the interface of AbstractArray. It is thus possible to call any function that accepts an AbstractArray. But functions like mean, sum (and many more) would load every element individually which is very inefficient for large fields read from disk. You should instead convert such a variable to a standard Julia Array and then do computations with it. See also the performance tips for more information.

The following functions are convenient for working with variables:

Base.sizeMethod
sz = size(var::CFVariable)

Return a tuple of integers with the size of the variable var.

Note

Note that the size of a variable can change, e.g. for a variable with an unlimited dimension.

source
CommonDataModel.dimnamesFunction
names = dimnames(ds::AbstractNCDataset; parents = false)

Return all dimension names defined in ds. When parents is true, the dimension names of the parent groups are also returned (default is false).

source
dimnames(v::Variable)

Return a tuple of strings with the dimension names of the variable v.

source
CommonDataModel.dimnames(v::AbstractVariable)

Return an iterable of the dimension names of the variable v.

source
dimnames(v::CFVariable)

Return a tuple of strings with the dimension names of the variable v.

source
CommonDatamodel.dimnames(ds::AbstractDataset)

Return an iterable of all dimension names in ds. This information can also be accessed using the property ds.dim:

Examples

ds = NCDataset("results.nc", "r");
+dimnames = keys(ds.dim)
source
NCDatasets.dimsizeFunction
dimsize(v::CFVariable)

Get the size of a CFVariable as a named tuple of dimension → length.

source
CommonDataModel.nameFunction
name(ds::NCDataset)

Return the group name of the NCDataset ds

source
name(v::Variable)

Return the name of the NetCDF variable v.

source
CommonDatamodel.name(ds::AbstractDataset)

Name of the group of the data set ds. For a data set containing only a single group, this will always be the root group "/".

source
CommonDataModel.name(v::AbstractVariable)

Return the name of the variable v as a string.

source
NCDatasets.NCDatasetMethod
mfds = NCDataset(fnames, mode = "r"; aggdim = nothing, deferopen = true,
               isnewdim = false,
               constvars = [])

Opens a multi-file dataset in read-only "r" or append mode "a". fnames is a vector of file names.

Variables are aggregated over the first unlimited dimension or over the dimension aggdim if specified. All variables containing the dimension aggdim are aggregated. Variables that do not contain the dimension aggdim are not aggregated and are assumed to be constant.

If variables should be aggregated over a new dimension (not present in the NetCDF file), one should set isnewdim to true. All NetCDF files should have the same variables, attributes and groups. By default, all variables will have an additional dimension unless they are marked as constant using the constvars parameter.

The append mode is only implemented when deferopen is false. If deferopen is false, all files are opened at the same time. However, the operating system might limit the number of open files. On Linux, the limit can be controlled with the command ulimit.

All metadata (attributes and dimension lengths) are assumed to be the same for all NetCDF files. Otherwise, reading the attribute of a multi-file dataset would be ambiguous. An exception to this rule is the length of the dimension over which the data is aggregated. This aggregation dimension can vary from file to file.

Setting the experimental flag _aggdimconstant to true means that the length of the aggregation dimension is constant. This speeds up the creation of a multi-file dataset as only the metadata of the first file has to be loaded.

Examples:

You can use Glob.jl to make fnames from a file pattern, e.g.

using NCDatasets, Glob
 ds = NCDataset(glob("ERA5_monthly3D_reanalysis_*.nc"))

Aggregation over a new dimension:

using NCDatasets
@@ -38,9 +38,9 @@
 ds = NCDataset(["foo$i.nc" for i = 1:3],aggdim = "sample", isnewdim = true)
 size(ds["data"])
 # output
-# (4, 3)
source
NCDatasets.nomissingFunction
a = nomissing(da)

Return the values of the array da of type Array{Union{T,Missing},N} (potentially containing missing values) as a regular Julia array a of the same element type. It raises an error if the array contains at least one missing value.

source
a = nomissing(da,value)

Return the values of the array da of type AbstractArray{Union{T,Missing},N} as a regular Julia array a by replacing all missing values by value (converted to type T). This function is identical to coalesce.(da,T(value)) where T is the element type of da.

Example:

julia> nomissing([missing,1.,2.],NaN)
-# returns [NaN, 1.0, 2.0]
source
NCDatasets.nomissingFunction
a = nomissing(da)

Return the values of the array da of type Array{Union{T,Missing},N} (potentially containing missing values) as a regular Julia array a of the same element type. It raises an error if the array contains at least one missing value.

source
a = nomissing(da,value)

Return the values of the array da of type AbstractArray{Union{T,Missing},N} as a regular Julia array a by replacing all missing values by value (converted to type T). This function is identical to coalesce.(da,T(value)) where T is the element type of da.

Example:

julia> nomissing([missing,1.,2.],NaN)
+# returns [NaN, 1.0, 2.0]
source
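The equivalence with coalesce stated above can be checked with Base Julia alone. The following is a minimal sketch, not the package's implementation: it does not call nomissing itself, and the variable names da, T and a are only illustrative.

```julia
# Base-only sketch of the replacement form nomissing(da, value),
# using the coalesce equivalence stated in the docstring.
da = [missing, 1.0, 2.0]            # Vector{Union{Missing, Float64}}
T = nonmissingtype(eltype(da))      # element type without Missing (Float64)
a = coalesce.(da, T(NaN))           # every missing entry is replaced by NaN

# The result no longer admits missing values.
println(eltype(a))
println(a)
```

With NCDatasets loaded, nomissing(da, NaN) is documented to give the same result.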
CommonDataModel.fillvalueFunction
fv = fillvalue(v::Variable)
+fv = fillvalue(v::CFVariable)

Return the fill-value of the variable v.

source
fillvalue(::Type{Int8})
 fillvalue(::Type{UInt8})
 fillvalue(::Type{Int16})
 fillvalue(::Type{UInt16})
@@ -51,7 +51,7 @@
 fillvalue(::Type{Float32})
 fillvalue(::Type{Float64})
 fillvalue(::Type{Char})
-fillvalue(::Type{String})

Default fill-value for the given type from NetCDF.

source
CommonDataModel.load!Function
NCDatasets.load!(ncvar::Variable, data, indices)

Loads a NetCDF variable ncvar in-place and puts the result in data along the specified indices. One can use @inbounds to annotate code where bounds checking can be elided by the compiler (which typically requires type-stable code).

using NCDatasets
+fillvalue(::Type{String})

Default fill-value for the given type from NetCDF.

source
CommonDataModel.load!Function
NCDatasets.load!(ncvar::Variable, data, indices)

Loads a NetCDF variable ncvar in-place and puts the result in data along the specified indices. One can use @inbounds to annotate code where bounds checking can be elided by the compiler (which typically requires type-stable code).

using NCDatasets
 ds = NCDataset("file.nc")
 ncv = ds["vgos"].var;
 # data must have the right shape and type
@@ -63,7 +63,7 @@
 
 # loading a subset
 data = zeros(5); # must have the right shape and type
-load!(ds["temp"].var,data,:,1) # loads the 1st column
Note

For a NetCDF variable of type NC_CHAR, the element type of the data array must be UInt8 and cannot be the Julia Char type, because the Julia Char type uses 4 bytes and the NetCDF NC_CHAR only 1 byte.

source
CommonDataModel.load!(ncvar::CFVariable, data, buffer, indices)

Loads a NetCDF (or other format) variable ncvar in-place and puts the result in data (an array of eltype(ncvar)) along the specified indices. buffer is a temporary array of the same size as data, but its element type should be eltype(ncvar.var), i.e. the corresponding type in the file (before applying scale_factor, add_offset and masking fill values). Scaling and masking will be applied to the array data.

data and buffer can be the same array if eltype(ncvar) == eltype(ncvar.var).

Example:

# create some test array
+load!(ds["temp"].var,data,:,1) # loads the 1st column
Note

For a NetCDF variable of type NC_CHAR, the element type of the data array must be UInt8 and cannot be the Julia Char type, because the Julia Char type uses 4 bytes and the NetCDF NC_CHAR only 1 byte.

source
CommonDataModel.load!(ncvar::CFVariable, data, buffer, indices)

Loads a NetCDF (or other format) variable ncvar in-place and puts the result in data (an array of eltype(ncvar)) along the specified indices. buffer is a temporary array of the same size as data, but its element type should be eltype(ncvar.var), i.e. the corresponding type in the file (before applying scale_factor, add_offset and masking fill values). Scaling and masking will be applied to the array data.

data and buffer can be the same array if eltype(ncvar) == eltype(ncvar.var).

Example:

# create some test array
 Dataset("file.nc","c") do ds
     defDim(ds,"time",3)
     ncvar = defVar(ds,"vgos",Int16,("time",),attrib = ["scale_factor" => 0.1])
@@ -88,12 +88,12 @@
              "scale_factor" => 0.1,
              "long_name" => "Temperature"
           ))
-       end;
Note

If the attributes _FillValue, missing_value, add_offset, scale_factor, units and calendar are used, they should be defined when calling defVar by using the parameter attrib as shown in the example above.

source
v = CommonDataModel.defVar(ds::AbstractDataset,src::AbstractVariable)
-v = CommonDataModel.defVar(ds::AbstractDataset,name::SymbolOrString,src::AbstractVariable)

Defines and returns the variable in the data set ds copied from the variable src. The dimension names, attributes and data are copied from src, as well as the variable name (unless provided by name).

source

Storage parameter of a variable

CommonDataModel.chunkingFunction
storage,chunksizes = chunking(v::Variable)

Return the storage type (:contiguous or :chunked) and the chunk sizes of the variable v. Note that chunking reports the same information as nc_inq_var_chunking and therefore considers variables with unlimited dimension as :contiguous.

source
storage,chunksizes = chunking(v::MFVariable)
-storage,chunksizes = chunking(v::MFCFVariable)

Return the storage type (:contiguous or :chunked) and the chunk sizes of the variable v corresponding to the first file. If the first file in the collection is chunked, then its storage attributes are returned. If the first file is contiguous, then the multi-file variable is still reported as chunked, with a chunk size equal to the size of the first variable.

source
CommonDataModel.deflateFunction
isshuffled,isdeflated,deflate_level = deflate(v::Variable)

Return compression information of the variable v. If isshuffled is true, then shuffling (byte interlacing) is activated. If isdeflated is true, then the data chunks (see chunking) are compressed using the compression level deflate_level (0 means no compression and 9 means maximum compression).

source
CommonDataModel.checksumFunction
checksummethod = checksum(v::Variable)

Return the checksum method of the variable v, which can be either :fletcher32 or :nochecksum.

source

Coordinate variables and cell boundaries

CommonDataModel.coordFunction
cv = coord(v::Union{CFVariable,Variable},standard_name)

Find the coordinate of the variable v by the standard name standard_name or some standardized heuristics based on units. If the heuristics fail to detect the coordinate, consider modifying the file to add the standard_name attribute. All dimensions of the coordinate must also be dimensions of the variable v.

Example

using NCDatasets
+       end;
Note

If the attributes _FillValue, missing_value, add_offset, scale_factor, units and calendar are used, they should be defined when calling defVar by using the parameter attrib as shown in the example above.

source
v = CommonDataModel.defVar(ds::AbstractDataset,src::AbstractVariable)
+v = CommonDataModel.defVar(ds::AbstractDataset,name::SymbolOrString,src::AbstractVariable)

Defines and returns the variable in the data set ds copied from the variable src. The dimension names, attributes and data are copied from src, as well as the variable name (unless provided by name).

source

Storage parameter of a variable

CommonDataModel.chunkingFunction
storage,chunksizes = chunking(v::Variable)

Return the storage type (:contiguous or :chunked) and the chunk sizes of the variable v. Note that chunking reports the same information as nc_inq_var_chunking and therefore considers variables with unlimited dimension as :contiguous.

source
storage,chunksizes = chunking(v::MFVariable)
+storage,chunksizes = chunking(v::MFCFVariable)

Return the storage type (:contiguous or :chunked) and the chunk sizes of the variable v corresponding to the first file. If the first file in the collection is chunked, then its storage attributes are returned. If the first file is contiguous, then the multi-file variable is still reported as chunked, with a chunk size equal to the size of the first variable.

source
CommonDataModel.deflateFunction
isshuffled,isdeflated,deflate_level = deflate(v::Variable)

Return compression information of the variable v. If isshuffled is true, then shuffling (byte interlacing) is activated. If isdeflated is true, then the data chunks (see chunking) are compressed using the compression level deflate_level (0 means no compression and 9 means maximum compression).

source
CommonDataModel.checksumFunction
checksummethod = checksum(v::Variable)

Return the checksum method of the variable v, which can be either :fletcher32 or :nochecksum.

source
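The three storage queries above (chunking, deflate and checksum) can be tried together on a small file. This is a sketch under stated assumptions rather than an official example: the scratch file name, the dimension names and the variable name "temp" are hypothetical, the keyword arguments chunksizes, deflatelevel, shuffle and checksum are assumed to be accepted by defVar in the installed NCDatasets version, and the functions are called with the NCDatasets. prefix in case they are not exported.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file

# Create a variable with explicit storage settings (assumed defVar keywords).
NCDataset(fname, "c") do ds
    defDim(ds, "lon", 10)
    defDim(ds, "lat", 10)
    defVar(ds, "temp", Float32, ("lon", "lat");
           chunksizes = [5, 5], deflatelevel = 4,
           shuffle = true, checksum = :fletcher32)
end

# Reopen read-only and query the storage parameters of the raw variable.
ds = NCDataset(fname)
v = ds["temp"].var           # underlying Variable, without CF transformations
storage, chunksizes = NCDatasets.chunking(v)
isshuffled, isdeflated, deflate_level = NCDatasets.deflate(v)
checksummethod = NCDatasets.checksum(v)
close(ds)

@show storage chunksizes isshuffled isdeflated deflate_level checksummethod
```

Querying the .var field keeps the example within the Variable signatures documented above.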

Coordinate variables and cell boundaries

CommonDataModel.coordFunction
cv = coord(v::Union{CFVariable,Variable},standard_name)

Find the coordinate of the variable v by the standard name standard_name or some standardized heuristics based on units. If the heuristics fail to detect the coordinate, consider modifying the file to add the standard_name attribute. All dimensions of the coordinate must also be dimensions of the variable v.

Example

using NCDatasets
 ds = NCDataset("file.nc")
 ncv = ds["SST"]
 lon = coord(ncv,"longitude")[:]
 lat = coord(ncv,"latitude")[:]
 v = ncv[:]
-close(ds)
source
CommonDataModel.boundsFunction
b = bounds(ncvar::NCDatasets.CFVariable)

Return the CFVariable corresponding to the bounds attribute of the variable ncvar. The time units and calendar from ncvar are used, but not the attributes controlling the packing of data (scale_factor, add_offset and _FillValue).

source
+close(ds)
source
CommonDataModel.boundsFunction
b = bounds(ncvar::NCDatasets.CFVariable)

Return the CFVariable corresponding to the bounds attribute of the variable ncvar. The time units and calendar from ncvar are used, but not the attributes controlling the packing of data (scale_factor, add_offset and _FillValue).

source