GRIB collection design #732
TwoD time2D vs TwoD regular. In both cases, the Netcdf time coordinate variable has to be 2D, and the variables using it have two time coordinates, for example:
where 65 is the maximum number of forecast times for any reference time. The coordinate is stored with NaNs to indicate where those are missing:
The advantage of regular is that we can store just the regular pattern in the ncx, which applies to all reference times. When it's not regular, we have to store the offsets for every reftime, which gets very large as the number of reference times grows. So for regular, we have to store N + M*H values, and for time2D we have to store N x M values in the ncx4, where
For regular, we can generate the time2D coordinate on the fly for Netcdf. For Grids, we never have to instantiate the coordinates unless the user asks for them explicitly. In principle, the time period of regularity doesn't have to be one day, but in practice we haven't seen any other case. So it's worth doing when possible.
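To make the size difference concrete, here is a rough sketch of the two storage counts. The comment does not define the symbols at this point, so the code assumes N is the number of reference times, M is the maximum number of forecast offsets per run, and H is the number of distinct run times per day. The function names and the sample numbers (other than the 65 offsets mentioned above) are hypothetical.

```python
# Assumed meanings (not confirmed by the thread):
#   n_reftimes   = N, number of reference times in the collection
#   max_offsets  = M, maximum number of forecast offsets for any run
#   runs_per_day = H, number of distinct run times within one day

def regular_storage(n_reftimes, max_offsets, runs_per_day):
    """Values stored for TwoD regular: N runtimes plus one offset pattern per run-of-day."""
    return n_reftimes + max_offsets * runs_per_day

def time2d_storage(n_reftimes, max_offsets):
    """Values stored for TwoD time2D: a full row of offsets for every runtime."""
    return n_reftimes * max_offsets

# Hypothetical collection: 4 runs per day for 365 days, up to 65 offsets per run.
n, m, h = 365 * 4, 65, 4
print(regular_storage(n, m, h))  # 1720
print(time2d_storage(n, m))      # 94900
```

Under these assumptions the regular count grows only by one value per added run, while the time2D count grows by a full row of offsets.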
These are the types of GRIB Collections, with explanations and issues.
SRC
SRC (Single Runtime Collection) is the ideal for model data. It requires that all the data for a model run is in one file (PartitionType = file) or in one directory (PartitionType = directory).
In the GribCollection ncx, the time coordinate is a time2D (1 X ntimes) orthogonal:
In the Netcdf representation, the variables have a 1D time coordinate with a scalar reference time:
MRUTC
MRUTC (Multiple Runtime Unique Time Collection) has multiple reference times, but a single unique time coordinate for each. It's likely that the only case we've seen is time offset == 0 and reference time == valid time. We could rename this case "Observation Collection", or OBS.
In the GribCollection ncx, the time coordinate is a time2D (nruns X 1) orthogonal:
In the Netcdf representation, the variables have a 1D time coordinate with an auxiliary 1D reference time with values identical to the runtime:
MRUTP
MRUTP is identical to an MRUTC. It occurs when one collects MRUTCs. The times are unique. The number of times can get quite large, and the times may be irregular. In MRMS, they occur approximately every 2 minutes, and each variable has its own.
An MRUTP may also get created by collecting SRCs with a single forecast time, such as "analysis" datasets like Global_0p5deg_ana. These are represented identically to OBS datasets, but have many fewer times (123 vs 22K), which are shared amongst variables.
Open question: why does the GFS-Global_0p5deg_ana.ncx4 runtime have units of seconds, not hours?
TwoD
TwoD (2D time coordinates) are generated from collections of SRCs. Variables have a runtime and usually a 1D offsetTime coordinate, and an auxiliary 2D time coordinate in the Netcdf representation:
And the twoD time coordinate has NaNs where needed:
There are variations in the GribCollection ncx4 representation:
TwoD orthogonal
(Multiple runtimes with identical time offsets for each) is the ideal for collections of model runs. These are stored in the ncx4 as two 1D coordinates, runtime and offsetTime.
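As a minimal sketch of why two 1D coordinates are enough in the orthogonal case: every valid time is just runtime plus offset, so the 2D coordinate can be rebuilt from the two lists whenever it is needed. The function name and the sample values are hypothetical, and times are assumed to be hours since some common base.

```python
def expand_orthogonal(runtimes, offsets):
    """Rebuild the (nruns x ntimes) valid-time grid: time2d[i][j] = runtimes[i] + offsets[j]."""
    return [[rt + off for off in offsets] for rt in runtimes]

# Hypothetical values, in hours since an arbitrary base time.
runtimes = [0, 6, 12]
offsets = [0, 3, 6]

time2d = expand_orthogonal(runtimes, offsets)
for row in time2d:
    print(row)
```

Only len(runtimes) + len(offsets) values are stored, and no NaN padding is needed because every run has the same offsets.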
TwoD regular
(Multiple runtimes with identical time offsets for each "runtime minute of day") is next best for collections of model runs. The offsetTime is stored in the ncx4 as M 1D coordinates, where M is the number of runs per day. These are identical for all days.
TwoD time2D
(Multiple runtimes with irregular time offsets across days) is the general case for collections of model runs. The offsetTime is stored in the ncx4 as N 1D coordinates, where N is the number of runs overall. When N becomes large, the size of these coordinates can start to limit what can be stored in memory. These are a good candidate for a meta-collection.
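The three variations differ only in how much the per-run offsets repeat, which suggests a simple classification. The sketch below is a hypothetical illustration of that idea and does not reproduce how the GribCollection code actually decides; the function name and input layout are assumptions, and runtimes are taken as minutes since a base time that falls on midnight.

```python
def classify_twod(runs):
    """Classify a list of (runtime_minutes, offsets) pairs by how the offsets repeat."""
    # Orthogonal: every run has exactly the same offsets.
    if len({tuple(offs) for _, offs in runs}) == 1:
        return "orthogonal"
    # Regular: runs starting at the same minute of day always share the same offsets.
    by_minute = {}
    for rt, offs in runs:
        by_minute.setdefault(rt % 1440, set()).add(tuple(offs))
    if all(len(patterns) == 1 for patterns in by_minute.values()):
        return "regular"
    # Otherwise the offsets must be kept per run.
    return "time2D"

# Hypothetical collections.
same_offsets = [(0, [0, 6]), (720, [0, 6])]
daily_pattern = [(0, [0, 6, 12]), (720, [0, 6]), (1440, [0, 6, 12]), (2160, [0, 6])]
irregular = [(0, [0, 6, 12]), (1440, [0, 6])]

print(classify_twod(same_offsets))
print(classify_twod(daily_pattern))
print(classify_twod(irregular))
```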
The offset time coordinate is not put into the Netcdf, just the 1D reftime and the 2D validtime: