CDF-5 support? #1161
Replies: 6 comments 7 replies
-
I'm not sure, I'll look into it and get back to you :)
-
Do we have any sample CDF-5 files?
-
I'm looking over the pnetcdf format to remember why we didn't implement it. It seems that the main problem is that dimension lengths are int64s, whereas in the classic model they are int32s. It's not a big deal to widen, but it's a breaking change to the API in lots of places, since indices also have to be int64. I'm looking at the netcdf4 API to understand how it supports pnetcdf. It is using size_t everywhere for dimension lengths and indexes, which I didn't realize until now. So we could switch to using int64 for dimensions and indices in netcdf-java, but it would be a breaking change.
Reading now through the netcdf-C JNA interface must be truncating everything to 32 bits, so it will only work on variables with dimensions of length < 2^31. It would be simple to extend the existing code to support pnetcdf (using long offsets like CDF2) with the same restrictions (individual dimension lengths < 2^31, maybe < 2^32). It's not a problem for the variable to have more than that total nelems, as long as each dimension length is an int32. Note that one can only ask to read total nelems < 2^31, since that's as big as Java arrays get. I've always thought it was OK to require the user to break their reads up into nelems < 2^31. To lift that restriction would be significantly more work.
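The narrowing restriction described above can be sketched in a few lines of Java. This is an illustrative helper, not code from the netcdf-java codebase; the class and method names are hypothetical. It accepts a CDF-5 dimension length (stored in the file as an int64) only when it fits a signed 32-bit int, which is what Java array indexing requires:

```java
// Hypothetical sketch: reject CDF-5 dimension lengths that cannot be
// represented as a Java int, per the < 2^31 restriction discussed above.
public class DimLengthCheck {
  static int checkDimLength(long len) {
    if (len < 0 || len > Integer.MAX_VALUE) {
      throw new IllegalArgumentException(
          "CDF-5 dimension length " + len + " exceeds 2^31-1; not supported");
    }
    return (int) len; // safe narrowing: value proven to fit in an int
  }

  public static void main(String[] args) {
    System.out.println(checkDimLength(1000L)); // fits easily
    try {
      checkDimLength(1L << 32); // 2^32: too large for a signed int
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Failing fast at header-read time, as here, gives the user a clear message instead of silently truncated indices.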
-
Yeah, any non-toy program has to break things up to fit into physical memory. I'm looking at the file you sent to understand the format better. They are using int64 for all kinds of things, including e.g. the number of dimensions, attributes, and variables. I hope no one has more than a few hundred of any of those; my linear searches will get really slow, and there might even be some O(n^2) problems. So maybe they don't really expect any one dimension to exceed 2^32, and we could just barf if we encountered that. Maybe we should ask the pnetcdf people whether that happens much. Under that assumption, the only important improvements of CDF5 over CDF2 are 1) the number of records is not limited to 2^32, and 2) the total size in bytes of a variable is also not limited.
-
Well, the right thing to do is to use int64s, but that's not going to happen in netcdf-java version 5. Ask them if they anticipate any dimension lengths (including unlimited) being larger than 2^31-1 (not 2^32-1, because dimensions are signed ints). If not, we could support that.
-
Noting that this issue came up again a year later, after I'd completely forgotten about this discussion. Thus, I added a logger to N3headerNew to provide the barest hint of a reason a CDF5 file couldn't be opened. See #1350
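A check of this kind can key off the classic netCDF magic bytes: the file starts with 'C','D','F' followed by a version byte (1 for CDF-1, 2 for CDF-2 64-bit offset, 5 for CDF-5). The sketch below is illustrative only and is not the actual N3headerNew code; the class and method names are made up. It shows how peeking at the version byte lets a reader log something more useful than a generic "not a valid CDM file":

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical sketch: classify a classic-netCDF header by its magic bytes
// so an unsupported CDF-5 file gets a specific diagnostic.
public class CdfMagic {
  static String describe(byte[] header) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(header));
    byte[] magic = new byte[3];
    in.readFully(magic); // first three bytes must be 'C','D','F'
    if (magic[0] != 'C' || magic[1] != 'D' || magic[2] != 'F') {
      return "not a classic netCDF file";
    }
    int version = in.readByte(); // fourth byte encodes the format variant
    switch (version) {
      case 1: return "CDF-1 (classic)";
      case 2: return "CDF-2 (64-bit offset)";
      case 5: return "CDF-5 (64-bit data): not supported by this reader";
      default: return "unknown CDF version " + version;
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(describe(new byte[] {'C', 'D', 'F', 5}));
  }
}
```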
-
Does netCDF-Java support NC files in the CDF-5 format?
A user pinged me because the developers of the regional climate model he uses are apparently updating their code to produce data output in CDF-5 format. Panoply (using a NJ 5.5.4 snapshot) cannot open the file and the stack trace I get checking into this says
java.io.IOException: java.io.IOException: Cant read file:/Users/rbs/Downloads/jays_file_DOMAIN000.nc: not a valid CDM file.
    at ucar.nc2.NetcdfFiles.open(NetcdfFiles.java:279)
    at ucar.nc2.dataset.NetcdfDatasets.openProtocolOrFile(NetcdfDatasets.java:455)
    at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:152)
    at ucar.nc2.dataset.NetcdfDatasets.acquireDataset(NetcdfDatasets.java:280)
    at ucar.nc2.dataset.NetcdfDatasets.acquireDataset(NetcdfDatasets.java:257)
    at gov.nasa.giss.data.nc.NcDataset.acquireNjDataset(NcDataset.java:668)
I also get the java.io.IOException trying to open the file using the latest IDV.