
Estimate DZU, DZT #44

Open · dcherian wants to merge 15 commits into main
Conversation

dcherian (Contributor) commented Mar 27, 2020

  • What do we set DZT to at points below KMT? Set it to the nominal dz (see the sketch after this list).
  • Need a test dataset that has DZT and DZU precomputed.
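
A minimal sketch of that below-KMT fill (dimension and variable names here are assumptions, not the final implementation):

import numpy as np
import xarray as xr

def fill_below_kmt(dzt, kmt, dz):
    # Sketch: replace DZT at inactive levels (k >= KMT) with the nominal
    # level thickness dz. Assumes dzt has dims (z_t, nlat, nlon),
    # kmt has (nlat, nlon), and dz has (z_t,).
    k = xr.DataArray(np.arange(dzt.sizes["z_t"]), dims="z_t")  # 0-based level index
    return dzt.where(k < kmt, dz)  # keep active cells, fill the rest with dz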

andersy005 (Contributor) commented:

Need a test dataset that has DZT and DZU precomputed.

If I am not mistaken, the dataset used by @ALDepp in #42 appears to have DZT and DZU precomputed:

import xarray as xr
import pop_tools

filepath = pop_tools.DATASETS.fetch('Pac_POP0.1_JRA_IAF_1993-12-6-test.nc')
ds = xr.open_dataset(filepath)

[screenshot]

ALDepp (Contributor) commented Mar 27, 2020

The file Pac_POP0.1_JRA_IAF_1993-12-6-test.nc does not have DZU and DZT, those variables are added from a separate grid file.

ALDepp (Contributor) commented Mar 27, 2020

2. What do we set DZT at points below KMT?

I would set them to zero, but it should not really matter: there should not be any model output below KMT, since the model does not compute anything there (those points are inside the ocean floor).

andersy005 (Contributor) commented:

The file Pac_POP0.1_JRA_IAF_1993-12-6-test.nc does not have DZU and DZT, those variables are added from a separate grid file.

😯my bad! I missed what followed after opening Pac_POP0.1_JRA_IAF_1993-12-6-test.nc. Thank you for the clarification!

import xarray as xr
import pop_tools

# open sample data
filepath = pop_tools.DATASETS.fetch('Pac_POP0.1_JRA_IAF_1993-12-6-test.nc')
ds = xr.open_dataset(filepath)

# get DZU and DZT from the separate grid file, needed for operations later on
filepath_g = pop_tools.DATASETS.fetch('Pac_grid_pbc_1301x305x62.tx01_62l.2013-07-13.nc')
ds_g = xr.open_dataset(filepath_g)

ds["DZT"] = ds_g.DZT
ds["DZU"] = ds_g.DZU

dcherian changed the title from "First pass at calculating dzu, dzt" to "[WIP] First pass at calculating dzu, dzt" on Mar 27, 2020
dcherian (Contributor, Author) commented Apr 3, 2020

@andersy005 the test dataset is 5GB in netCDF form but 120MB in zarr (!)

It's on the CGD machines at /project/oce/dcherian/POPexampleXgcm/comp-grid.tx9.1v3.20170718.zarr.

Can we get zarr files to work with DATASETS.fetch?

dcherian changed the title from "[WIP] First pass at calculating dzu, dzt" to "Estimate DZU, DZT" on Apr 3, 2020
dcherian (Contributor, Author) commented Apr 8, 2020

gentle ping @andersy005

andersy005 (Contributor) commented:

@dcherian, sorry for not getting back to you sooner (somehow I didn't get the original notification).

Can we get zarr files to work with DATASETS.fetch?

Yes. We will need to zip it on the FTP server, and it will be decompressed during the fetch step. I will look into this today and get back to you when it's ready.
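
Roughly, the processor could look something like the sketch below (illustration only; the actual UnzipZarr that lands in pop_tools may differ):

import os
import zipfile

class UnzipZarr:
    # Sketch of a pooch post-processor: after the zipped zarr store is
    # downloaded, extract it next to the archive and return the path to
    # the unzipped store.
    def __call__(self, fname, action, pooch):
        # fname:  path of the downloaded .zarr.zip in the local cache
        # action: one of "download", "update", "fetch"
        unzipped = os.path.splitext(fname)[0]  # strip the trailing ".zip"
        if action in ("download", "update") or not os.path.isdir(unzipped):
            with zipfile.ZipFile(fname, "r") as zf:
                zf.extractall(unzipped)
        return unzipped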

andersy005 (Contributor) commented:

@dcherian, the zarr store is now on the FTP server. When you get a chance, merge master into your branch, and try the following:

In [2]: from pop_tools.datasets import DATASETS, UnzipZarr

In [4]: import xarray as xr

In [5]: zstore = DATASETS.fetch('comp-grid.tx9.1v3.20170718.zarr.zip', processor=UnzipZarr())

In [6]: ds = xr.open_zarr(zstore)

In [7]: ds
Out[7]: 
<xarray.Dataset>
Dimensions:  (nlat: 2400, nlon: 3600, z_t: 62)
Coordinates:
  * z_t      (z_t) float32 500.0 1500.0 2500.0 ... 537500.0 562499.06 587499.06
Dimensions without coordinates: nlat, nlon
Data variables:
    DZBC     (nlat, nlon) float32 dask.array<chunksize=(300, 900), meta=np.ndarray>
    DZT      (z_t, nlat, nlon) float64 dask.array<chunksize=(4, 300, 450), meta=np.ndarray>
    DZU      (z_t, nlat, nlon) float64 dask.array<chunksize=(4, 300, 450), meta=np.ndarray>
    KMT      (nlat, nlon) float64 dask.array<chunksize=(300, 900), meta=np.ndarray>
    KMU      (nlat, nlon) float64 dask.array<chunksize=(300, 900), meta=np.ndarray>
    dz       (z_t) float32 dask.array<chunksize=(62,), meta=np.ndarray>
Attributes:
    Conventions:                CF-1.0; http://www.cgd.ucar.edu/cms/eaton/net...
    NCO:                        netCDF Operators version 4.7.9 (Homepage = ht...
    calendar:                   All years have exactly  365 days.
    cell_methods:               cell_methods = time: mean ==> the variable va...
    contents:                   Diagnostic and Prognostic Variables
    history:                    Thu Apr  2 17:17:40 2020: ncks -D 3 dzt_dzu_g...
    history_of_appended_files:  Thu Apr  2 17:17:40 2020: Appended file dzt_d...

dcherian (Contributor, Author) commented Apr 9, 2020

Very impressive, @andersy005! That worked pretty easily.

dcherian (Contributor, Author) commented Apr 9, 2020

Looks like we need to bump up the time limit on the CI.

andersy005 (Contributor) commented:

Looks like we need to bump up time limit on the CI

I SSH-ed into the test container and ran the failing test line by line. Everything ran until assert_equal(dzt, ds['DZT']), which indicates that this is likely a memory issue.

In [1]: from pop_tools.datasets import UnzipZarr

In [2]: from pop_tools import DATASETS

In [3]: zstore = DATASETS.fetch('comp-grid.tx9.1v3.20170718.zarr.zip', processor=UnzipZarr())

In [4]: import xarray as xr

In [5]: ds = xr.open_zarr(zstore)

In [6]: import pop_tools

In [7]: dzu, dzt = pop_tools.grid.calc_dzu_dzt(ds)
<frozen importlib._bootstrap>:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject

In [8]: dzu, dzt
Out[8]:
(<xarray.DataArray 'DZT' (z_t: 62, nlat: 2400, nlon: 3600)>
 dask.array<where, shape=(62, 2400, 3600), dtype=float32, chunksize=(62, 300, 900), chunktype=numpy.ndarray>
 Coordinates:
   * z_t      (z_t) float32 500.0 1500.0 2500.0 ... 537500.0 562499.06 587499.06
 Dimensions without coordinates: nlat, nlon,
 <xarray.DataArray 'DZU' (z_t: 62, nlat: 2400, nlon: 3600)>
 dask.array<where, shape=(62, 2400, 3600), dtype=float32, chunksize=(62, 300, 900), chunktype=numpy.ndarray>
 Coordinates:
   * z_t      (z_t) float32 500.0 1500.0 2500.0 ... 537500.0 562499.06 587499.06
 Dimensions without coordinates: nlat, nlon)

In [9]: from xarray.testing import assert_equal

In [10]: assert_equal(dzu, ds['DZU'])
In [11]: assert_equal(dzt, ds['DZT'])
Killed

dcherian (Contributor, Author) commented Apr 9, 2020

OK I'll try to come up with a workaround.
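
One possibility (an untested sketch, not necessarily what the test will end up using) is a lazy element-wise comparison, so we never materialize both full arrays at once:

import xarray as xr

def lazily_equal(a: xr.DataArray, b: xr.DataArray) -> bool:
    # Chunk-wise equality check; NaNs count as equal only where they coincide.
    same = (a == b) | (a.isnull() & b.isnull())
    return bool(same.all().values)

# e.g. in the test:
# assert lazily_equal(dzt, ds["DZT"])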

bradyrx (Contributor) commented May 29, 2020

Just bumping this. It'll be helpful in coordination with #54 for me to start working on #40, which will ultimately get me back on track toward a clean implementation of #7 😄

bradyrx (Contributor) commented May 30, 2020

Before merging, see #56. We should link these functions to get_grid so that DZU and DZT can be part of the default grid output (rough sketch below). However, we need DZBC, which doesn't seem to be standard POP output.
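
Roughly the kind of wiring I have in mind (names and arguments here are illustrative only, not the actual get_grid API):

import pop_tools

def grid_with_dz(grid_name, dzbc):
    # Illustration only: attach DZT/DZU to the dataset returned by
    # pop_tools.get_grid, given a user-supplied partial-bottom-cell
    # field (called DZBC here). Assumes calc_dzu_dzt can work from the
    # grid dataset once DZBC is attached.
    grid = pop_tools.get_grid(grid_name)
    dzu, dzt = pop_tools.grid.calc_dzu_dzt(grid.assign(DZBC=dzbc))
    return grid.assign(DZT=dzt, DZU=dzu)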

dcherian (Contributor, Author) commented:

This should make the following bit from the tracer budget notebook redundant:

from tqdm import tqdm

# ds_dzt, ds_dzu: datasets holding DZT and DZU read from the separate grid file
ds['DZT'] = ds_dzt.DZT
ds['DZU'] = ds_dzu.DZU
ds.DZT.attrs['long_name'] = 'Thickness of T cells'
ds.DZT.attrs['units'] = 'centimeter'
ds.DZT.attrs['grid_loc'] = '3111'
ds.DZU.attrs['long_name'] = 'Thickness of U cells'
ds.DZU.attrs['units'] = 'centimeter'
ds.DZU.attrs['grid_loc'] = '3221'

# make sure we have the cell volume for calculations
VOL = (ds.DZT * ds.DXT * ds.DYT).compute()
KMT = ds.KMT.compute()

# zero out the volume at and below the ocean floor (k >= KMT)
for j in tqdm(range(len(KMT.nlat))):
    for i in range(len(KMT.nlon)):
        k = KMT.values[j, i].astype(int)
        VOL.values[k:, j, i] = 0.

ds['VOL'] = VOL

ds.VOL.attrs['long_name'] = 'volume of T cells'
ds.VOL.attrs['units'] = 'centimeter^3'
ds.VOL.attrs['grid_loc'] = '3111'

klindsay28 (Collaborator) commented:

for j in tqdm(range(len(KMT.nlat))):
    for i in range(len(KMT.nlon)):
        k = KMT.values[j, i].astype(int)
        VOL.values[k:, j, i] = 0.

The following (untested) code replaces the nested j, i loops with a single k loop.
It seems like it would be more cache-friendly.

import numpy as np

for k in range(VOL.shape[0]):
    VOL[k, :, :] = np.where(k < KMT, VOL[k, :, :], 0.0)
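
An equivalent, fully vectorized version using xarray's where (also untested; same assumption that KMT counts the number of active levels):

import numpy as np
import xarray as xr

def zero_below_kmt(vol, kmt):
    # Sketch: zero out the volume at and below the ocean floor without a
    # Python loop. Assumes vol has dims (z_t, nlat, nlon).
    k = xr.DataArray(np.arange(vol.sizes["z_t"]), dims="z_t")  # 0-based level index
    return vol.where(k < kmt, 0.0)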

dcherian (Contributor, Author) commented:

Thanks, Keith. This PR doesn't address the VOL bit; I'll copy your suggestion to #14.

dcherian closed this Apr 5, 2021
dcherian reopened this Apr 5, 2021
xdev-bot closed this Apr 5, 2021
xdev-bot reopened this Apr 5, 2021
xdev-bot closed this Apr 5, 2021
xdev-bot reopened this Apr 5, 2021
dcherian (Contributor, Author) commented Apr 5, 2021

The linting failure is an infrastructure issue.

[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check docstring is first.................................................Passed
Check JSON...........................................(no files to check)Skipped
Check Yaml...............................................................Passed
Fix double quoted strings................................................Passed
black....................................................................Passed
blackdoc.................................................................Passed
flake8...................................................................Passed
seed isort known_third_party.............................................Passed
isort....................................................................Passed
prettier.................................................................Passed
nbqa-black...............................................................Passed
nbqa-pyupgrade...........................................................Passed
nbqa-isort...............................................................Passed
Error: reserveCache failed: Cache already exists. Scope: refs/pull/44/merge, Key: pre-commit-2-74cc0ad84a16ba798fae779720b2bb7f287bfb9d3cb53e6e6d991fac1405b7d1-b29a2fe3609b690eff3505d1f491273363900ea708e56082a167bcdbe70b0456, Version: 28cdb9f5496f334116f23e86f0063f5d3a9348c2e22425a33171e071aadada7e

matt-long (Collaborator) left a comment:

Thanks Deepak!


.. warning::

    This function does not do the right thing at the tripole grid seam.
A collaborator commented on the warning above:

Is this out of scope? At this point, the tripole grid is the only one that we use partial bottom cells with.

I have treated the tripole seam in other instances where I use roll, like this:

# the tripole grid is periodic at the top: the left half of the top row maps to the 
# right half. If `ltripole == True`, I replace the bottom row of the 
# `KMT` array with the top row flipped left-to-right. 
kmt_rollable = ds.KMT.copy()
if ltripole:
    kmt_rollable[0, :] = kmt_rollable[-1, ::-1]

A similar treatment could be applied here, though it would need to be made compatible with your numba_4pt_min.

DapengLi17 commented:
Hi,
I am computing 3D DZU and DZT (with the partial bottom cell correction) for our CESM-POP2 tx0.1v2 simulations. Following the discussion above, I tried assert_equal(dzt, dzu) and assert_equal(ds['DZU'], ds['DZT']); both return without error, indicating that DZT and DZU are equal. According to the POP manual, DZU is the minimum of the surrounding DZT values, so I expect DZT and DZU to differ in the bottom cells. Would you mind letting me know why DZT and DZU come out equal?
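
For reference, my understanding of that definition as a rough, untested xarray sketch (assuming dims (z_t, nlat, nlon), zonal periodicity in nlon, and ignoring the tripole seam):

import xarray as xr

def dzu_from_dzt(dzt):
    # DZU at a U point as the minimum of the four surrounding T-cell
    # thicknesses: (i, j), (i+1, j), (i, j+1), (i+1, j+1).
    east = dzt.roll(nlon=-1, roll_coords=False)
    north = dzt.shift(nlat=-1)
    northeast = east.shift(nlat=-1)
    return xr.concat([dzt, east, north, northeast], dim="corner").min("corner")
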
Thanks so much for your time and work.
