Masked data is returned as 0.0 after gridding, how can these pixels be identified if zeros exist in input data? #51

JSAnandEOS · 2019-07-04T11:50:03Z

So I'm using the "conservative_normed" algorithm provided in the "masking" branch of xESMF to grid some MODIS GPP data to a lower spatial resolution (fine to coarse). Being a land-only product, the ocean pixels are invalid and so need to be masked. After masking and running xESMF on the data these regions now appear as zeroes, as expected.

My problem is that valid zero values also exist in the input data over regions with no vegetation (e.g. deserts). Therefore, in the resulting array I can't readily tell which pixels are invalid and which contain real data. How do I get around this? Is there any way to output a mask showing which pixels contain real data (as opposed to pixels where no data was binned at all)? Thanks.

JiaweiZhuang · 2019-07-04T19:13:33Z

> the ocean pixels are invalid and so need to be masked.

> in the resulting array I can't readily tell which pixels are invalid, and which contain real data. How do I get around this?

Do you mean that the input data (on the source grid) are all NaNs over the ocean region? In that case, the output data will also be NaNs over the ocean, by default. You don't need to apply additional masking. In many cases, "masking" just means "setting NaN to zeros" (#22 (comment)), which might not be what you actually want.

If your input data do not even cover the ocean region (i.e. a regional grid only over land), but the output grid is global, then the undefined ocean region will have zeros instead of NaNs, by default. To flip this behavior, see #15 (comment).
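The difference between the two cases can be seen by treating regridding as a sparse matrix-vector product. The toy below is only a sketch with a made-up 3x4 weight matrix (not one produced by xESMF): a destination cell that overlaps a NaN source cell becomes NaN, while a destination cell with no weights at all comes out as 0.

```python
import numpy as np

# Hypothetical weights: 3 destination cells (rows), 4 source cells (columns).
# Row 2 is all zeros: that destination cell lies outside the source grid.
weights = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.0],
])

# Source data: cell 0 is a real zero (e.g. desert), cell 3 is invalid (NaN).
src = np.array([0.0, 2.0, 4.0, np.nan])

# Apply only the non-zero weights, as a sparse product would.
# (A dense `weights @ src` would spread the NaN to every row, since 0 * NaN is NaN.)
dst = np.zeros(weights.shape[0])
for i, j in zip(*np.nonzero(weights)):
    dst[i] += weights[i, j] * src[j]

print(dst)
```

Row 0 averages a real zero with a valid value, row 1 picks up the NaN, and row 2 stays at the initial 0 because nothing is ever added to it.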

JSAnandEOS · 2019-07-04T22:27:10Z

> Do you mean that the input data (on the source grid) are all NaNs over the ocean region?

In addition to the ocean, there are also certain areas where for whatever reason (say, cloud cover) the data is invalid, so these regions have to be removed from the gridding as well. I have currently set these to NaNs as well. These are different to areas where the data is zero (e.g. deserts), because these values are still valid.

> You don't need to apply additional masking. In many cases, "masking" just means "setting NaN to zeros" (#22 (comment)), which might not be what you actually want.

I had originally wanted to use conservative gridding with NaNs and zero values, but I encountered the same problem as #22, where large sections of coastal regions were missing in the final gridded dataset, despite having non-zero input data near those regions. The discussion about "conservative_normed" suggested that I needed to do both masking and setting unwanted areas to NaNs in order to deal with both coastal regions and areas with invalid data.
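As a rough picture of why a normalized conservative scheme keeps coastal cells, the sketch below drops masked source cells from a made-up weight matrix and rescales each row by the unmasked overlap fraction. This is a conceptual illustration only; the function name `normed_weights` and the arrays are hypothetical and this is not xESMF's or ESMF's actual code.

```python
import numpy as np

def normed_weights(weights, src_valid):
    """Drop masked source cells, then rescale each row to sum to 1."""
    masked = weights * src_valid[np.newaxis, :]   # zero out masked columns
    frac = masked.sum(axis=1, keepdims=True)      # unmasked overlap fraction
    # Rows with no valid overlap stay all-zero instead of dividing by 0.
    return np.divide(masked, frac, out=np.zeros_like(masked), where=frac > 0)

# Hypothetical coastal cell (row 0) overlapping one land and one ocean cell,
# and an open-ocean cell (row 1) overlapping only masked cells.
weights = np.array([[0.5, 0.5, 0.0],
                    [0.0, 0.5, 0.5]])
src_valid = np.array([1.0, 0.0, 0.0])   # 1 = valid, 0 = masked
src = np.array([6.0, np.nan, np.nan])

w = normed_weights(weights, src_valid)
# Masked cells now carry zero weight, so their NaNs can be filled with 0.
dst = w @ np.nan_to_num(src)
print(dst)
```

In this toy, the coastal cell keeps the land value instead of being lost, while the fully masked cell comes out as 0, which is exactly the ambiguity with real zeros raised in this issue.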

JiaweiZhuang · 2019-07-05T05:42:10Z

If I understand correctly, then you need to

1. Use "conservative_normed" with additional masks for NaN values when building the regridder, just as you are doing right now.
2. Then, after building the regridder, apply the trick from "Value of cells in the new grid that are outside the old grid's domain" #15 (comment), so that "real zeros" and "mask-generated zeros" can be distinguished.

Does this produce what you expected?

JSAnandEOS · 2019-07-08T14:57:32Z

If I understand you correctly, the regridding should be done like so:

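import numpy as np
import scipy.sparse
import xesmf as xe


def add_matrix_NaNs(regridder):
    """Put a NaN weight in every destination row that has no weights."""
    # regridder.A is the sparse weight matrix
    # (deprecated in 0.2.0 in favor of regridder.weights)
    M = scipy.sparse.csr_matrix(regridder.A)
    # Number of stored weights per destination cell (row)
    num_nonzeros = np.diff(M.indptr)
    # NaN times any source value is NaN, so empty rows regrid to NaN, not 0
    M[num_nonzeros == 0, 0] = np.nan
    regridder.A = scipy.sparse.coo_matrix(M)
    return regridder


def regrid(ds_in, ds_out, dr_in, method='conservative_normed'):
    regridder = xe.Regridder(ds_in, ds_out, method, periodic=True, reuse_weights=False)
    regridder = add_matrix_NaNs(regridder)
    dr_out = regridder(dr_in)
    # Remove the weight file written to disk
    regridder.clean_weight_file()
    return dr_out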

Is this correct?

JiaweiZhuang · 2019-07-09T16:15:28Z

Yes, this should mark undefined regions as NaNs while keeping real zeros untouched. However, it is a very niche edge case, so I am not entirely sure that it is correct. Let me know if it works.
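One way to sanity-check the idea without running xESMF is to apply the same trick to a small hand-made sparse matrix. The sketch below is an assumption-laden toy: `add_nan_rows` is a hypothetical helper that rebuilds the matrix in COO form rather than assigning into a CSR matrix, and the 3x3 weights are invented for illustration.

```python
import numpy as np
import scipy.sparse as sps

def add_nan_rows(weights):
    """Return a copy of `weights` with a NaN in column 0 of every empty row."""
    coo = sps.coo_matrix(weights)
    # Rows of the CSR form with no stored entries are destination cells
    # that received no source data at all.
    empty = np.flatnonzero(np.diff(coo.tocsr().indptr) == 0)
    rows = np.concatenate([coo.row, empty])
    cols = np.concatenate([coo.col, np.zeros_like(empty)])
    data = np.concatenate([coo.data, np.full(empty.size, np.nan)])
    return sps.coo_matrix((data, (rows, cols)), shape=coo.shape)

# Hypothetical weights: destination row 1 has no overlapping source cells.
weights = sps.coo_matrix(np.array([[0.5, 0.5, 0.0],
                                   [0.0, 0.0, 0.0],
                                   [0.0, 0.0, 1.0]]))
src = np.array([0.0, 0.0, 3.0])   # the first two source cells are real zeros

dst = add_nan_rows(weights).tocsr() @ src
print(dst)
```

Here row 0 remains a real zero, row 1 becomes NaN because NaN times any value (including 0) is NaN, and row 2 is unchanged.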

JSAnandEOS · 2019-08-08T11:59:53Z

I apologise for the late reply, but I am pleased to report that this solution works. Thanks!

JiaweiZhuang · 2019-08-08T16:24:07Z

Great! Just note that 0.2.0 deprecates regridder.A in favor of regridder.weights (792e228)

I'd like to have a simpler option in the main branch for setting different mask-handling behavior, so that users don't need this ad-hoc fix. But given the subtlety of masking, it probably requires more study. I don't have a clear timeline right now.
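Until such an option exists, a possible alternative to editing the weights is to derive a mask of uncovered destination cells from the weight matrix and apply it after regridding. The sketch below assumes the weights are available as a SciPy sparse matrix (for example `regridder.weights` in 0.2.0) and that destination cells are flattened in C order; both are assumptions, and the helper names are hypothetical.

```python
import numpy as np
import scipy.sparse as sps

def uncovered_cells(weights, shape_out):
    """Boolean array, True where a destination cell has no stored weights."""
    csr = sps.csr_matrix(weights)
    return (np.diff(csr.indptr) == 0).reshape(shape_out)

def mask_uncovered(data_out, weights):
    """Replace values in uncovered destination cells with NaN."""
    uncovered = uncovered_cells(weights, data_out.shape[-2:])
    return np.where(uncovered, np.nan, data_out)

# Hypothetical weights mapping 3 source cells onto a 2x2 destination grid.
weights = sps.coo_matrix(np.array([[1.0, 0.0, 0.0],
                                   [0.0, 0.0, 0.0],
                                   [0.0, 0.5, 0.5],
                                   [0.0, 0.0, 0.0]]))
src = np.array([0.0, 2.0, 4.0])

data_out = (weights.tocsr() @ src).reshape(2, 2)
masked_out = mask_uncovered(data_out, weights)
print(masked_out)
```

The boolean array from `uncovered_cells` can also be kept as a separate output, which leaves the regridded data untouched while still distinguishing real zeros from cells that received no data.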

JSAnandEOS closed this as completed Aug 8, 2019

JiaweiZhuang mentioned this issue Sep 11, 2019

regridding yields 0 when there should be no value #63

Closed

aulemahal pushed a commit to Ouranosinc/xESMF that referenced this issue Dec 4, 2020

Merge pull request JiaweiZhuang#51 from pangeo-data/fix-34

bb29347

Fix 34
