st.spatial.SME.SME_normalize: ValueError: Input X contains NaN. #310

liangyuli12138 · 2024-09-09T06:42:33Z

Hi, thanks for developing this useful tool. I ran into a problem when using the SME_normalize function.

Here is my code

#read data
adata=sc.read_h5ad(Datapath)
count_matrix = adata.X
spatial = adata.obs[['x','y']]
spatial.rename(columns={'x':'imagerow','y':'imagecol'},inplace=True)
data=st.create_stlearn(count=count_matrix,spatial=spatial,library_id=f'{sample}', scale=1,background_color="white")
data.obs['array_row']=spatial.iloc[:,0]
data.obs['array_col']=spatial.iloc[:,1]
data.var_names_make_unique()
data.layers['raw_count']=data.X
#tile data
TILE_PATH=Path(os.path.join(outDir,'{0}_tile'.format(sample)))
TILE_PATH.mkdir(parents=True,exist_ok=True)

#tile morphology
st.pp.tiling(data,TILE_PATH,crop_size=40)
st.pp.extract_feature(data)

###process data
st.pp.normalize_total(data)
st.pp.log1p(data)

#gene PCA dimension reduction
st.em.run_pca(data,n_comps=50,random_state=0)

#stSME to normalise log transformed data
st.spatial.SME.SME_normalize(data, use_data="raw",weights =  "weights_matrix_gd_md")

I always get this error:

Extract feature: 100%|████████████████████████████████████ [ time left: 00:00 ]
Traceback (most recent call last):
  File "/share/home/bgi_lily/Stu/Data/code/Rusedtile.py", line 57, in <module>
    ME_normalize(Datapath=adata,outDir=outdir,sample=sample)
  File "/share/home/bgi_lily/Stu/Data/code/Rusedtile.py", line 41, in ME_normalize
    st.spatial.SME.SME_normalize(data, use_data="raw",weights = "weights_matrix_gd_md")
  File "/share/appspace_data/shared_groups/usersenv/stlearn/lib/python3.8/site-packages/stlearn/spatials/SME/normalize.py", line 60, in SME_normalize
    calculate_weight_matrix(adata, platform=platform)
  File "/share/appspace_data/shared_groups/usersenv/stlearn/lib/python3.8/site-packages/stlearn/spatials/SME/_weighting_matrix.py", line 48, in calculate_weight_matrix
    reg_row = LinearRegression().fit(array_row.values.reshape(-1, 1), img_row)
  File "/share/appspace_data/shared_groups/usersenv/stlearn/lib/python3.8/site-packages/sklearn/base.py", line 1152, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "/share/appspace_data/shared_groups/usersenv/stlearn/lib/python3.8/site-packages/sklearn/linear_model/_base.py", line 678, in fit
    X, y = self._validate_data(
  File "/share/appspace_data/shared_groups/usersenv/stlearn/lib/python3.8/site-packages/sklearn/base.py", line 622, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "/share/appspace_data/shared_groups/usersenv/stlearn/lib/python3.8/site-packages/sklearn/utils/validation.py", line 1146, in check_X_y
    X = check_array(
  File "/share/appspace_data/shared_groups/usersenv/stlearn/lib/python3.8/site-packages/sklearn/utils/validation.py", line 957, in check_array
    _assert_all_finite(
  File "/share/appspace_data/shared_groups/usersenv/stlearn/lib/python3.8/site-packages/sklearn/utils/validation.py", line 122, in _assert_all_finite
    _assert_all_finite_element_wise(
  File "/share/appspace_data/shared_groups/usersenv/stlearn/lib/python3.8/site-packages/sklearn/utils/validation.py", line 171, in _assert_all_finite_element_wise
    raise ValueError(msg_err)
ValueError: Input X contains NaN.
LinearRegression does not accept missing values encoded as NaN natively. For supervised learning, you might want to consider sklearn.ensemble.HistGradientBoostingClassifier and Regressor which accept missing values encoded as NaNs natively. Alternatively, it is possible to preprocess the data, for instance by using an imputer transformer in a pipeline or drop samples with missing values. See https://scikit-learn.org/stable/modules/impute.html You can find a list of all estimators that handle NaN values at the following page: https://scikit-learn.org/stable/modules/impute.html#estimators-that-handle-nan-values

#================================================================================
I also checked my data, and it does not contain any NaN values:

>>> np.any(np.isnan(data.X.toarray()))
False
>>> np.all(np.isfinite(data.X.toarray()))
True
# same as count_matrix
>>> np.isnan(count_matrix.toarray()).any()
False
>>> np.isfinite(count_matrix.toarray()).any()
True
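Note that these checks only cover data.X, whereas the traceback shows that the X passed to LinearRegression is array_row taken from data.obs. One possible explanation, which I have not confirmed, is that assigning spatial.iloc[:,0] to data.obs['array_row'] aligns on index labels, so if the two indexes differ the column is filled with NaN. The sketch below reproduces that pandas behaviour with made-up barcodes and a stand-in obs table; the names obs and count_nan are hypothetical and not part of stlearn.

```python
import pandas as pd

# Hypothetical stand-ins: `spatial` keeps the original barcodes as its index,
# while `obs` mimics an obs table that was built with a different index.
spatial = pd.DataFrame(
    {"imagerow": [10.0, 20.0, 30.0], "imagecol": [5.0, 15.0, 25.0]},
    index=["AAAC-1", "AAAG-1", "AAAT-1"],
)
obs = pd.DataFrame(index=["0", "1", "2"])

# Assigning a Series aligns on index labels; labels with no match become NaN.
obs["array_row"] = spatial.iloc[:, 0]
nan_by_label = int(obs["array_row"].isna().sum())

# Assigning the underlying NumPy array is positional, so no alignment happens.
obs["array_row"] = spatial.iloc[:, 0].to_numpy()
nan_by_position = int(obs["array_row"].isna().sum())


def count_nan(frame, columns):
    """Return the number of NaN entries for each requested column that exists."""
    return {col: int(frame[col].isna().sum()) for col in columns if col in frame}


print(nan_by_label, nan_by_position)
```

On the real object, the same helper could be pointed at data.obs with the columns named in the traceback, for example count_nan(data.obs, ["array_row", "array_col", "imagerow", "imagecol"]), to see whether any of them contain NaN.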

I don't know what to do next. I would really appreciate any help with this problem.
