Test Spherize more extensively #127

Closed
shntnu opened this issue Mar 10, 2021 · 5 comments

Member

shntnu commented Mar 10, 2021

The Spherize test only covers the default epsilon:

def test_spherize():
    spherize_methods = ["PCA", "ZCA", "PCA-cor", "ZCA-cor"]
    for method in spherize_methods:
        for center in [True, False]:
            scaler = Spherize(method=method, center=center)
            scaler = scaler.fit(data_df)
            transform_df = scaler.transform(data_df)
            # The transformed data is expected to have uncorrelated samples
            result = (
                pd.DataFrame(np.cov(np.transpose(transform_df)))
                .abs()
                .round()
                .sum()
                .clip(1)  # necessary for when center == False (numerically unstable)
                .sum()
            )
            expected_result = data_df.shape[1]
            assert int(result) == expected_result
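
One way to cover more than the default epsilon is to loop over several values and compare the covariance of the transformed data to the identity. The sketch below is NumPy-only and makes several assumptions: `zca_whiten` is a hypothetical stand-in, not pycytominer's `Spherize`; the toy data and mixing matrix are invented for illustration; and epsilon is assumed to be added to the eigenvalues before inversion.

```python
import numpy as np


def zca_whiten(x, epsilon=1e-6):
    """Hypothetical ZCA whitening, used only for illustration (not pycytominer's code)."""
    x_centered = x - x.mean(axis=0)
    cov = np.cov(x_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Assumption: epsilon regularizes the eigenvalues before they are inverted
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + epsilon)) @ eigvecs.T
    return x_centered @ w


# Invented toy data: 500 samples, 4 correlated features
rng = np.random.default_rng(0)
mixing = np.array(
    [
        [2.0, 0.5, 0.0, 0.0],
        [0.0, 1.0, 0.5, 0.0],
        [0.0, 0.0, 1.0, 0.5],
        [0.0, 0.0, 0.0, 0.5],
    ]
)
data = rng.normal(size=(500, 4)) @ mixing


def max_deviation_from_identity(epsilon):
    """Largest absolute difference between the whitened covariance and the identity."""
    cov = np.cov(zca_whiten(data, epsilon=epsilon), rowvar=False)
    return np.abs(cov - np.eye(cov.shape[0])).max()


deviations = {eps: max_deviation_from_identity(eps) for eps in [1e-6, 1e-3, 1e-1, 1.0]}
```

In this sketch the covariance is close to the identity only when epsilon is small relative to the eigenvalues, so a test that sweeps epsilon would probably need a tolerance that depends on epsilon rather than a single fixed threshold.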

Member Author

shntnu commented Mar 10, 2021

Hm – actually I don't think the test is correct
https://colab.research.google.com/drive/1KitZF-CgV_xgZpd1n0BieQpko-uD4ZQA?usp=sharing

Member

gwaybio commented Mar 17, 2021

I don't have any idea what that Colab notebook means!

@shntnu - this is quickly becoming a pretty critical piece in the LINCS Cell Painting dataset. I'd like to solve it once and for all. See broadinstitute/lincs-cell-painting#60.

Does the notebook mean the test needs to be fixed? Or that the test is showing that the spherize function is incorrect in some way? Does the notebook propose a new test?

Member

gwaybio commented Mar 17, 2021

In #132, I added a spherize_epsilon parameter. When we close this issue, we can address this variable here as well.

Member Author

shntnu commented Mar 18, 2021

Sorry for leaving this stray comment without an explanation :D

The notebook shows that this (covariance) matrix will pass the test even though it is not spherized.

[image: the covariance matrix from the notebook]

But as we discussed, you needn't do more here. The code looks right!
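
To make the point about the test concrete, here is a minimal sketch of why the reduction used in `test_spherize` can accept a matrix that is not spherized. The matrix below is a hypothetical example, not the one shown in the notebook, and `issue_test_statistic` is just a name given here to the chain of pandas calls from the test. Because `.round()` maps any entry below 0.5 to 0 and `.clip(1)` lifts every column sum to at least 1, off-diagonal covariances of 0.4 go unnoticed.

```python
import numpy as np
import pandas as pd


def issue_test_statistic(cov):
    """Apply the same reduction as test_spherize: abs, round, column sum, clip, sum."""
    return int(pd.DataFrame(cov).abs().round().sum().clip(1).sum())


n_features = 4

# Hypothetical non-spherized covariance: unit variances, all off-diagonals 0.4
not_spherized = np.full((n_features, n_features), 0.4)
np.fill_diagonal(not_spherized, 1.0)

# The original check only compares the reduced statistic to the feature count
passes_original_check = issue_test_statistic(not_spherized) == n_features

# A stricter alternative (an assumption, not the project's test): compare to the identity
passes_strict_check = np.allclose(not_spherized, np.eye(n_features), atol=1e-2)

# Even an all-zero matrix yields the expected statistic, because of clip(1)
zero_matrix_statistic = issue_test_statistic(np.zeros((n_features, n_features)))
```

Here `passes_original_check` is `True` while `passes_strict_check` is `False`. The tolerance `atol=1e-2` is an arbitrary choice for this sketch and would need tuning for real data.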

@shntnu shntnu closed this as completed Mar 18, 2021
Member Author

shntnu commented Mar 18, 2021

Oh I just noticed this #127 (comment)

nvm – feel free to reopen or use a new issue, as you see fit @gwaygenomics

gwaybio added a commit to gwaybio/pycytominer that referenced this issue Mar 19, 2021