merge_single_cells fails on non-canonical compartment merging #266

Open
shntnu opened this issue Apr 5, 2023 · 0 comments

shntnu commented Apr 5, 2023

We now must reckon with this warning, raised by a chunk in test_cells.py (see details at the end of this comment):

 pycytominer/pycytominer/cyto_utils/cells.py:755: FutureWarning: Passing 'suffixes' which cause duplicate columns {'ObjectNumber_cytoplasm'} in the result is deprecated and will raise a MergeError in a future version.
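For readers unfamiliar with this warning, here is a minimal sketch of one way a merge suffix can produce a duplicate column name. It is a simplified pure-Python model, not pandas' actual code and not the logic in cells.py. The helper name `suffix_collisions` and the column names are hypothetical, chosen only to reproduce the name shown in the warning; the real columns involved at cells.py:755 may differ.

```python
from collections import Counter


def suffix_collisions(left_cols, right_cols, on, suffixes):
    """Return column names that become duplicated once merge suffixes are applied.

    Simplified model (an assumption, not pandas internals): columns present in
    both frames, other than the join keys, get the left or right suffix appended.
    """
    lsuffix, rsuffix = suffixes
    overlap = (set(left_cols) & set(right_cols)) - set(on)

    collisions = set()
    for original, suffix in ((left_cols, lsuffix), (right_cols, rsuffix)):
        renamed = [c + suffix if c in overlap else c for c in original]
        # Ignore names that were already duplicated before suffixing.
        already_dup = {c for c, n in Counter(original).items() if n > 1}
        collisions |= {
            c for c, n in Counter(renamed).items() if n > 1 and c not in already_dup
        }
    return collisions


# Hypothetical columns: the left frame already carries a suffixed column from
# an earlier merge, so suffixing "ObjectNumber" again collides with it.
left = ["ImageNumber", "ObjectNumber", "ObjectNumber_cytoplasm"]
right = ["ImageNumber", "ObjectNumber"]
print(suffix_collisions(left, right, on=["ImageNumber"], suffixes=("_cytoplasm", "_new")))
```

If something like this is happening, the left frame would already hold a column named with the same suffix that the next merge tries to apply. That would point at the order or choice of suffixes in the merge loop rather than at the fixture.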

In #194 (comment) I had said

It's hard for me to tell if this is a logic error. We can't exclude the possibility that it is a logic error, but (1) it might just be a bad test fixture (2) it is very likely some edge case.

After looking at the test closely, I am reasonably sure this is a logic error because the test fixture (AP_NEW) seems perfectly fine.

Because this is blocking #257, I will need to yank it out into a new test (test_merge_single_cells_non_canonical) and skip it.

I first noticed this here:

def test_merge_single_cells_non_canonical():
    new_sc_merge_df = AP_NEW.merge_single_cells()

    assert sum(new_sc_merge_df.columns.str.startswith("New")) == 4
    assert (
        NEW_COMPARTMENT_DF.ObjectNumber.tolist()[::-1]
        == new_sc_merge_df.Metadata_ObjectNumber_new.tolist()
    )

    norm_new_method_df = AP_NEW.merge_single_cells(
        single_cell_normalize=True,
        normalize_args={
            "method": "standardize",
            "samples": "all",
            "features": "infer",
        },
    )

    norm_new_method_no_feature_infer_df = AP_NEW.merge_single_cells(
        single_cell_normalize=True,
        normalize_args={
            "method": "standardize",
            "samples": "all",
        },
    )

    default_feature_infer_df = AP_NEW.merge_single_cells(single_cell_normalize=True)

    pd.testing.assert_frame_equal(
        norm_new_method_df, default_feature_infer_df, check_dtype=False
    )
    pd.testing.assert_frame_equal(
        norm_new_method_df, norm_new_method_no_feature_infer_df
    )

    new_compartment_cols = infer_cp_features(
        NEW_COMPARTMENT_DF, compartments=AP_NEW.compartments
    )
    traditional_norm_df = normalize(
        AP_NEW.image_df.merge(NEW_COMPARTMENT_DF, on=AP.merge_cols),
        features=new_compartment_cols,
        samples="all",
        method="standardize",
    )

    pd.testing.assert_frame_equal(
        norm_new_method_df.loc[:, new_compartment_cols].abs().describe(),
        traditional_norm_df.loc[:, new_compartment_cols].abs().describe(),
    )