Can hub paper #2379

canergen · 2024-01-05T20:27:43Z

Adds minified mode to CondSCVI. Simplifies minification for new models. Adds add_latent_posterior mode which keeps count data. Fixes glitches in DE function.

Tests added and passed if fixing a bug or adding a new feature
All code checks passed
Added type annotations to new arguments/methods/functions
Added an entry in the latest docs/release_notes/index.md file if fixing a bug or adding a new feature

codecov · 2024-01-05T20:35:54Z

Codecov Report

Attention: Patch coverage is 92.94118% with 6 lines in your changes missing coverage. Please review.

Project coverage is 88.94%. Comparing base (9b5f553) to head (0178d45).
Report is 150 commits behind head on main.

Files	Patch %	Lines
scvi/model/base/_base_model.py	88.00%	3 Missing ⚠️
scvi/module/_vaec.py	92.30%	2 Missing ⚠️
scvi/model/_scanvi.py	75.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2379      +/-   ##
==========================================
- Coverage   89.42%   88.94%   -0.48%     
==========================================
  Files         153      153              
  Lines       12566    12566              
==========================================
- Hits        11237    11177      -60     
- Misses       1329     1389      +60


martinkim0 · 2024-01-10T23:50:32Z

If possible, could you separate this out into 4 PRs:

Minified mode for CondSCVI
Simplification of minification for new models
add_latent_posterior mode
Fixes in differential expression

Thanks

martinkim0

Thanks, LGTM overall! Just one major comment about switching the new minified mode to a new argument and some minor comments about performance/storage.

Since there are a lot of conflicts with the main branch, I can take care of transferring this to another PR and tagging you as a co-author.

martinkim0 · 2024-05-28T17:57:03Z

scvi/model/base/_base_model.py

+    def minify_adata(
+        self,
+        minified_data_type: MinifiedDataType = ADATA_MINIFY_TYPE.LATENT_POSTERIOR,
+        use_latent_qzm_key: str = "X_latent_qzm",
+        use_latent_qzv_key: str = "X_latent_qzv",
+    ):
+        """Minifies the model's adata.
+
+        Minifies the adata, and registers new anndata fields: latent qzm, latent qzv, adata uns
+        containing minified-adata type, and library size.
+        This also sets the appropriate property on the module to indicate that the adata is minified.
+
+        Parameters
+        ----------
+        minified_data_type
+            How to minify the data. Currently only supports `latent_posterior_parameters` and `add_posterior_parameters`.
+            If minified_data_type == `latent_posterior_parameters`:
+
+            * the original count data is removed (`adata.X`, adata.raw, and any layers)
+            * the parameters of the latent representation of the original data is stored
+            * everything else is left untouched
+            If minified_data_type == `add_posterior_parameters`:
+
+            * the original count data is kept (`adata.X`, adata.raw, and any layers)
+            * the parameters of the latent representation of the original data is stored
+            * everything else is left untouched
+        use_latent_qzm_key
+            Key to use in `adata.obsm` where the latent qzm params are stored
+        use_latent_qzv_key
+            Key to use in `adata.obsm` where the latent qzv params are stored
+
+        Notes
+        -----
+        The modification is not done inplace -- instead the model is assigned a new (minified)
+        version of the adata.
+        """


From my understanding, the only difference between "latent_posterior_parameters" and "add_posterior_parameters" is that the count data is preserved in the latter.

I'm thinking it might be less confusing for the end user if, instead of specifying a minified_data_type, we can directly let them pass in keep_count_data to distinguish between these two modes. What do you think?

canergen · 2024-05-29

Sounds good, that will be cleaner. Initially the idea was to have more modes, but I can't imagine a different one.
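
To make the proposal concrete, here is a minimal sketch of how a boolean `keep_count_data` flag could replace the two `minified_data_type` values. It is not the scvi-tools implementation: `ToyAnnData` and `minify` are hypothetical stand-ins written only for this illustration, and the default of `False` is an assumption. Only the two key names and the "not in place" behavior are taken from the docstring above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAnnData:
    """Hypothetical stand-in for AnnData (not the real class)."""

    X: list
    obsm: dict = field(default_factory=dict)


def minify(
    adata: ToyAnnData,
    qzm: list,
    qzv: list,
    keep_count_data: bool = False,  # assumed default, not confirmed by the PR
    use_latent_qzm_key: str = "X_latent_qzm",
    use_latent_qzv_key: str = "X_latent_qzv",
) -> ToyAnnData:
    """Return a new, minified copy; the input object is left untouched."""
    if keep_count_data:
        # Corresponds to the `add_posterior_parameters` mode: counts are kept.
        counts = adata.X
    else:
        # Corresponds to the `latent_posterior_parameters` mode: counts are dropped.
        counts = [[0] * len(row) for row in adata.X]
    obsm = dict(adata.obsm)
    obsm[use_latent_qzm_key] = qzm
    obsm[use_latent_qzv_key] = qzv
    return ToyAnnData(X=counts, obsm=obsm)


original = ToyAnnData(X=[[3, 0], [0, 5]])
dropped = minify(original, qzm=[[0.1], [0.2]], qzv=[[1.0], [1.0]])
kept = minify(original, qzm=[[0.1], [0.2]], qzv=[[1.0], [1.0]], keep_count_data=True)
```

With a single flag, the caller only has to decide whether the counts survive; the latent parameters are stored either way.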

martinkim0 · 2024-05-28T17:58:22Z

scvi/model/base/_base_model.py

+        version of the adata.
+        """
+        # TODO(adamgayoso): Add support for a scenario where we want to cache the latent posterior
+        if not ADATA_MINIFY_TYPE.__contains__(minified_data_type):


What's the reason this part uses `__contains__` instead of `minified_data_type in ADATA_MINIFY_TYPE`? I think the latter is more readable and should work with the NamedTuple.
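
A small sketch of the suggested membership check. `_MinifyTypes`, `MINIFY_TYPES`, and `validate` are hypothetical stand-ins, assuming the real `ADATA_MINIFY_TYPE` is a NamedTuple instance holding strings; the two values are the ones named in the docstring, and the choice of `NotImplementedError` is illustrative only.

```python
from typing import NamedTuple


class _MinifyTypes(NamedTuple):
    """Hypothetical stand-in for the NamedTuple behind ADATA_MINIFY_TYPE."""

    LATENT_POSTERIOR: str = "latent_posterior_parameters"
    ADD_POSTERIOR: str = "add_posterior_parameters"


MINIFY_TYPES = _MinifyTypes()


def validate(minified_data_type: str) -> str:
    # `in` on a tuple dispatches to tuple.__contains__, so this is equivalent
    # to calling MINIFY_TYPES.__contains__(minified_data_type) directly.
    if minified_data_type not in MINIFY_TYPES:
        raise NotImplementedError(
            f"Unsupported minified_data_type: {minified_data_type!r}"
        )
    return minified_data_type
```

Because a NamedTuple instance is an ordinary tuple, the `in` operator compares against the stored values, which is what the explicit `__contains__` call does as well.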

martinkim0 · 2024-05-28T18:00:36Z

scvi/model/base/_base_model.py

@@ -481,6 +490,103 @@ def _check_if_trained(
            else:
                raise RuntimeError(message)

+    def minify_adata(


If the plan is to put this method under BaseModelClass, we should also move over the other methods under BaseMinifiedModeModelClass and then delete it. What do you think?

martinkim0 · 2024-05-28T18:01:29Z

scvi/model/utils/_minification.py

+        all_zeros = csr_matrix(adata.X.shape)
+        layers = {layer: all_zeros.copy() for layer in adata.layers}
+    else:
+        all_zeros = adata.X


Thoughts on checking whether adata.X is a sparse matrix and, if not, sparsifying it (in the spirit of minification)?
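
One way the check could look, as a sketch only. `as_sparse` is a hypothetical helper, not part of the PR; it relies on `scipy.sparse.issparse` and `csr_matrix`, and it assumes the dense input is a 2-D NumPy array. Whether the conversion actually saves memory depends on how many entries are zero.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def as_sparse(x):
    """Return x unchanged if it is already sparse, otherwise convert to CSR."""
    return x if issparse(x) else csr_matrix(x)


# Toy dense count matrix that is mostly zeros.
dense = np.array([[0, 2, 0], [0, 0, 0], [1, 0, 0]])
sparse = as_sparse(dense)
```

Passing an already sparse matrix returns the same object, so the helper can be applied unconditionally in the `else` branch without an extra copy.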

martinkim0 · 2024-05-28T18:02:21Z

scvi/model/utils/_minification.py

+        layers = {layer: all_zeros.copy() for layer in adata.layers}
+    else:
+        all_zeros = adata.X
+        layers = adata.layers


It may also make sense to only copy the count layer registered with the manager so we don't have to carry around the unused layers as part of the minified data.
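
A sketch of that filtering step, using a plain dict in place of `adata.layers`. `select_layers` and its `registered_layer` argument are hypothetical; how the registered layer name is obtained from the manager is not shown in this thread and is left out here.

```python
from typing import Any, Dict, Optional


def select_layers(
    layers: Dict[str, Any], registered_layer: Optional[str]
) -> Dict[str, Any]:
    """Keep only the layer holding the registered counts, if there is one."""
    if registered_layer is None:
        # Assumption: None means the counts live in X, so no layer is needed.
        return {}
    return {registered_layer: layers[registered_layer]}


all_layers = {"counts": [[1, 0]], "normalized": [[0.5, 0.0]]}
kept_layers = select_layers(all_layers, "counts")
```

The unused `normalized` layer is dropped from the minified data, while the original dict is left unchanged.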

canergen added 4 commits December 31, 2023 01:03

Change setup for minified data add condscvi

058fdca

Merge branch 'main' of github.com:scverse/scvi-tools into main

7e0badb

scvi-hub fixes.

a8ee394

Pre-commit

422c313

canergen requested a review from martinkim0 January 5, 2024 20:27

martinkim0 added this to the scvi-tools 1.1.0 milestone Jan 22, 2024

Merge branch 'main' into can_hub_paper

0178d45

martinkim0 removed this from the scvi-tools 1.1.0 milestone Jan 29, 2024

martinkim0 reviewed May 28, 2024

martinkim0 added this to the scvi-tools 1.2.0 milestone May 28, 2024

martinkim0 removed this from the scvi-tools 1.2.0 milestone Jun 21, 2024

martinkim0 added the P0 label Jul 12, 2024
