-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/exe 1989 iterative leiden #188
Conversation
0602bb2
to
1e44d02
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice. I had some suggestions. Just let me know if you have any questions.
communities_graph = nx.from_edgelist(connected_communities) | ||
for cc in nx.connected_components(communities_graph): | ||
community_serie[community_serie.isin(cc)] = min(cc) | ||
return edgelist, community_serie | ||
|
||
|
||
def recover_technical_multiplets( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function could benefit from being broken up into some smaller private functions to help a bit with readability.
61caad4
to
744aa90
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As reviewed live I give you mysuggestions, they are mostly documentation and remarks.
Co-authored-by: Alvaro Martinez Barrio <[email protected]>
Co-authored-by: Alvaro Martinez Barrio <[email protected]>
Co-authored-by: Alvaro Martinez Barrio <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks really good! Just some minor comments
@@ -32,7 +38,7 @@ def connect_components( | |||
sample_name: str, | |||
metrics_file: str, | |||
multiplet_recovery: bool, | |||
leiden_iterations: int = 10, | |||
max_refinement_recursion_depth: int = 5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 5 or 10 shouldn't make a huge difference. Perhaps it's a good idea to stick with this lower threshold of 5 for now and then we can evaluate whether this thresholds needs to be refined later.
…gelist-metrics-light removing components_modularity from edgelist_metrics
Co-authored-by: Alvaro Martinez Barrio <[email protected]>
… as an input to the graph step cli.
Description
To improve identification of potential technical multiplets, we are now implementing an iterative Leiden step where instead of a single leiden run on the entire sample, we run leiden iteratively. During each pass, if the subgraph is broken down into multiple communities the new communtities will be added to a queue to go through community detection again. This process goes on until a the subgraphs are no longer broken down into communities or a maximum depth given by "leiden_iterations" is reached.
In addition, the component names will now be generated using a hash of their pixels' barcodes and the generated components are deterministic. "components_modularity" is also now removed from edgelist_metrics as it is computationally expensive and the information is not frequently used.
Fixes: EXE-1989, EXE-1884, EXE-1979, EXE-2015
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
The existing unit tests pass. The new graph step is run on multiple samples confirming intended detection of potential multiplets.
PR checklist: