gene-level module membership score #11
Comments
If this just means calculating a score (or several) for each gene, given the modules currently found by the implemented methods, then it seems OK to me. If, by contrast, the idea is to modify the algorithms so that they provide those scores themselves, then this is much more difficult and deviates from the purpose of MONET: implementing the best-performing methods of the DREAM challenge. Thus, I endorse the first option and discourage the second. In any case, I believe the best way to obtain membership scores is to use community detection methods that are specifically designed to produce them, which means using completely different algorithms. That is, I'm not sure about the quality of the scores we could add on top of the current K1, M1 and R1.
Thanks for opening this issue. I'm the user who inquired about the feature's availability. I absolutely agree that if the feature doesn't exist in the original algorithms, it would be a mistake to try to shoehorn it in. Because the methods all use the same similarity matrix as input, I'm going to try calculating a post hoc score as the mean of similarities for each gene to other genes in its module (possibly geometric mean; I'll have to look at the distributions). If that works out well, and you decide you want to include it as a feature, I'll be happy to pass along the relevant R code.
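To make the arithmetic-versus-geometric choice concrete, here is a minimal sketch in Python with NumPy (not the R code mentioned above). The function name `within_module_mean` and its arguments are hypothetical, and it assumes a symmetric gene-by-gene similarity matrix plus one module label per gene.

```python
import numpy as np

def within_module_mean(sim, labels, gene, geometric=False):
    """Mean similarity of one gene to the OTHER genes in its module.

    sim    : square gene-by-gene similarity matrix (assumed symmetric)
    labels : module label for each gene, same order as sim
    gene   : row index of the gene to score
    """
    sim = np.asarray(sim, dtype=float)
    labels = np.asarray(labels)
    # Indices of genes sharing the module, excluding the gene itself.
    others = np.flatnonzero(labels == labels[gene])
    others = others[others != gene]
    if others.size == 0:
        return float("nan")  # singleton module: no other genes to average over
    values = sim[gene, others]
    if geometric:
        # Geometric mean via logs; assumes strictly positive similarities.
        return float(np.exp(np.mean(np.log(values))))
    return float(values.mean())
```

The geometric mean is pulled toward the smallest similarities, so a single near-zero value lowers it far more than it lowers the arithmetic mean. That is one reason to look at the distributions before choosing.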
I've implemented a membership calculation for our data. We use topological overlap matrices (TOMs), as in WGCNA, for our similarity matrices. Here's what I wrote in the README: Membership scores are calculated as follows. For each dataset, and for each gene in each cluster, the gene's raw membership is defined as the mean of its TOM values with all genes in its cluster, including itself, where a gene's TOM with itself is always 1. For example, suppose a gene is in a cluster with three other genes with which its TOM values are 0.1, 0.05, and 0.01. Then its raw membership score in the cluster is (1 + 0.1 + 0.05 + 0.01) / 4 = 0.29. One implication is that memberships in singleton clusters, i.e. clusters containing only one gene, are always 1. Because TOM values are often very low, membership values can be quite low too. Thus the final membership score is calculated by dividing the raw memberships by their maximum value across the entire dataset, excluding singletons. (Singleton values are then set back to 1.) This helps bring the scores in line with those seen in other cluster membership measures such as kME in WGCNA or posterior probabilities in mixture models. If this sounds like something you'd like to include as an option in MONET, let me know and I'll post the code. :)
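The calculation described in the README excerpt above can be sketched as follows. This is an illustrative Python/NumPy version written from the prose description, not the author's R code; the function name `membership_scores` and the toy matrix are hypothetical. It assumes a symmetric TOM and one hard cluster label per gene.

```python
import numpy as np

def membership_scores(tom, labels):
    """Return (raw, scaled) membership scores, one per gene.

    raw    : mean TOM of each gene with all genes in its cluster, itself included
    scaled : raw divided by the largest non-singleton raw value; singletons set to 1
    """
    tom = np.asarray(tom, dtype=float)
    labels = np.asarray(labels)
    n = tom.shape[0]
    raw = np.empty(n)
    singleton = np.zeros(n, dtype=bool)
    for cluster in np.unique(labels):
        idx = np.flatnonzero(labels == cluster)
        block = tom[np.ix_(idx, idx)].copy()
        np.fill_diagonal(block, 1.0)  # a gene's TOM with itself is taken to be 1
        raw[idx] = block.mean(axis=1)
        singleton[idx] = idx.size == 1
    scaled = raw.copy()
    if (~singleton).any():
        # Rescale by the maximum over the whole dataset, excluding singletons.
        scaled[~singleton] = raw[~singleton] / raw[~singleton].max()
    scaled[singleton] = 1.0  # singletons are set back to 1
    return raw, scaled

# Toy data: gene 0 reproduces the worked example (TOMs 0.1, 0.05, 0.01);
# gene 4 is a singleton cluster.
tom = np.array([
    [1.0, 0.1, 0.05, 0.01, 0.0],
    [0.1, 1.0, 0.2, 0.02, 0.0],
    [0.05, 0.2, 1.0, 0.03, 0.0],
    [0.01, 0.02, 0.03, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])
labels = ["A", "A", "A", "A", "B"]
raw, scaled = membership_scores(tom, labels)
```

In this toy example gene 0 gets the raw score 0.29 from the worked example. Gene 1 has the largest non-singleton raw score (0.33), so its scaled score is 1.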
Thank you, Daniel!
Currently, the methods produce "hard" clustering results, i.e. binary assignment of genes to modules.
POTENTIAL FEATURE: optionally output some kind of module membership score (like the correlation-based kME score in WGCNA or the posterior probability of a mixture model)