Running compass with data from multiple samples #18
Comments
Hi, thanks for your interest in COMPASS!
Hi, and thanks for the quick reply! The problem I'm experiencing is this: I have two (at least almost) clonal mutations, which exist in all cancer cells (FREQ set to 0), roughly 1,000 cells. I also have roughly 1,600 normal cells, and I throw in ~60 germline variants (FREQ=1) to support the CNVs. After some manipulation of the parameters, the germline events end up at the root with very few cells, followed by a node with the two mutations and very few cells, and below that a branch with a CNLOH where those mutations are lost. That branch is where the big lump of cells ends up, including the normals. I understand why the algorithm finds this appealing, but it is just not right :). There is also another CNLOH that I want captured, which it seems to handle fine. Since I have done my fair share of C++ coding, I think I can modify the algorithm by passing in information about both the cells and the events, so that germline and somatic variants are handled separately (by penalizing them differently) and cells are penalized differently as well. I think this can just be added to Tree::compute_prior_score; would that make sense? Thanks for the help - I'll create a fork and see what I can do!
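To make the idea of class-dependent penalties concrete, here is a toy Python sketch. It is not COMPASS code and does not reflect the actual signature or contents of Tree::compute_prior_score; the function name `loss_penalty`, the weights, and the choice to penalize the loss of somatic variants more heavily than the loss of germline ones are all assumptions made purely for illustration.

```python
# Hypothetical log-scale penalties per variant class (illustrative values only).
PENALTY = {"germline": 1.0, "somatic": 5.0}

def loss_penalty(lost_variants, variant_class):
    """Return a (negative) log-prior contribution for a node that loses variants.

    lost_variants: names of variants lost at this node (e.g. through a CNLOH).
    variant_class: dict mapping each variant name to "germline" or "somatic".
    """
    return -sum(PENALTY[variant_class[v]] for v in lost_variants)

# Example: a CNLOH node that loses one somatic and one germline variant.
classes = {"mut1": "somatic", "snp1": "germline"}
score = loss_penalty(["mut1", "snp1"], classes)
```

With weights like these, a tree that explains the normal cells by losing the clonal somatic mutations pays more than one that loses only germline alleles. A per-cell weighting (for example, for cells known to be normal) would need a separate term in the likelihood rather than the prior.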
Hi,
I'm experimenting with running COMPASS (using CNVs) on multiple samples (3 samples), where the cells come from different MissionBio runs. One problem is that the samples were sequenced at quite different depths, so the read counts per region vary a lot between samples (and hence between cells). I looked at the code and couldn't see that you normalize them per cell or anything similar (correct me if I'm wrong). Should I normalize the data somehow before passing it in? One obvious option would be to normalize the counts across samples, and potentially across amplicons if that turns out to be a problem. Would you recommend doing something like that?
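As an illustration of the kind of normalization meant here, below is a minimal pure-Python sketch. It assumes a per-cell list of amplicon read counts and a sample label per cell; the function name `normalize_depth` and the two-step scheme (per-cell library-size scaling, then division by the per-sample median for each amplicon) are my own assumptions, and whether COMPASS expects or accepts such values as input is not established here.

```python
from collections import defaultdict
from statistics import median

def normalize_depth(counts, samples):
    """Normalize amplicon read counts for depth differences between cells and samples.

    counts:  list of per-cell lists of read counts (one value per amplicon).
    samples: list of sample labels, one per cell.
    Returns per-cell lists of relative depths (about 1.0 means "typical for this sample").
    """
    # Step 1: scale each cell by its own total reads (per-cell depth).
    fractions = []
    for row in counts:
        total = sum(row)
        fractions.append([c / total if total else 0.0 for c in row])

    # Step 2: group cells by sample and take the per-amplicon median fraction.
    by_sample = defaultdict(list)
    for row, label in zip(fractions, samples):
        by_sample[label].append(row)
    medians = {
        label: [median(col) for col in zip(*rows)]
        for label, rows in by_sample.items()
    }

    # Step 3: divide each cell's fractions by its sample's amplicon medians.
    return [
        [f / m if m else 0.0 for f, m in zip(row, medians[label])]
        for row, label in zip(fractions, samples)
    ]

# Two samples with identical profiles, but sample "B" sequenced 10x deeper.
counts = [[10, 30], [20, 20], [100, 300], [200, 200]]
samples = ["A", "A", "B", "B"]
normalized = normalize_depth(counts, samples)
```

One caveat with this scheme: if most cells in a sample carry the same CNV, dividing by the per-sample median would partly remove that signal. Taking the medians only over cells known to be normal would likely be a safer reference.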
Another question: from the MissionBio protein data, I know which cells are normal and which are likely malignant. Can I supply that information somehow? I also know which variants are somatic and which are germline. Is it enough to set FREQ to 0 for the somatic variants and 1 for the germline ones? The germline variants are there to support the CNVs.