Error in hclust(parallelDist::parDist(t(CNA_mtx), threads = par_cores, : size cannot be NA nor exceed 65536 #69
Comments
Hi @Ilarius,
It's ovarian cancer: the first sample has 71634 initial cells and the second 73644. That's because I only load a matrix of cells with at least 200 features; otherwise I would have to allocate a 1 TB vector in R! In the end I used the final filtered matrix (roughly 10k cells each) and the same code worked. The cells that I thought were more likely to be tumoral (given some markers) are enriched among the cells called "tumoral" by your algorithm. However, a significant proportion of blood cells (which are a minority of the cells in the experiment and should not be aneuploid) is also detected as tumoral, which makes the results less reliable. Do you think using the filtered matrix could have caused this problem? How important is it to start with the unfiltered matrix?
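For context on the error in the title: base R's `hclust()` rejects inputs with more than 65536 observations, which is why the two samples with about 70k cells fail while the filtered matrices of about 10k cells run. A minimal sketch of a pre-check, using the cell counts quoted above (the variable names are illustrative and not part of SCEVAN):

```r
# hclust() stops with "size cannot be NA nor exceed 65536" above this size.
max_hclust_size <- 65536L

# Cell counts taken from the comment above; "filtered" is the approximate 10k figure.
n_cells <- c(sample1 = 71634L, sample2 = 73644L, filtered = 10000L)

# TRUE means the sample is too large to cluster in a single hclust() call.
exceeds_limit <- n_cells > max_hclust_size
print(exceeds_limit)
```

Running a check like this on `ncol()` of the count matrix before starting the pipeline would flag oversized inputs early.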
I believe that using the filtered matrix is the correct procedure. To check for incorrectly classified cells, you can view the heatmap to see whether the separation was done correctly. Some errors can be caused by cells with a noisier signal. You can improve the final result by passing SCEVAN more cells that you are confident are normal via the `norm_cells` parameter. Regards
I did not use `norm_cells` because the documentation says: "norm_cells: Vector of normal cells if the classification is already known and you are only interested in the clonal structure (optional)". Since it is a solid tumor, I know that the tumoral cells are in the epithelial cluster and not in the blood cell clusters. PS: Is the code that you use for the heatmaps and other visualizations shown in this vignette available somewhere?
If there are cells in the count matrix that you are confident are normal, you can pass them as the `norm_cells` parameter; they will be used to create a reference and identify all diploid cells. All the code is public; you can find it in this GitHub repository.
@Ilarius just a random idea while reading through this: what about using your cell annotations (blood cells) as a source of "normal" cells? Maybe set a seed and randomly draw 2-3k cells? You mention that these cells most likely should not be cancerous. Going even further, perhaps select only blood cells in a certain cell cycle phase and/or with low expression of genes particular to the cancer type you are looking at?
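The subsampling idea above could be sketched roughly as follows. The `annotations` table, its column names, and the simulated cell types are hypothetical stand-ins for whatever annotation the user already has; only base R functions are used.

```r
# Hypothetical annotation table: one row per cell with a cell-type label.
set.seed(42)  # fixed seed so the draw is reproducible
annotations <- data.frame(
  cell = paste0("cell_", seq_len(10000)),
  cell_type = sample(c("epithelial", "blood", "stromal"), 10000,
                     replace = TRUE, prob = c(0.6, 0.3, 0.1)),
  stringsAsFactors = FALSE
)

# Barcodes of cells annotated as blood (assumed here to be non-malignant).
blood_cells <- annotations$cell[annotations$cell_type == "blood"]

# Draw up to 2500 of them without replacement.
n_draw <- min(2500L, length(blood_cells))
norm_cells <- sample(blood_cells, n_draw)
```

The resulting character vector could then be supplied as the normal-cell argument discussed above; the exact argument name should be checked against the SCEVAN documentation.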
@AntonioDeFalco Hi, it doesn't look like the function `multiSampleComparisonClonalCN` has the option to pass
Hello, if I try to run this in parallel on a cluster with Slurm, I get a *** caught bus error ***, even if I give it enough memory.
I tried with just one core, but I get the following error:
Any clues?