I am currently experiencing an issue using PolyOrigin to impute a panel of F1 offspring from an incomplete diallel of 18 tetraploid potato breeding clones.
The panel consists of 768 F1 clones that have been genotyped by sequencing. I have filtered the data based on quality and read depth and have a dataset of roughly 105,000 biallelic SNPs with 17 % missing data that I would like to impute. I have the phased genotypes of the 18 parents for all of the positions and use these as input for the imputation. I have tried using both the impute_LA function of the polyBreedR R-wrapper for PolyOrigin in a Windows setup (64 GB) and PolyOrigin itself in a Linux environment (75 threads, 375 GB). However, when I run the program in either setup (I run each chromosome separately), I hit a stack overflow error at some point. The memory monitor indicated that memory usage did not exceed 126 GB during the run, so the run does not seem to be limited by memory.
I have tested different data reductions. I can run the imputation for single families (there are 119 in total, with 1-14 offspring in each) for all SNPs in all chromosomes, and that runs successfully in my Windows setup using the polyBreedR wrapper. I can also run the program across all families, but only if I reduce the set to 2,000 SNPs.
However, due to the complex family structure of the 18-parent diallel, performing the imputation for subsets of full-sibs does not capture the population structure between half-sibs successfully (the similarity of full-sibs compared to half-sibs seems overestimated).
As linkage groups are disrupted by reducing the marker density beyond single chromosomes (that is, around 9,000 SNPs) and the pedigree is disrupted by using full-sibs alone, I cannot reduce my data further.
Can anything be done to overcome the stack overflow in PolyOrigin when using data with high dimensionality (population size, number of parents, and SNP density all being large)?
I have also raised this issue on the PolyOrigin GitHub (#13).
Sincerely, Trine
Trine, thank you for raising this issue. I checked with the developer of PolyOrigin, Dr. Chaozhi Zheng, and he is working to address it with a future release.