Question: Imputation of scores #7
Comments
Hi Markus, Cheers, robert.
Hi Robert, thanks for getting back to me, and sorry for my late response. Now it makes sense; I thought I had missed something in the documentation about generating these data structures. Since these scores depend on many things, they would be unique to each user and their normal samples. My question was essentially this: I now have a custom GRanges with scores. Only a (small) fraction of the genome has scores associated, but I'd like to impute the scores for all requested ranges. Do you think GenomicScores is the right tool for this? It looks like not (yet?), right? Markus
Hi, GenomicScores currently has nothing like that, but I guess it would not be that difficult to implement this feature and enable it with additional arguments to the call to 'gscores()' or 'score()', e.g., impute.method=c("none", "min", "max", "mean"), impute.distance=0L, so that every NA value could be imputed using one of the methods applied to the values observed within a physical distance expressed in bp. Is this what you are looking for?
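To make the proposal above concrete, here is a minimal sketch of distance-based imputation in plain Python. It is not GenomicScores code and does not reflect any existing API: the function name `impute_within_distance` and its arguments are hypothetical, chosen only to mirror the suggested `impute.method` and `impute.distance` arguments. It assumes scored positions on a single chromosome, sorted in ascending order.

```python
from bisect import bisect_left, bisect_right
from statistics import mean


def impute_within_distance(positions, scores, query, method="mean", distance=0):
    """Impute a score at `query` from scores observed within `distance` bp.

    positions: ascending scored positions on one chromosome (assumption).
    scores:    one score per entry in `positions`.
    Returns None (standing in for NA) when method is "none" or when no
    scored position lies inside the window.
    """
    if method == "none":
        return None
    reducers = {"min": min, "max": max, "mean": mean}
    if method not in reducers:
        raise ValueError(f"unknown method: {method!r}")
    # Slice of scored positions inside [query - distance, query + distance].
    lo = bisect_left(positions, query - distance)
    hi = bisect_right(positions, query + distance)
    window = scores[lo:hi]
    if not window:
        return None
    return reducers[method](window)


# Toy data, invented for illustration only.
positions = [100, 200, 300, 1000]
scores = [0.1, 0.3, 0.2, 0.9]
print(impute_within_distance(positions, scores, 250, "mean", distance=60))
print(impute_within_distance(positions, scores, 600, "mean", distance=50))
```

With `distance=0`, only a score observed exactly at the queried position is returned, which matches the idea of imputation being switched off by default.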
Hi Robert, I'm currently benchmarking the best ways of imputing the scores and will get back to you. But that sounds perfect. Markus
Hi,
I'm looking for a solution to a fairly straightforward problem: I have scores for all heterozygous SNPs in a pool of normals describing how the allelic fraction (not population allele frequency) deviates from the expected 0.5. There is also an error associated with each available position, based on total coverage and the number of samples with this SNP.
I currently have an ad hoc way of imputing scores for variants not in the pool of normals by averaging the scores of the n nearest neighbors, but a weighted running median would be better.
Sorry for the basic question, but is this something I can use GenomicScores for, or maybe make it work, maybe by including some fake data points?
Thanks in advance,
Markus
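The two approaches mentioned in the post above, averaging the n nearest neighbors and a weighted median over them, can be sketched as follows in plain Python. This is only an illustration under stated assumptions, not the poster's actual method: the function names are hypothetical, positions are assumed to lie on one chromosome, and the weights are assumed to be supplied by the caller (for example derived from the per-position error, which the post does not specify).

```python
def weighted_median(values, weights):
    """Lower weighted median: the smallest value at which the cumulative
    weight reaches half of the total weight."""
    if not values:
        raise ValueError("values must be non-empty")
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value
    return max(values)  # only reached through floating-point rounding


def impute_from_neighbors(positions, scores, weights, query, n=3,
                          method="weighted_median"):
    """Impute a score at `query` from the n scored positions closest to it.

    weights: one non-negative weight per position (assumption: larger
             means more trustworthy, e.g. derived from the error).
    """
    # Indices of the n nearest scored positions; ties keep input order.
    nearest = sorted(range(len(positions)),
                     key=lambda i: abs(positions[i] - query))[:n]
    values = [scores[i] for i in nearest]
    if method == "mean":
        return sum(values) / len(values)
    return weighted_median(values, [weights[i] for i in nearest])


# Toy data, invented for illustration only.
positions = [100, 200, 300, 1000]
scores = [0.1, 0.3, 0.2, 0.9]
weights = [5.0, 1.0, 1.0, 1.0]
print(impute_from_neighbors(positions, scores, weights, 250, n=3, method="mean"))
print(impute_from_neighbors(positions, scores, weights, 250, n=3))
```

Calling the function for each requested position in turn gives a running version of either statistic. The heavily weighted neighbor dominates the weighted median, whereas the plain mean treats all neighbors equally.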