Question: Imputation of scores #7

lima1 · 2019-06-12T19:28:10Z

Hi,

I'm looking for a solution to a fairly straightforward problem: I have scores for all heterozygous SNPs in a pool of normals describing how the allelic fraction (not the population allele frequency) deviates from the expected 0.5. Each available position also has an associated error, based on the total coverage and the number of samples carrying this SNP.

I currently have an ad hoc way of imputing a score for variants not in the pool of normals by averaging the scores of the n nearest neighbors, but a weighted running median would be better.
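To make the two approaches concrete, here is a minimal sketch in plain Python, not GenomicScores code. It assumes positions on a single chromosome sorted in increasing order, strictly positive per-position errors, and inverse-variance weights (1 / error^2); the weighting scheme and all function names are illustrative assumptions, not something taken from an existing package.

```python
from bisect import bisect_left


def nearest_neighbors(positions, query, n):
    """Indices of the n scored positions closest to `query`.

    `positions` must be sorted in increasing order. Ties are resolved
    in favor of the left-hand neighbor.
    """
    i = bisect_left(positions, query)
    lo, hi = i - 1, i
    picked = []
    while len(picked) < n and (lo >= 0 or hi < len(positions)):
        left_d = query - positions[lo] if lo >= 0 else float("inf")
        right_d = positions[hi] - query if hi < len(positions) else float("inf")
        if left_d <= right_d:
            picked.append(lo)
            lo -= 1
        else:
            picked.append(hi)
            hi += 1
    return picked


def weighted_median(values, weights):
    """Lower weighted median: the smallest value whose cumulative
    weight reaches half of the total weight."""
    if not values:
        raise ValueError("no values to summarize")
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value
    return max(values)  # guard against floating-point round-off


def impute_mean(positions, scores, query, n):
    """Ad hoc approach: plain average of the n nearest scores."""
    idx = nearest_neighbors(positions, query, n)
    return sum(scores[i] for i in idx) / len(idx)


def impute_weighted_median(positions, scores, errors, query, n):
    """Weighted median of the n nearest scores.

    Assumption: weights are 1 / error^2, so noisy positions count less.
    """
    idx = nearest_neighbors(positions, query, n)
    weights = [1.0 / errors[i] ** 2 for i in idx]
    return weighted_median([scores[i] for i in idx], weights)


# Toy data: the score at position 300 is an outlier with a large error.
positions = [100, 200, 300, 1000]
scores = [0.1, 0.2, 0.9, 0.3]
errors = [0.1, 0.1, 1.0, 0.1]

mean_estimate = impute_mean(positions, scores, 250, 3)
median_estimate = impute_weighted_median(positions, scores, errors, 250, 3)
```

On this toy data the plain mean is pulled towards the noisy score at position 300, while the weighted median stays with the two well-supported neighbors.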

Sorry for the basic question, but is this something I can use GenomicScores for, or could I make it work, perhaps by including some fake data points?

Thanks in advance,
Markus

rcastelo · 2019-06-22T12:19:27Z

Hi Markus,
If I understand you correctly, we could incorporate the scores you have as an AnnotationHub resource available via 'getGScores()'. This is a manual process that requires parsing the files and making them available in the proper format, but once they are in place you can query those scores in a uniform way with the functions 'gscores()' and 'score()'. Is this what you were asking for?

Cheers,

robert.

lima1 · 2019-07-09T14:49:11Z

Hi Robert,

thanks for getting back to me and sorry for my late response.

Now it makes sense; I thought I had missed something in the documentation about generating these data structures. Since these scores depend on many things, they would be unique to each user and their normal samples.

My question was: essentially now I have a custom GRanges with scores. Only a (small) fraction of the genome has scores associated, but I'd like to impute the scores for all requested ranges. Do you think GenomicScores is the right tool for this? Looks like not (yet?), right?

Markus

rcastelo · 2019-07-22T10:37:30Z

Hi,

GenomicScores currently has nothing like that, but I guess it would not be that difficult to implement this feature and enable it with additional arguments to the call to 'gscores()' or 'score()', e.g., impute.method=c("none", "min", "max", "mean") and impute.distance=0L. Every NA value could then be imputed by applying the chosen method to the values observed within the given physical distance, expressed in bp, of the queried position. Is this what you are looking for?
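The behavior proposed above can be sketched outside of R to pin down the semantics. The following plain-Python illustration is not an existing GenomicScores feature: the function `impute_scores` is hypothetical, its arguments `impute_method` and `impute_distance` only mirror the names suggested in this comment, and `None` stands in for NA. Positions are assumed to lie on one chromosome and to be sorted in increasing order.

```python
from bisect import bisect_left, bisect_right
from statistics import mean

# Summary functions selectable through the hypothetical impute_method argument.
IMPUTE_METHODS = {"min": min, "max": max, "mean": mean}


def impute_scores(positions, scores, queries,
                  impute_method="none", impute_distance=0):
    """Look up a score for each query position.

    Positions with an observed score return it unchanged. Missing
    positions return None (standing in for NA) unless impute_method is
    "min", "max" or "mean", in which case that summary is applied to all
    scores observed within impute_distance bp of the query.
    """
    if impute_method != "none" and impute_method not in IMPUTE_METHODS:
        raise ValueError(f"unknown impute_method: {impute_method!r}")
    known = dict(zip(positions, scores))
    result = []
    for query in queries:
        if query in known:
            result.append(known[query])
            continue
        if impute_method == "none":
            result.append(None)
            continue
        # Slice of the sorted positions inside [query - d, query + d].
        lo = bisect_left(positions, query - impute_distance)
        hi = bisect_right(positions, query + impute_distance)
        window = scores[lo:hi]
        summary = IMPUTE_METHODS[impute_method]
        result.append(summary(window) if window else None)
    return result


positions = [100, 200, 300, 1000]
scores = [0.1, 0.2, 0.9, 0.3]
queries = [200, 250, 600]

not_imputed = impute_scores(positions, scores, queries)
by_mean = impute_scores(positions, scores, queries, "mean", 60)
by_max = impute_scores(positions, scores, queries, "max", 60)
```

Here position 200 keeps its observed score, position 250 is imputed from the scores at 200 and 300, and position 600 stays missing because nothing was observed within 60 bp of it.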

lima1 · 2019-09-14T15:23:34Z

Hi Robert,

I'm currently benchmarking the best ways of imputing the scores and will get back to you. But that sounds perfect.

Markus
