3' Shifting #551
Comments
Hi, we implement the following algorithm based on the genome sequence and not the transcript sequence. https://genome.sph.umich.edu/wiki/Variant_Normalization I'd suggest writing down the genome sequence flanking left and right of this (e.g., via Does this help? |
Hi @holtgrewe, I'm looking into this issue with @lshimp. Thanks for the response. We are familiar with the Please note that we are talking about this functionality as described in the Jannovar docs
More precisely, we found that this variant returns the following output from Jannovar 0.36 using the REST server:
[{
"transcriptId": "NM_000059.4",
"variantEffects": ["splice_donor_variant", "coding_transcript_intron_variant"],
"isCoding": true,
"hgvsProtein": "p.?",
"hgvsNucleotides": "c.316+1del"
}]
Notice how Jannovar is 3' shifting all the way into the intron. However, we found that this other, very similar variant returns the following output instead:
[{
"transcriptId": "NM_000051.4",
"variantEffects": ["frameshift_truncation", "splice_region_variant"],
"isCoding": true,
"hgvsProtein": "p.(C536Ffs*7)",
"hgvsNucleotides": "c.1607del"
}, {
"transcriptId": "NM_001351834.2",
"variantEffects": ["frameshift_truncation", "splice_region_variant"],
"isCoding": true,
"hgvsProtein": "p.(C536Ffs*7)",
"hgvsNucleotides": "c.1607del"
}]
As you can see, in this case Jannovar opted not to 3' shift fully into the intron. So our question is: why does the 3' shifting feature not go into the intronic sequence for the second variant? |
Hi, thank you for the in-depth look. I'll put this on my agenda for the next few days. |
Much appreciated. I didn't say so before, but we opted to build the database ourselves instead of downloading the ones provided. Nothing crazy, we are using all the RefSeq files you would expect. We just wanted to pin it to a specific build version to match the rest of our environment. Sharing in case that could make a difference when trying to reproduce.
; HG19 from RefSeq
[hg19/refseq]
type=refseq
alias=MT,M,chrM
allowNonCodingNm=true
preferPARTranscriptsOnChrX=true
chromInfo=http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/chromInfo.txt.gz
chrToAccessions=https://ftp.ncbi.nlm.nih.gov/genomes/archive/old_refseq/Homo_sapiens/ARCHIVE/ANNOTATION_RELEASE.105/Assembled_chromosomes/chr_accessions_GRCh37.p13
gff=https://ftp.ncbi.nlm.nih.gov/genomes/refseq/vertebrate_mammalian/Homo_sapiens/annotation_releases/105.20201022/GCF_000001405.25_GRCh37.p13/GCF_000001405.25_GRCh37.p13_genomic.gff.gz
rna=https://ftp.ncbi.nlm.nih.gov/genomes/refseq/vertebrate_mammalian/Homo_sapiens/annotation_releases/105.20201022/GCF_000001405.25_GRCh37.p13/GCF_000001405.25_GRCh37.p13_rna.fna.gz
faMT=https://www.ncbi.nlm.nih.gov/sviewer/viewer.cgi?save=file&db=nuccore&report=fasta&id=251831106 |
I had to adjust the download paths for gff and rna.
Also, I had to fix the downloaded chromInfo.txt.gz and remove the rCRS sequence entry chrMT from it (UCSC added the rCRS to hg19 at some point). Jannovar Here is how to do it on the command line with and without 3' shifting.
So it looks like 3' shifting is not properly hooked up, adding #552/#553 to fix this. Now we get proper output when disabling shifting.
The next thing is to bow to the hypnotoad^H variantvalidator.org. It is the most reliable tool and the Berlin genetics department's SOP for exome analysis actually says that variants identified should get their final description from VariantValidator.
So this confirms your bug report. Let's dive a bit deeper. The relevant code is here.
// Shift the GenomeChange if lies within precisely one exon.
if (so.liesInExon(change.getGenomeInterval())) {
try {
// normalize amino acid change and add information about this into {@link messages}
this.change = GenomeVariantNormalizer.normalizeGenomeChange(transcript, change,
projector.genomeToTranscriptPos(change.getGenomePos()));
if (!change.equals(this.change))
messages.add(AnnotationMessage.INFO_REALIGN_3_PRIME);
} catch (ProjectionException e) {
throw new Error("Bug: change begin position must be on transcript.");
}
} else {
this.change = change;
}
It might not be obvious from these lines, but the code 3'-shifts the variant based on the transcript sequence. Jannovar does not have access to the genome sequence itself, so our assumption has always been that users perform genome-wise vt-like 3' shifting and the code above from In other words, the code in For my use cases, this has never been a problem as I run all my VCFs through What do you think? If you agree with my reasoning, my next step would be to create a GitHub issue for adding an (optional) argument to provide a FAI-indexed FASTA file to |
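To make the exon-versus-genome difference discussed above concrete, here is a minimal Java sketch of 3' shifting a deletion on a plain string. It is not Jannovar's implementation: the class `ThreePrimeShiftSketch`, the method `shiftDeletion`, and the toy sequence are all hypothetical names and data chosen for illustration. The only assumption is the usual normalization rule that a deletion can move one base to the right whenever its first deleted base equals the base directly after the deleted interval.

```java
public final class ThreePrimeShiftSketch {

    /**
     * Shift a deletion of {@code length} bases starting at 0-based {@code start}
     * as far 3' as possible without the deleted interval passing {@code limit}
     * (exclusive). Returns the new 0-based start.
     */
    public static int shiftDeletion(String seq, int start, int length, int limit) {
        int pos = start;
        // Deleting [pos, pos+length) yields the same sequence as deleting
        // [pos+1, pos+length+1) exactly when seq[pos] == seq[pos+length].
        while (pos + length < limit && seq.charAt(pos) == seq.charAt(pos + length)) {
            pos++;
        }
        return pos;
    }

    public static void main(String[] args) {
        // Hypothetical toy locus: exon "ACTG" (indices 0-3) followed by intron "GTAAGT".
        String genome = "ACTG" + "GTAAGT";
        int exonEnd = 4;   // exclusive end of the exon
        int delStart = 3;  // delete the last exon base, a 'G'

        // Only the exon sequence is visible: the deletion cannot move.
        System.out.println(shiftDeletion(genome, delStart, 1, exonEnd));          // 3
        // The flanking genome is visible: the deletion moves to the first intron base.
        System.out.println(shiftDeletion(genome, delStart, 1, genome.length())); // 4
    }
}
```

With the limit set to the exon end the deletion stays on the last exon base; with the genomic flank available it lands on the first intron base, which would be described as a `+1` position. This is the behaviour the reporters expected for both variants.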
I might be getting something wrong, but I don't think I had mentioned to our team a reason why Jannovar couldn't do the 3' shifting outside of the exon sequence could be because of the lack of full reference sequence. I agree I don't see any other option but to make that available. We do wonder how is Jannovar able to fully 3' shift |
You don't get anything wrong, I do. Let me draft up a couple of tests and then a fix to the code so we can decide based on actual examples. Thanks! |
Hi, @holtgrewe, I work on @CarlosBorroto's team and had some follow up information to share as well as a question. Before contacting you, we were trying to investigate any differences between these two variants to determine why one variant is 3' shifted into the intron while the other variant is not. We noticed a difference in the first nucleotide of the next coding exon. I am unsure if this is related, but I wanted to share in case it was helpful. Variant Variant We found two other variants that follow this pattern, though this finding can be completely coincidental. However, it did remind us of an HGVS rule that I now wonder whether Jannovar implements. The rule is:
https://varnomen.hgvs.org/recommendations/DNA/variant/deletion/ Does Jannovar implement the 3' shifting exception to prevent a variant from shifting into the next exon? I found a variant that is pushed into the downstream exon, but if I am understanding the HGVS exception correctly, this should not occur. This variant is left aligned to The result of
From the Jannovar cdot output, you can see this variant is annotated in the codon of the next exon: If I am understanding correctly, then I don't expect the variant to be pushed into the downstream exon. If you agree, I'm happy to open an independent issue, but wanted to start the conversation here in case these events are actually related. |
@dmb107 thank you for reporting. I think your understanding is right. Sadly, Jannovar does not implement the rule yet. I think this is a good place to collect the 3' shifting related issues. |
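For reference, the HGVS exception mentioned above says that, for `c.` descriptions, the 3' rule is not applied across an exon/exon junction when identical nucleotides flank the junction. The following Java sketch shows one way such a clamp could look when shifting on a spliced transcript sequence. It is only an illustration under stated assumptions, not Jannovar code: `ExonJunctionClampSketch`, `shiftWithinExon`, the `exonEnds` array (exclusive exon ends in transcript coordinates, ascending), and the toy sequences are hypothetical.

```java
public final class ExonJunctionClampSketch {

    /**
     * 3' shift a deletion on a spliced transcript, but stop at the end of the
     * exon in which the deletion starts. {@code exonEnds} holds the exclusive
     * end of each exon in transcript coordinates, in ascending order.
     */
    public static int shiftWithinExon(String transcript, int[] exonEnds, int start, int length) {
        // Find the end of the exon that contains the deletion start.
        int limit = transcript.length();
        for (int end : exonEnds) {
            if (start < end) {
                limit = end;
                break;
            }
        }
        int pos = start;
        // Same shifting rule as plain normalization, bounded by the exon end.
        while (pos + length < limit
                && transcript.charAt(pos) == transcript.charAt(pos + length)) {
            pos++;
        }
        return pos;
    }

    public static void main(String[] args) {
        // Hypothetical transcript: exon 1 "ACTG" + exon 2 "GCAT"; a 'G' flanks both sides of the junction.
        String transcript = "ACTG" + "GCAT";
        int[] exonEnds = {4, 8};

        // Deleting the last base of exon 1 stays in exon 1 instead of moving into exon 2.
        System.out.println(shiftWithinExon(transcript, exonEnds, 3, 1)); // 3
    }
}
```

Without the clamp, shifting on the spliced sequence would move this deletion to index 4, the first base of the next exon, which matches the behaviour reported in the comment above.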
Thank you for the quick responses! We very much appreciate it. |
Jannovar applied 3' shifting and classified the following variants as splice_donor_variant.
However, VariantValidator classified them as splice_region_variant according to the HGVS c. description (no +1 or +2). I believe this is related to the 3' shifting problem as well. |
Hello,
I’m sorry if this is not the right place, but I have a question regarding Jannovar and 3’ shifting implementation. We have two variants that are very similar to each other but Jannovar 3’ shifts one and not the other. The variants in question:
             Variant 1    Variant 2
Assembly:    GRCh37       GRCh37
Chromosome:  13           11
Start:       32893461     108121798
Stop:        32893462     108121799
Ref:         AG           TG
Alt:         A            T
Both variants delete a ‘G’ which is the last base of the exon. The first base of the intron is also a ‘G’, so we are expecting both variants to be 3’ shifted into the intron.
The first variant, Jannovar annotates as shifting into the intronic region with the following variant effects:
{ "splice_donor_variant", "coding_transcript_intron_variant"}
The second variant, Jannovar annotates as remaining in the exon with the following variant effects:
{ "frameshift_truncation", "splice_region_variant" }
Both variants are at the very end of the exon, but Jannovar only annotates one as 3’ shifting.
We are wondering how Jannovar is annotating these variants and why only one is 3’ shifted?
Thanks!