Synthesize exons when they are not included in the input GFF3 #491
Comments
This should be fixed in PR #492.
The example in the issue looks like:
The code may be inefficient, but the slowdown doesn't seem noticeable. These are two replicates loading a small GFF with 283 mRNAs:
Current:
Done in #492
Sometimes we see a GFF3 that does not explicitly state where the exons are, e.g.
In this case we need to synthesize the exons for our internal representations.
We can use the five_prime_UTR, three_prime_UTR, and CDS lines to figure out where the exons are. If a UTR and a CDS are adjacent, they should be combined into a single exon. Otherwise, each unique CDS location should get an exon with the same location.
This needs to be handled in packages/apollo-shared/src/GFF3/gff3ToAnnotationFeature.ts. After processedCDS are determined in that file, we'll probably want to check whether there are any exons and, if not, synthesize them at that point.
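The merging rule and the "only if no exons exist" check described above could be sketched roughly as follows. This is not the code from gff3ToAnnotationFeature.ts: the SimpleFeature shape and the hasExons, synthesizeExons, and ensureExons names are hypothetical, coordinates are assumed to be 1-based and inclusive as in GFF3, and treating a UTR that touches no CDS as its own exon is an assumption the issue does not spell out.

```typescript
// Hypothetical minimal feature shape; not the actual apollo-shared types.
interface SimpleFeature {
  type: string
  start: number // 1-based, inclusive (GFF3 convention, assumed here)
  end: number // 1-based, inclusive
}

// Feature types whose locations are used to infer exons.
const EXON_SOURCE_TYPES = new Set(['five_prime_UTR', 'three_prime_UTR', 'CDS'])

// True when the transcript already has explicit exon children.
function hasExons(children: SimpleFeature[]): boolean {
  return children.some((child) => child.type === 'exon')
}

// Build exons by merging UTR and CDS intervals that are adjacent or overlap.
// Duplicate CDS locations collapse into one exon because they overlap.
function synthesizeExons(children: SimpleFeature[]): SimpleFeature[] {
  const parts = children
    .filter((child) => EXON_SOURCE_TYPES.has(child.type))
    .sort((a, b) => a.start - b.start || a.end - b.end)
  const exons: SimpleFeature[] = []
  for (const part of parts) {
    const last = exons[exons.length - 1]
    if (last !== undefined && part.start <= last.end + 1) {
      // Adjacent (end + 1 === start) or overlapping: extend the current exon.
      last.end = Math.max(last.end, part.end)
    } else {
      // Gap before this part: start a new exon at the same location.
      exons.push({ type: 'exon', start: part.start, end: part.end })
    }
  }
  return exons
}

// Leave children untouched if exons are present; otherwise append synthesized ones.
function ensureExons(children: SimpleFeature[]): SimpleFeature[] {
  return hasExons(children)
    ? children
    : [...children, ...synthesizeExons(children)]
}
```

For a transcript with a five_prime_UTR at 100-149, CDS at 150-200, 300-400, and 500-550, and a three_prime_UTR at 551-600, this sketch yields exons at 100-200, 300-400, and 500-600. Sorting once and merging in a single pass keeps the cost small compared with the rest of the GFF3 load.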