Feature request: Search function to return all transcript mappings spanned by a query reference region #5
Comments
Hi John,
@mashok-acog Sorry for not replying to your other bug. It is unclear what you mean there, compared with this much clearer post, and I am also not currently full time on this project. If you need further help on this issue, please move back to the original bug, fill in the extra detail, and '@' me. This bug is a feature request and is not associated with your problem, so please do not reply in this thread.
You probably just need to install the vvta (and its own Seqrepo release) instead of the uta, and it should work fine. The "current_valid_mapped_transcript_spans_mv" view is one of the first views used when searching for relevant transcripts with an input chromosomal location, so this complaint is characteristic of a missing, outdated, or misconfigured database. As noted at the top of the readme, however, this project is mainly used by the VariantValidator pipeline and is not recommended for standalone use. In some respects it represents a snapshot of an older hgvs version, upgraded to work with the newer vvta database. This is required to work with the existing VariantValidator code base, which then tweaks the output to improve it. If you need an end-user recommended project you should normally use either VariantValidator or mainline hgvs; as such, the documentation has not been updated for this project as a standalone system. The actual documentation to install this project is here
Feature request description, and associated problem
vv_hgvs is currently the main interface for VVTA databases and is used by VariantValidator for this purpose. Users expect to be able to query VariantValidator for genomic variants and receive all affected transcripts in response. This is not currently the case: a consistent HGVS nomenclature for variants beyond the bounds of a transcript has yet to be decided on by the HVNC, so vv_hgvs lacks features for querying mapped transcripts in these cases. This has caused issues in VariantValidator such as https://github.com/openvar/variantValidator/issues/399. It would therefore be good to add a query that lets users detect such transcripts and helps fulfil these expectations.
Current proposed solution
vv_hgvs already has a number of related functions, so adding a similar one to handle this case should be reasonably straightforward. The underlying SQL should look something like
SELECT * FROM current_valid_mapped_transcript_spans_mv WHERE alt_ac = $target_acc AND start_i > $query_start AND end_i < $query_end
for total overlap, where the transcript span lies entirely inside the query region, or
SELECT * FROM current_valid_mapped_transcript_spans_mv WHERE alt_ac = $target_acc AND end_i > $query_start AND start_i < $query_end
for any overlap. Relevant tests will also need to be added.
Alternatives
It is possible that we could just expect users to query the VVTA directly, but this would complicate the usage of the VVTA by breaking through the expected layering.
Additional context
We need to decide, and specify, whether the spans are exclusive or inclusive, document which, and test for this as well.
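To make the boundary question concrete, here is a minimal sketch of the two proposed predicates run against an in-memory SQLite table standing in for the view. It is an illustration under assumptions, not the vv_hgvs implementation: the real view lives in the VVTA PostgreSQL database, the `tx_ac` column, the `find_spans` helper, and all accessions and coordinates below are hypothetical, and only `alt_ac`, `start_i`, and `end_i` come from the SQL above.

```python
import sqlite3

# Stand-in for the VVTA view. The real view is in PostgreSQL and its full
# column list is not given in this issue; tx_ac is an assumed column name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE current_valid_mapped_transcript_spans_mv "
    "(tx_ac TEXT, alt_ac TEXT, start_i INTEGER, end_i INTEGER)"
)
conn.executemany(
    "INSERT INTO current_valid_mapped_transcript_spans_mv VALUES (?, ?, ?, ?)",
    [
        ("TX_INSIDE", "CHR_A", 120, 180),     # strictly inside the query
        ("TX_EXACT", "CHR_A", 100, 200),      # same bounds as the query
        ("TX_PARTIAL", "CHR_A", 50, 110),     # overlaps the query start
        ("TX_ABUTTING", "CHR_A", 200, 260),   # starts where the query ends
        ("TX_OTHER_CHR", "CHR_B", 120, 180),  # different reference sequence
    ],
)

# Transcript span entirely inside the query region (strict comparisons,
# as written in the proposal).
CONTAINED_SQL = (
    "SELECT tx_ac FROM current_valid_mapped_transcript_spans_mv "
    "WHERE alt_ac = :alt_ac AND start_i > :query_start AND end_i < :query_end"
)

# Transcript span sharing any part of the query region.
OVERLAP_SQL = (
    "SELECT tx_ac FROM current_valid_mapped_transcript_spans_mv "
    "WHERE alt_ac = :alt_ac AND end_i > :query_start AND start_i < :query_end"
)


def find_spans(conn, sql, alt_ac, query_start, query_end):
    """Hypothetical helper: run one of the predicates and return sorted tx_ac values."""
    rows = conn.execute(
        sql,
        {"alt_ac": alt_ac, "query_start": query_start, "query_end": query_end},
    )
    return sorted(row[0] for row in rows)


contained = find_spans(conn, CONTAINED_SQL, "CHR_A", 100, 200)
overlapping = find_spans(conn, OVERLAP_SQL, "CHR_A", 100, 200)
print(contained)
print(overlapping)
```

With strict comparisons, the transcript whose bounds equal the query is not reported as contained, and the transcript that starts exactly where the query ends is not reported as overlapping. Whether either behaviour is wanted depends on the inclusive or exclusive decision raised above, so both cases are worth pinning down in tests.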