Custom database questions #45
Comments
I've also noticed you don't accept duplications in the recommended format? i.e.
Hi @mbhall88, sorry, I need to update the documentation. You are right in using
Yes, at the moment it only accepts a subset. The pipeline uses snpEff to annotate variants in new samples and only represents the variants in one way (e.g. c.643dupC instead of c.643dup). To simplify the variant lookup step, the create_db function tries to standardise all variants to the snpEff format using regex, but currently I've only added support for the variants that are
Thanks for raising the issue!
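To make the regex standardisation mentioned above concrete, here is a minimal sketch. It is not the actual create_db code: the function name add_dup_base, the pattern, and the cds argument (the gene's coding sequence as a plain string) are all assumptions for illustration, and it only handles single-position duplications written without the duplicated base.

```python
import re

# Hypothetical pattern: a single-position duplication with no base given,
# e.g. "c.643dup". Anything else is left untouched.
DUP_NO_BASE = re.compile(r"^c\.(\d+)dup$")


def add_dup_base(variant: str, cds: str) -> str:
    """Append the duplicated base so that c.643dup becomes c.643dupC.

    `cds` is assumed to be the coding sequence of the gene, where
    position 1 is the first base (1-based, as in HGVS c. coordinates).
    """
    match = DUP_NO_BASE.match(variant)
    if match is None:
        # Not a bare single-base duplication: return it unchanged.
        return variant
    pos = int(match.group(1))
    if not 1 <= pos <= len(cds):
        raise ValueError(f"position {pos} is outside the coding sequence")
    # Convert the 1-based HGVS position to a 0-based string index.
    return f"{variant}{cds[pos - 1]}"
```

With a toy coding sequence "ATGCCC", add_dup_base("c.3dup", "ATGCCC") gives "c.3dupG", while a variant that already names the base, such as "c.643dupC", passes through unchanged.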
Thanks for the clarification. Supporting all of HGVS would likely be difficult and would probably require developing a library. I just noticed https://github.com/biocommons/hgvs though! I haven't used it before, but it looks like it might make your life a little easier? Anyway, I got a custom db working and thought this issue might be helpful for some docs changes. Thanks for the quick response.
Oh, I hadn't seen that before. I'll check it out, thanks!
I'm having some issues trying to create a custom database.
My understanding from the documentation is that I clone this repo, replace or change the tbdb.csv file to have the mutations I want, and then run parse_db.py in the main directory?
It seems there is a file missing? And I can't find it documented anywhere.
I then instead tried running the following from the tbdb main directory
This completes successfully, but I have a further issue with its output.
As per the docs, the mutations must follow HGVS nomenclature. But it seems tb-profiler only accepts a subset of this nomenclature.
For example, I have the mutation c.196_198delinsTAG, which describes an MNP at position 196, TCG>TAG. Looking at the tbdb.conversion.log, this (incorrectly) gets converted as
Are you able to clarify (here and in the docs) what subset you support?
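As a way to check what a delins entry like the one above actually describes, here is a minimal sketch. It is not part of tb-profiler: the function name delins_changes and its pattern are assumptions for illustration, and the reference bases are passed in by hand (TCG, taken from the example above) rather than looked up from a genome.

```python
import re

# Hypothetical pattern for a coding-sequence delins over a range,
# e.g. "c.196_198delinsTAG".
DELINS = re.compile(r"^c\.(\d+)_(\d+)delins([ACGT]+)$")


def delins_changes(variant: str, ref: str) -> list[tuple[int, str, str]]:
    """List the (position, ref_base, alt_base) differences for an MNP delins.

    `ref` is the reference sequence covering the stated range. Only
    equal-length replacements (MNPs) are handled in this sketch.
    """
    match = DELINS.match(variant)
    if match is None:
        raise ValueError(f"not a ranged delins variant: {variant}")
    start, end, alt = int(match.group(1)), int(match.group(2)), match.group(3)
    if len(ref) != end - start + 1:
        raise ValueError("reference length does not match the stated range")
    if len(alt) != len(ref):
        raise ValueError("only equal-length (MNP) replacements are handled")
    # Keep only the positions where the base actually changes.
    return [
        (start + offset, r, a)
        for offset, (r, a) in enumerate(zip(ref, alt))
        if r != a
    ]
```

For the mutation in this issue, delins_changes("c.196_198delinsTAG", "TCG") returns [(197, "C", "A")], since only the middle base of the codon differs between TCG and TAG.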