Genotype_sv Aggregate model has less Output SVs than Input SVs #152

Open
ghost opened this issue Jul 23, 2024 · 1 comment

Comments

ghost commented Jul 23, 2024

Good afternoon,

I'm writing to report an issue I ran into during a test run of graphtyper's genotype_sv command.

I ran two SV callers, Manta and Smoove, on 50 samples and then merged their results with Jasmine_sv (which, like svimmer, retains the original caller's output information for each variant).

After this, I ended up with a VCF file containing approximately 130,000 structural variants.

I then ran the following graphtyper command:

graphtyper genotype_sv Homo_sapiens_assembly38_HLA2.fasta \
    /path/to/jasmine_merged.vcf \
    --output=/path/to/50_samples_test \
    --region_file=/path/to/file/containing/contigs_of_interest.txt \
    --sams=/path/to/reheadered_bams.txt \
    --verbose

I then concatenated the resulting 6,468 VCF files with bcftools concat to create a final merged VCF.
However, the output contained only 161,000 records, which is odd because most variants have multiple records (one per value of the SVMODEL INFO field). After filtering for records with the AGGREGATED model, I was left with only 66,000 structural variants.
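
A per-model breakdown of the merged output can be produced with a short shell sketch like the one below. It assumes an uncompressed, tab-delimited VCF whose INFO column carries an `SVMODEL=` key; the function name `count_by_svmodel` is made up for illustration and is not part of graphtyper or bcftools.

```shell
# Count VCF records per SVMODEL value.
# Records without an SVMODEL key are reported under NONE.
# Usage: count_by_svmodel merged.vcf
count_by_svmodel() {
  awk -F'\t' '
    /^#/ { next }                      # skip header lines
    {
      model = "NONE"
      n = split($8, fields, ";")       # column 8 is INFO
      for (i = 1; i <= n; i++)
        if (fields[i] ~ /^SVMODEL=/) model = substr(fields[i], 9)
      count[model]++
    }
    END { for (m in count) print m "\t" count[m] }
  ' "$1" | sort
}
```

Running the same function on the input VCF should report every record under NONE, which gives the input total to compare against the AGGREGATED count.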

My question is: does graphtyper filter or merge variants that it considers identical, even when they might not be? Why did the number of structural variants drop by almost 50%? Could it be that some variants in the original VCF overlapped and graphtyper simply removed them?
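
One way to probe the overlap hypothesis is to check whether the input VCF contains records that share the same position and type. The sketch below only detects exact coincidences of CHROM, POS and SVTYPE, not partial overlaps, and it assumes an uncompressed VCF with an `SVTYPE=` INFO key. Whether graphtyper collapses such records is not confirmed anywhere in this thread; the function name `duplicate_sites` is hypothetical.

```shell
# List (CHROM, POS, SVTYPE) keys that occur more than once in a VCF,
# followed by the number of records sharing that key.
# Usage: duplicate_sites jasmine_merged.vcf
duplicate_sites() {
  awk -F'\t' '
    /^#/ { next }                      # skip header lines
    {
      svtype = "NA"
      n = split($8, fields, ";")       # column 8 is INFO
      for (i = 1; i <= n; i++)
        if (fields[i] ~ /^SVTYPE=/) svtype = substr(fields[i], 8)
      seen[$1 "\t" $2 "\t" svtype]++
    }
    END { for (k in seen) if (seen[k] > 1) print k "\t" seen[k] }
  ' "$1" | sort
}
```

If this prints many lines for the input file, exact duplicates could explain part of the drop; if it prints few or none, the missing variants were more likely lost for another reason.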

@hannespetur
Member

Are there no log messages saying that some SVs were skipped?

If not, you could try adding the flag --force_no_filter_zero_qual and see if they appear then.

Best, Hannes
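
Both suggestions can be followed with a short shell sketch. It assumes the `--verbose` output of the original run was redirected to a file (the name `graphtyper.log` is hypothetical) and that any message about skipped SVs contains the word "skip"; the exact wording of graphtyper's log messages is not given in this thread. The helper name `skipped_messages` is also made up for illustration.

```shell
# Print log lines that mention skipping, with line numbers (case-insensitive).
# Prints nothing and still returns 0 when no such line exists.
# Usage: skipped_messages graphtyper.log
skipped_messages() {
  grep -in 'skip' "$1" || true
}

# If nothing turns up, rerun with the suggested flag appended, for example:
#   graphtyper genotype_sv Homo_sapiens_assembly38_HLA2.fasta \
#       /path/to/jasmine_merged.vcf \
#       --output=/path/to/50_samples_test \
#       --region_file=/path/to/file/containing/contigs_of_interest.txt \
#       --sams=/path/to/reheadered_bams.txt \
#       --verbose \
#       --force_no_filter_zero_qual
```

Comparing the per-model record counts before and after the rerun would show whether the zero-quality filter accounts for the missing variants.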
