ValueError: Invalid contig name 'chr14_GL000194v1_random' for reference 'GRCh38' #193

Open
leldershaw opened this issue May 14, 2021 · 2 comments

Comments

@leldershaw

Vaxrank fails when it tries to find the contig 'chr14_GL000194v1_random' in the reference GRCh38, despite this being a valid contig name.

Command run:
vaxrank
--vcf /home/ubuntu/data/Sample_07/Sample_07_tumor_v_Sample_07_normal.combine_variants.phased.annotated.vcf
--genome hg38
--download-reference-genome-data
--bam /home/ubuntu/data/Sample_07-T-RNA/07-T-RNA_S13_R1_001.fastq.gz.subread.sorted.BAM
--mhc-predictor netmhc
--mhc-alleles HLA-A*01:01,HLA-B*08:01,HLA-C*07:01
--mhc-epitope-lengths 9
--padding-around-mutation 5
--vaccine-peptide-length 17
--output-ascii-report Mel5-vaccine-peptides-report.txt

Error output:
2021-05-13 15:37:10,781 - isovar.allele_reads:199 - INFO - Gathering reads for Variant(contig='Y', start=56875044, ref='T', alt='A', reference_name='GRCh38')
2021-05-13 15:37:10,782 - isovar.allele_reads:203 - INFO - Gathering variant reads for variant Variant(contig='Y', start=56875044, ref='T', alt='A', reference_name='GRCh38') (chromosome = chrY, gene names = [])
2021-05-13 15:37:10,816 - isovar.locus_reads:312 - INFO - Found 0 reads overlapping locus chrY: 56875043-56875045
2021-05-13 15:37:10,820 - isovar.translation:466 - INFO - No supporting reads for variant Variant(contig='Y', start=56875044, ref='T', alt='A', reference_name='GRCh38')
2021-05-13 15:37:10,822 - vaxrank.core_logic:246 - INFO - No protein sequences for Variant(contig='Y', start=56875044, ref='T', alt='A', reference_name='GRCh38')
2021-05-13 15:37:10,822 - isovar.allele_reads:199 - INFO - Gathering reads for Variant(contig='chr14_GL000194v1_random', start=53456, ref='C', alt='A', reference_name='GRCh38')
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/bin/vaxrank", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/anaconda3/lib/python3.8/site-packages/vaxrank/cli.py", line 389, in main
    data = ranked_variant_list_with_metadata(args)
  File "/home/ubuntu/anaconda3/lib/python3.8/site-packages/vaxrank/cli.py", line 314, in ranked_variant_list_with_metadata
    variants_count_dict = core_logic.variant_counts()
  File "/home/ubuntu/anaconda3/lib/python3.8/site-packages/vaxrank/core_logic.py", line 331, in variant_counts
    if variant in self.isovar_protein_sequence_dict:
  File "/home/ubuntu/anaconda3/lib/python3.8/site-packages/vaxrank/core_logic.py", line 243, in isovar_protein_sequence_dict
    for variant, isovar_protein_sequences in protein_sequences_generator:
  File "/home/ubuntu/anaconda3/lib/python3.8/site-packages/isovar/protein_sequences.py", line 255, in reads_generator_to_protein_sequences_generator
    for (variant, overlapping_reads) in variant_and_overlapping_reads_generator:
  File "/home/ubuntu/anaconda3/lib/python3.8/site-packages/isovar/allele_reads.py", line 275, in reads_overlapping_variants
    allele_reads = reads_overlapping_variant(
  File "/home/ubuntu/anaconda3/lib/python3.8/site-packages/isovar/allele_reads.py", line 207, in reads_overlapping_variant
    variant.gene_names)
  File "/home/ubuntu/anaconda3/lib/python3.8/site-packages/varcode/variant.py", line 435, in gene_names
    self._check_that_genome_has_contig()
  File "/home/ubuntu/anaconda3/lib/python3.8/site-packages/varcode/variant.py", line 370, in _check_that_genome_has_contig
    raise ValueError("Invalid contig name '%s' for reference '%s'" % (
ValueError: Invalid contig name 'chr14_GL000194v1_random' for reference 'GRCh38'

@doctorchenzx

Hi, I ran into the same problem. Have you found a solution?

@iskandr
Contributor

iskandr commented May 24, 2023

This is an alt contig and probably the underlying annotation tools (PyEnsembl + Varcode) don't support it.

I think the only immediate solution is to filter out alt contig variants or to align against canonical chromosomes only. I'll try to think more about the best path forward, though.
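
For anyone needing the filtering workaround right away, here is a minimal sketch of dropping non-canonical contig records from a VCF before passing it to `--vcf`. It is not part of Vaxrank, Varcode, or PyEnsembl: the function names are made up for illustration, and the definition of "canonical" (chromosomes 1-22, X, Y, and the mitochondrion, with or without a `chr` prefix) is an assumption you may need to adjust. It uses only the standard library and expects an uncompressed, tab-delimited VCF.

```python
import re

# Assumption: "canonical" means 1-22, X, Y, M/MT, with an optional "chr" prefix.
# Names such as chr14_GL000194v1_random do not match and are dropped.
CANONICAL_CONTIG = re.compile(r"^(chr)?([1-9]|1[0-9]|2[0-2]|X|Y|M|MT)$")


def is_canonical_contig(name):
    """Return True if the contig name looks like a primary chromosome."""
    return CANONICAL_CONTIG.match(name) is not None


def filter_vcf_lines(lines):
    """Yield header lines unchanged and only records on canonical contigs."""
    for line in lines:
        if line.startswith("#"):
            # Header and column lines are passed through untouched.
            yield line
        else:
            # The first tab-delimited column of a VCF record is the contig.
            contig = line.split("\t", 1)[0]
            if is_canonical_contig(contig):
                yield line


def filter_vcf(in_path, out_path):
    """Write a copy of the plain-text VCF at in_path without alt-contig records."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_vcf_lines(src))
```

Calling `filter_vcf("input.vcf", "input.canonical.vcf")` (both file names are placeholders) and then pointing `--vcf` at the filtered file should avoid the contig check for those records, at the cost of ignoring any variants on unplaced or alternate scaffolds.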
