TCR matching error #3

Open
swluo1 opened this issue Feb 19, 2024 · 7 comments

Comments

@swluo1

swluo1 commented Feb 19, 2024

When I run 5p10XTCR with the example data TCR3.fastq.gz on macOS, I get this error:
"Traceback (most recent call last):
File "/Users/Home/nanoranger/pipeline.py", line 236, in <module>
utils.process_matching_5p10XTCR(sample,outdir)
File "/Users/Home/nanoranger/utils.py", line 733, in process_matching_5p10XTCR
scores=sort_cnt(all_AS[all_AS[:,1]==0][:,0])
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed"

@EdGreen21

I got the same error on Linux when I mis-specified the path to the input fastq.gz file. Are you sure the path is correct?

@mehdiborji
Owner

@swluo1 Thank you for trying out nanoranger!

Could you check whether the file {sample}_matching.sam exists and contains alignments? Alternatively, you can look at the output of the STAR aligner and see whether it reported any error.

I have observed this error when the {sample}_matching.sam file is empty (no alignments) and the array of alignment scores is, as a consequence, also empty.
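A minimal sketch of why an empty list of alignment scores leads to this IndexError, assuming the scores are collected as (score, flag) rows and converted to a NumPy array, as the indexing in the traceback suggests. The function name select_scores and its argument are hypothetical and are not taken from nanoranger's utils.py.

```python
import numpy as np

def select_scores(as_rows):
    """Return the first column of the rows whose second column equals 0.

    Hypothetical stand-in for the indexing shown in the traceback
    (all_AS[all_AS[:,1]==0][:,0]); not nanoranger's actual code.
    """
    all_AS = np.array(as_rows)
    if all_AS.ndim != 2:
        # np.array([]) is 1-dimensional with shape (0,), so all_AS[:, 1]
        # would raise "too many indices for array".
        raise ValueError(
            "no alignment records found; check the SAM file and the aligner log"
        )
    return all_AS[all_AS[:, 1] == 0][:, 0]
```

With a guard like this, an empty SAM file would surface as a readable message instead of an indexing error further down the pipeline.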

This can happen on macOS because the command for reading {sample}_BCUMI.fasta.gz is set to zcat (file scripts/barcode_align.sh, option --readFilesCommand zcat), which does not work on .gz files by default on macOS systems. You may change this to --readFilesCommand gunzip -c.
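As a rough sketch of the platform check described above: the helper below picks a value for STAR's --readFilesCommand depending on the operating system. The function name and its parameters are hypothetical; nanoranger sets this value directly in scripts/barcode_align.sh, so this only illustrates the decision, not how the pipeline is implemented.

```python
import platform
import shutil

def pick_read_files_command(system=None, which=shutil.which):
    """Suggest a --readFilesCommand value for gzip-compressed input.

    Hypothetical helper, not part of nanoranger. On macOS it always
    returns "gunzip -c"; elsewhere it returns "zcat" when that
    executable is on PATH and falls back to "gunzip -c" otherwise.
    """
    system = system or platform.system()
    if system != "Darwin" and which("zcat"):
        return "zcat"
    return "gunzip -c"
```

The which argument is injectable only so the choice can be checked without depending on what is installed on the machine.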

@swluo1
Author

swluo1 commented Feb 20, 2024

@mehdiborji Thanks. I tried replacing zcat with gunzip -c and with gzip -d, but I still get this error. The {sample}_matching.sam file is not empty; I have attached it here: TCR_matching.sam.zip.

Here is the log:

nanoranger packages loaded

alignment to transcriptome reference and defusing/deconcatenation

cores = 8
ref = /Users/Home/nanoranger/data/TR_V_human.fa
infile= /Users/Home/nanoranger/sample_fastq/TCR3.fastq.gz
outdir = TCR
sample = TCR
[M::mm_idx_gen::0.002*1.98] collected minimizers
[M::mm_idx_gen::0.005*5.07] sorted minimizers
[M::main::0.005*5.01] loaded/built the index for 106 target sequence(s)
[M::mm_mapopt_update::0.005*4.78] mid_occ = 10
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 106
[M::mm_idx_stat::0.005*4.59] distinct minimizers: 6920 (90.45% are singletons); average occurrences: 1.150; average spacing: 5.626; total length: 44774
[M::worker_pipeline::0.091*4.64] mapped 4000 sequences
[M::main] Version: 2.26-r1175
[M::main] CMD: minimap2 -aY --eqx -x map-ont -t 8 --secondary=no --sam-hit-only /Users/Home/nanoranger/data/TR_V_human.fa /Users/Home/nanoranger/sample_fastq/TCR3.fastq.gz
[M::main] Real time: 0.093 sec; CPU: 0.426 sec; Peak RSS: 0.012 GB
filename = TCR/TCR_deconcat.fastq.gz
save_prefix = TCR/TCR
species = hsa
nthreads = 8
Alignment: 78.2%
Alignment: 100% ETA: 00:00:00
============= Report ==============
Analysis time: 1.51s
Total sequencing reads: 2800
Successfully aligned reads: 2153 (76.89%)
Alignment failed, no hits (not TCR/IG?): 32 (1.14%)
Alignment failed because of absence of J hits: 609 (21.75%)
Alignment failed because of low total score: 6 (0.21%)
Overlapped: 0 (0%)
Overlapped and aligned: 0 (0%)
Alignment-aided overlaps: 0 (NaN%)
Overlapped and not aligned: 0 (0%)
TRA chains: 1051 (48.82%)
TRB chains: 1102 (51.18%)
Realigned with forced non-floating bound: 0 (0%)
Realigned with forced non-floating right bound in left read: 0 (0%)
Realigned with forced non-floating left bound in right read: 0 (0%)
Initialization: progress unknown
Writing clones: 0%
============= Report ==============
Analysis time: 855.00ms
Final clonotype count: 283
Average number of reads per clonotype: 3.17
Reads used in clonotypes, percent of total: 897 (32.04%)
Reads used in clonotypes before clustering, percent of total: 897 (32.04%)
Number of reads used as a core, percent of used: 856 (95.43%)
Mapped low quality reads, percent of used: 41 (4.57%)
Reads clustered in PCR error correction, percent of used: 37 (4.12%)
Reads pre-clustered due to the similar VJC-lists, percent of used: 0 (0%)
Reads dropped due to the lack of a clone sequence, percent of total: 54 (1.93%)
Reads dropped due to low quality, percent of total: 67 (2.39%)
Reads dropped due to failed mapping, percent of total: 1135 (40.54%)
Reads dropped with low quality clones, percent of total: 0 (0%)
Clonotypes eliminated by PCR error correction: 19
Clonotypes dropped as low quality: 0
Clonotypes pre-clustered due to the similar VJC-lists: 0
TRA chains: 156 (55.12%)
TRB chains: 127 (44.88%)
Exporting clones: 0%
Initialization: progress unknown
Preparing for sorting: progress unknown
============= Report ==============
Analysis time: 1.35s
Final clonotype count: 283
Average number of reads per clonotype: 3.17
Reads used in clonotypes, percent of total: 897 (32.04%)
Reads used in clonotypes before clustering, percent of total: 897 (32.04%)
Number of reads used as a core, percent of used: 856 (95.43%)
Mapped low quality reads, percent of used: 41 (4.57%)
Reads clustered in PCR error correction, percent of used: 37 (4.12%)
Reads pre-clustered due to the similar VJC-lists, percent of used: 0 (0%)
Reads dropped due to the lack of a clone sequence, percent of total: 54 (1.93%)
Reads dropped due to low quality, percent of total: 67 (2.39%)
Reads dropped due to failed mapping, percent of total: 1135 (40.54%)
Reads dropped with low quality clones, percent of total: 0 (0%)
Clonotypes eliminated by PCR error correction: 19
Clonotypes dropped as low quality: 0
Clonotypes pre-clustered due to the similar VJC-lists: 0
TRA chains: 156 (55.12%)
TRB chains: 127 (44.88%)
TCR/TCR_bcreads.fasta
TCR/TCR_ref/
Feb 20 17:07:15 ..... started STAR run
Feb 20 17:07:15 ... starting to generate Genome files
Feb 20 17:07:19 ... starting to sort Suffix Array. This may take a long time...
Feb 20 17:07:19 ... sorting Suffix Array chunks and saving them to disk...
Feb 20 17:07:24 ... loading chunks from disk, packing SA...
Feb 20 17:07:24 ... finished generating suffix array
Feb 20 17:07:24 ... generating Suffix Array index
Feb 20 17:07:24 ... completed Suffix Array index
Feb 20 17:07:24 ... writing Genome to disk ...
Feb 20 17:07:24 ... writing Suffix Array to disk ...
Feb 20 17:07:24 ... writing SAindex to disk
Feb 20 17:07:24 ..... finished successfully
TCR/TCR_BCUMI.fasta.gz
TCR/TCR_ref/
TCR/TCR_matching
Feb 20 17:07:24 ..... started STAR run
Feb 20 17:07:24 ..... loading genome
Feb 20 17:07:27 ..... started mapping
Feb 20 17:07:27 ..... finished successfully

generate clone-barcode-UMI table

clone filtering finished
number of short UMI reads = 0
Traceback (most recent call last):
File "/Users/Home/nanoranger/pipeline.py", line 236, in <module>
utils.process_matching_5p10XTCR(sample,outdir)
File "/Users/Home/nanoranger/utils.py", line 733, in process_matching_5p10XTCR
scores=sort_cnt(all_AS[all_AS[:,1]==0][:,0])
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

@mehdiborji
Owner

@swluo1 Thank you for sharing this information. The SAM file you shared is indeed empty in the sense that it has no alignments: samtools view -c TCR_matching.sam returns 0.

The file contains only the header lines for the reference sequences (here, the 737k 10x 5' barcodes).
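For anyone without samtools at hand, the same check can be sketched in a few lines of Python. This assumes a plain-text (uncompressed) SAM file in which every header line starts with "@"; the function name count_sam_records is hypothetical.

```python
def count_sam_records(path):
    """Count alignment records (non-header, non-blank lines) in a SAM file.

    Hypothetical helper that roughly mirrors what `samtools view -c`
    reports for an uncompressed SAM file.
    """
    n = 0
    with open(path) as fh:
        for line in fh:
            # Header lines start with "@"; everything else is a record.
            if line.strip() and not line.startswith("@"):
                n += 1
    return n
```

A result of 0 on a file that still has a non-zero size means the file holds only header lines, which is the situation described above.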

Can you also share the BCUMI.fasta.gz file you get from the previous step?

I still believe the STAR aligner is not performing the alignment for some reason. What version of STAR do you have installed?

@mehdiborji
Owner

@swluo1 I now realize that the version of STAR you are running is most likely very old and potentially problematic (judging by the SAM header, it seems to be 2.5.2b, which dates back to 2016). I highly recommend that you update to the latest version of STAR and try again.
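A small sketch of how the aligner version can be read from a SAM header, assuming the file has @PG header lines with tab-separated ID and VN tags as defined in the SAM specification. The function name sam_program_versions is hypothetical, and the exact version string that STAR writes is not assumed here.

```python
def sam_program_versions(path):
    """Return {program ID: version string} from the @PG lines of a SAM header.

    Hypothetical helper; reads only the header of an uncompressed SAM file.
    """
    versions = {}
    with open(path) as fh:
        for line in fh:
            if not line.startswith("@"):
                break  # header lines come first; stop at the first record
            if line.startswith("@PG"):
                tags = line.rstrip("\n").split("\t")[1:]
                fields = dict(t.split(":", 1) for t in tags if ":" in t)
                versions[fields.get("ID", "?")] = fields.get("VN")
    return versions
```

Calling sam_program_versions on the {sample}_matching.sam file should list whichever programs recorded themselves in the header, together with their version tags.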

@EdGreen21

The most recent version of STAR that I found to work with nanoranger is 2.7.9a.

@mehdiborji
Owner

@EdGreen21 That is indeed the version I initially used to develop this tool, so I cannot say for certain how well versions older than that will work. I recently updated my STAR to 2.7.11b, and there seems to be forward compatibility :)
