Names for forwards and reverse reads does not match. Cannot continue #42

Open
tseemann opened this issue Apr 18, 2019 · 10 comments

@tseemann

I am getting this error:

Names for forwards and reverse reads does not match. Cannot continue

My R1 file:

@NB551233:47:H5WGVAFXY:1:11101:24878:1062 1:N:0:CGAGGCTG+NCTCTAGG

My R2 file:

@NB551233:47:H5WGVAFXY:1:11101:24878:1062 2:N:0:CGAGGCTG+NCTCTAGG

CC: @aunderwo

@tseemann
Author

tseemann commented Apr 18, 2019

The code is

if (options.read1.rsplit('_',1)[0] != options.read2.rsplit('_',1)[0]):
    print('Names for forwards and reverse reads does not match. Cannot continue', file=sys.stderr)
    sys.exit(1)

Is this operating on the read FILENAMES or the READ IDs?

Mine are NNNN-NNNNN_S33_R1_001.fastq.gz and NNNN-NNNNN_S33_R2_001.fastq.gz

@antunderwood
Contributor

This operates on FILENAMES.

As written, the code assumes that read filenames have the form

_{R1,R2,1,2}.fastq.gz

NNNN-NNNNN_S33_R1_001.fastq.gz and NNNN-NNNNN_S33_R2_001.fastq.gz are split at the last underscore, giving NNNN-NNNNN_S33_R1 and NNNN-NNNNN_S33_R2, which don't match.
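To make the split concrete, here is a minimal sketch that applies the same `rsplit('_', 1)[0]` expression from the quoted check to plain strings. The helper name `prefix` is only for illustration and is not part of Seroba.

```python
def prefix(filename):
    # Same expression as the quoted check: split once at the LAST
    # underscore and keep everything before it.
    return filename.rsplit('_', 1)[0]

r1 = 'NNNN-NNNNN_S33_R1_001.fastq.gz'
r2 = 'NNNN-NNNNN_S33_R2_001.fastq.gz'

print(prefix(r1))                # NNNN-NNNNN_S33_R1
print(prefix(r2))                # NNNN-NNNNN_S33_R2
print(prefix(r1) == prefix(r2))  # False, so the check exits with the error

# Names that end in _1 / _2 directly before the extension do match.
print(prefix('sample_1.fastq.gz') == prefix('sample_2.fastq.gz'))  # True
```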

A fudge solution would be to make softlinks whose names drop the _001, until the sanger-pathogens team has the bandwidth to correct the code.
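A rough sketch of that softlink workaround, assuming the files end in `_001.fastq.gz` or `_001.fq.gz` and that the links go into a separate directory. The function name `link_without_001` and the directory layout are hypothetical, not anything Seroba provides.

```python
import os
import re

def link_without_001(src, dest_dir):
    """Create a symlink to src inside dest_dir, dropping a trailing _001.

    Hypothetical helper: 'NAME_R1_001.fastq.gz' -> 'NAME_R1.fastq.gz'.
    dest_dir should differ from the source directory, otherwise a file
    without the _001 suffix would collide with its own link name.
    """
    name = os.path.basename(src)
    # Remove _001 only when it sits directly before the FASTQ extension.
    new_name = re.sub(r'_001(?=\.f(?:ast)?q(?:\.gz)?$)', '', name)
    dest = os.path.join(dest_dir, new_name)
    os.symlink(os.path.abspath(src), dest)
    return dest
```

Calling it once for the R1 file and once for the R2 file gives two links whose names share the same prefix under the quoted check; the links, not the original files, are then passed to the tool.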

@tseemann
Author

Thanks once again @aunderwo

@tseemann
Author

Argh. Our other system is to use SAMPLE/{R1,R2}.fq.gz but that fails too.
Names for forwards and reverse reads does not match. Cannot continue :(
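For what it's worth, the quoted check explains this failure too. The sketch below assumes the option holds the path string exactly as typed on the command line, which is not confirmed in this thread.

```python
r1 = 'SAMPLE/R1.fq.gz'
r2 = 'SAMPLE/R2.fq.gz'

# There is no underscore anywhere, so rsplit returns the whole string
# unchanged and the two "prefixes" are just the two full paths.
print(r1.rsplit('_', 1))  # ['SAMPLE/R1.fq.gz']
same = r1.rsplit('_', 1)[0] == r2.rsplit('_', 1)[0]
print(same)               # False

# If the directory name happens to contain an underscore, the split lands
# there instead and the comparison passes by accident.
d1 = 'my_run/R1.fq.gz'
d2 = 'my_run/R2.fq.gz'
accidental = d1.rsplit('_', 1)[0] == d2.rsplit('_', 1)[0]
print(accidental)         # True
```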

@cimendes

I was implementing a seroba component on flowcraft and getting this error when some particular components came before seroba. Thank you @tseemann and @aunderwo for your discussion! Without it I could not have figured it out! And @tseemann, I'm renaming all the input files to ${sample_id}_{1,2}.fq.gz to make it work. :)

Thanks!!!

@tseemann
Author

I don't feel I should have to rename my files for a tool to work :(

@sreerampeela

I have the same issue. I am using the Docker image and renamed my files to read_1.fq.gz and read_2.fq.gz, as suggested by @cimendes, but the issue isn't resolved. Any suggestions on how to resolve this?

@arif-tanmoy

Renaming the files to $name_1.fastq.gz and $name_2.fastq.gz worked.

@fgonzalez3

> renaming the files to $name_1.fastq.gz and $name_2.fastq.gz worked.

I tried this, but still no luck. Has anybody come across another workaround?

@arif-tanmoy

> I tried this, but still no luck. Has anybody come across another workaround?

Hey @fgonzalez3 - I haven't used it in a while. I think I actually made changes in the Python code of Seroba to recognize the filenames we use. I will try to find it and share it here. Also, you can always try using other tools.
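For anyone attempting a similar local change, here is one possible shape for a more tolerant check. It is a sketch only: it is not the change mentioned above and not Seroba's code, and the names `pair_key` and `names_match` as well as the set of accepted filename patterns are assumptions.

```python
import os
import re

# Read-number token directly before the extension, optionally followed by
# _001: sample_1.fastq.gz, sample_R2_001.fastq.gz, SAMPLE/R1.fq.gz, ...
_READ_TOKEN = re.compile(
    r'(^|[_.])R?[12](?=(?:_001)?\.f(?:ast)?q(?:\.gz)?$)'
)

def pair_key(path):
    """Return (directory, filename with the read number masked), or None."""
    directory, name = os.path.split(path)
    masked, n = _READ_TOKEN.subn(r'\g<1>R#', name)
    return (directory, masked) if n == 1 else None

def names_match(read1, read2):
    """True if the two paths differ only in their read-number token."""
    k1, k2 = pair_key(read1), pair_key(read2)
    return read1 != read2 and k1 is not None and k1 == k2

print(names_match('NNNN-NNNNN_S33_R1_001.fastq.gz',
                  'NNNN-NNNNN_S33_R2_001.fastq.gz'))      # True
print(names_match('SAMPLE/R1.fq.gz', 'SAMPLE/R2.fq.gz'))  # True
print(names_match('a_1.fastq.gz', 'b_2.fastq.gz'))        # False
```

Masking the read token and comparing what remains avoids depending on where the last underscore falls, which is what trips up the naming schemes discussed in this issue.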
