no filenames list built #59

Open
Qvdauwer opened this issue Sep 28, 2022 · 5 comments

@Qvdauwer

Dear James,

I am trying to demultiplex multi-read fast5 files so that I can use the fast5 files per barcode. This tool looks perfect for this application, but unfortunately I am unable to get it to work.

I tried running the following command:

python /opt/SquiggleKit/fast5_fetcher_multi.py -v \
    -q /home/qvdauwer/PhD/WP1/circulomics/Guppy6/reads/Circulomics_barcode5.fastq.gz \
    -s /home/qvdauwer/PhD/WP1/circulomics/Guppy6/sequencing_summary.txt.gz \
    -m /home/qvdauwer/Fast5 \
    -o ./fast5_bc5_circulomics

Although it seems to pass the initial checks, something goes wrong when it tries to get the fast5 file names using seq_sum.

I added the verbose output:

Verbose mode active - dumping info to stderr
SquiggleKit fast5_fetcher: 1.3.0
args: Namespace(OSystem='Linux', f5_format='multi', fastq='/home/qvdauwer/PhD/WP1/circulomics/Guppy6/reads/Circulomics_barcode7.fastq', flat=None, index=None, multi_f5='/home/qvdauwer/Fast5', output='./fast5_bc7_circulomics', paf=None, pppp=False, prefix='trimmed', seq_sum='/home/qvdauwer/PhD/WP1/circulomics/Guppy6/sequencing_summary.txt', seq_sum_1D2=None, threshold=4000, trim=False, trim_list=None, verbose=True, version=False)
Multi-fast5 mode detected in mode: multi
Output folder './fast5_bc7_circulomics' created
Checks passed!
Starting things up!
Getting multi-fast5 info...
no filenames list built, check inputs
Extracting reads from multi-fast5 files...
No file paths built
done!

I looked at the other issues in this repository and my problem seems to be similar to the one in #55. So I guess there might be a naming problem somewhere, but I did not manage to find the exact issue.

Here are the first few lines of the sequencing summary file, if that helps:
sequencing_summary_first_lines.txt

Thank you in advance for your help,
Quentin

@Psy-Fer
Owner

Psy-Fer commented Sep 29, 2022

Hello Quentin,

I'll have a look into this.
Is the goal here to split your fast5 files into your barcode groups?

James

Psy-Fer self-assigned this Sep 29, 2022
@Psy-Fer
Owner

Psy-Fer commented Sep 29, 2022

Ahh, I think I found the problem.

So first, they changed the header in the sequencing summary file from `filename_fast5 read_id` to `filename read_id`.

So the column detection code doesn't work.

Second, I made an error in the matching code.

Let me fix both of these.
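
For readers hitting the same message, here is a minimal sketch of header-tolerant column detection. It is not the code from the actual fix: the function name `map_reads_to_fast5` and the `FILENAME_COLUMNS` tuple are hypothetical, and it assumes a tab-separated sequencing summary with a `read_id` column plus either a `filename_fast5` or a `filename` column.

```python
import csv
import io

# Hypothetical list of accepted header names for the fast5 filename column.
FILENAME_COLUMNS = ("filename_fast5", "filename")


def map_reads_to_fast5(handle, wanted_ids):
    """Return {fast5 filename: set of wanted read IDs} from a summary file handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    fields = reader.fieldnames or []
    # Pick whichever filename column this summary version provides.
    name_col = next((col for col in FILENAME_COLUMNS if col in fields), None)
    if name_col is None or "read_id" not in fields:
        raise ValueError(f"no usable filename/read_id columns in header: {fields}")
    mapping = {}
    for row in reader:
        if row["read_id"] in wanted_ids:
            mapping.setdefault(row[name_col], set()).add(row["read_id"])
    return mapping


if __name__ == "__main__":
    # Tiny made-up summary using the newer header style.
    demo = "filename\tread_id\nbatch_0.fast5\tread-a\nbatch_1.fast5\tread-b\n"
    print(map_reads_to_fast5(io.StringIO(demo), {"read-a"}))
```

Raising an error on an unrecognised header, rather than returning an empty mapping, makes a renamed column visible immediately instead of ending in "no filenames list built".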

Psy-Fer added the bug (Something isn't working) label Sep 29, 2022
@Psy-Fer
Owner

Psy-Fer commented Sep 29, 2022

Okay, can you do a git pull in the repo and try again?

Potential fix made in adbc52e

James

@Psy-Fer
Owner

Psy-Fer commented Sep 29, 2022

Also, I should probably mention that another way to do this is to convert the files to slow5 using slow5tools. It is then quite simple to extract the read IDs from the fastq file for each barcode and use them to extract the matching slow5 records into their own slow5 file. If you wish to stay in fast5, you can then convert back again.

slow5tools has much more mature checks and more up-to-date handling of fast5 files and nanopore data than squigglekit (and the new version of squigglekit I'm working on will primarily be built upon slow5). So it's just another option if this tool doesn't provide all the features you are looking for.
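
The read ID extraction step mentioned above could be sketched as follows. This is an illustration only: the helper names `fastq_read_ids` and `write_id_list` are hypothetical, and the code assumes standard four-line FASTQ records (no wrapped sequence lines), with the read ID being the first whitespace-delimited token after the `@` on each header line.

```python
import gzip


def fastq_read_ids(path):
    """Collect read IDs from a FASTQ file, plain or gzip-compressed."""
    opener = gzip.open if str(path).endswith(".gz") else open
    ids = []
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle):
            # Every fourth line, starting at the first, is a record header.
            if line_no % 4 == 0 and line.strip():
                ids.append(line[1:].split()[0])
    return ids


def write_id_list(ids, out_path):
    """Write one read ID per line, a common input format for read-fetching tools."""
    with open(out_path, "w") as out:
        out.writelines(f"{read_id}\n" for read_id in ids)
```

Running `write_id_list(fastq_read_ids(fastq_path), list_path)` once per barcode fastq gives one ID list per barcode, which can then be used to pull the matching records out of the converted files.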

James

@Qvdauwer
Author

Qvdauwer commented Oct 3, 2022

Hey James,

My goal is indeed to split my fast5 files by barcode. I tried the updated version and it seems to be working fine.

Also, I will take a look at the slow5tools you recommended as an alternate option.

Thank you for your help and the additional advice.

Quentin
