Option to write out the filtered features for troubleshooting #13

Open
mestaki opened this issue Feb 13, 2025 · 2 comments

Comments

mestaki commented Feb 13, 2025

Hey @wasade!
Not sure if you prefer these here or on the Q2 forum, happy to reproduce there if preferred.

I'm processing some V4 data from porcine fecal samples, with EMP primers on a 2x250 Illumina run. I decided to merge the reads with DADA2 rather than trim the forward reads to 90, 100, or 150 nt, so I opted for the non-v4-16s pipeline. I was surprised to see a loss of nearly 20% of total reads: I went from 3,842 unique features across 5,165,563 reads to 1,148 unique features across 4,218,390 reads. In my experience, losing that many features is typically OK as long as the total read loss is only a couple of percent. In this case I wanted to dig deeper to see what was being tossed out, but I'm having trouble identifying the filtered reads. With the regular v4 pipeline this is easier since I can map the ASVs directly, but I'm not sure how to do it with the clustering approach. For example, how do I map my ASVs to the new feature ID RS-GCF-014287855.1-NZ-JACOOV010000039.1?
This got me thinking that an optional parameter to save the filtered reads during non-v4-16s would be super useful in these troubleshooting scenarios.

Thanks!
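
For reference, the loss figures quoted above can be checked with a few lines of Python. This is only a quick arithmetic sketch using the counts from this post; the variable names are made up for illustration.

```python
# Counts quoted in the post above (before and after the non-v4-16s pipeline).
reads_before, reads_after = 5_165_563, 4_218_390
features_before, features_after = 3_842, 1_148

# Fraction lost, expressed as a percentage.
read_loss_pct = 100 * (reads_before - reads_after) / reads_before
feature_loss_pct = 100 * (features_before - features_after) / features_before

print(f"reads lost: {reads_before - reads_after:,} ({read_loss_pct:.1f}%)")
print(f"features lost: {features_before - features_after:,} ({feature_loss_pct:.1f}%)")
```

So roughly 18% of reads and 70% of features are dropped, which matches the "nearly 20%" above.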

Member

wasade commented Feb 13, 2025

Hi @BoD,

We're blocked by upstream q2-vsearch on this; see qiime2/q2-vsearch#93. Alternatively, the vsearch command that is used could be pulled from the code and run directly to obtain the .uc data.

Best,
Daniel
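
If the .uc data is obtained by running vsearch directly, as suggested above, mapping ASVs to reference IDs could look something like the sketch below. It assumes the usual 10-column, tab-separated uclust-style layout (record type in column 1, percent identity in column 4, query label in column 9, target label in column 10), with `H` records for hits and `N` records for queries with no hit. The `parse_uc` helper, the `asv_0001`/`asv_0002` labels, and the sample rows are hypothetical and only there to make the example runnable; the exact labels written by the pipeline may differ.

```python
import io


def parse_uc(handle):
    """Split a uclust-style .uc file into hits and unmatched queries.

    Assumes 10 tab-separated columns: record type (0), percent
    identity (3), query label (8), target label (9).
    """
    hits = {}       # query label -> (target label, percent identity)
    unmatched = []  # query labels with no reference hit
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # skip malformed rows rather than guess
        record_type = fields[0]
        # Assumption: labels may carry an abundance annotation such as
        # ";size=12"; keep only the part before the first ";".
        query = fields[8].split(";")[0]
        if record_type == "H":
            hits[query] = (fields[9], float(fields[3]))
        elif record_type == "N":
            unmatched.append(query)
    return hits, unmatched


# Made-up sample rows, not real output from this dataset.
SAMPLE_UC = (
    "H\t0\t253\t99.6\t+\t0\t0\t253M\tasv_0001;size=120\t"
    "RS-GCF-014287855.1-NZ-JACOOV010000039.1\n"
    "N\t*\t*\t*\t.\t*\t*\t*\tasv_0002;size=7\t*\n"
)

hits, unmatched = parse_uc(io.StringIO(SAMPLE_UC))
for query, (target, identity) in hits.items():
    print(f"{query} -> {target} ({identity}% identity)")
print("no reference hit:", unmatched)
```

The `unmatched` list is the part relevant to troubleshooting here: those are the queries that would be filtered out, and their read counts could then be looked up in the original feature table.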

Author

mestaki commented Feb 15, 2025

Hi @wasade,

Got it! Thanks for the quick reply. Looks like --uc would be the target call to change to enable writing this out. I'll see if I can poke around and make something work. It would be my first go at playing around with q2 plugins, so expectations should be low lol.

By the way, I wish I had claimed the BoD handle, but a Benoit Lubek beat me to the punch there :P
