Saw you started some prelim ONT methods #98

Open
MycoMap opened this issue Jun 16, 2022 · 2 comments
Comments

@MycoMap

MycoMap commented Jun 16, 2022

I have two sets of dual-indexed fungal ITS amplicon pools from specimens, one of 288 specimens and the other of 480, if they would be helpful at all in developing methods.

Could also discuss my current workflow that seems to work reasonably well.
[email protected]

@nextgenusfs
Owner

Yeah, there is some basic/rudimentary processing in the codebase already, so demultiplexing the reads is easy and already done. The most difficult part will be dealing with the error rates and de novo clustering: with ONT simplex reads at, say, 96% accuracy, you have a hard time splitting the raw data into appropriate bins to create OTUs. I tried several things a while ago and wasn't too happy with the results, although I had old ONT data, so that is also part of the issue. Newer data, i.e. from the LSK112/R10.4 setup, has much better single-read accuracy. I also tried some clustering with isONclust, which works sort of okay (it's built for clustering transcript data), but it's still quite difficult to sort through the noisy reads and get reliable clusters/OTUs. Perhaps generating duplex reads would give high enough accuracy that you could just cluster with something like uclust/vsearch.

What sequencing kits did you run these with? If you know what should be in these samples (i.e. high-quality Sanger data for every specimen), then yes, that would help immensely in trying to figure out a de novo approach.
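To put a rough number on the clustering problem described above, here is a back-of-the-envelope sketch. It assumes read errors are independent and almost never coincide, so two reads of the same template agree at a position only when both are correct. The 96% figure is the one mentioned above; the 97% OTU threshold and the other accuracy values are illustrative assumptions, not measurements from any kit.

```python
import math


def expected_pairwise_identity(read_accuracy: float) -> float:
    """Approximate identity between two independent reads of one template.

    Assumption: errors are independent and rarely land on the same base
    with the same wrong call, so a position matches only when both reads
    are correct there.
    """
    return read_accuracy ** 2


def min_accuracy_for_threshold(identity_threshold: float) -> float:
    """Per-read accuracy at which two reads of one template are expected
    to just reach the given clustering identity threshold."""
    return math.sqrt(identity_threshold)


if __name__ == "__main__":
    threshold = 0.97  # assumed OTU clustering identity, for illustration
    # Only the 0.96 value comes from the discussion; the rest are hypothetical.
    for label, acc in [
        ("simplex, ~96%", 0.96),
        ("hypothetical 99%", 0.99),
        ("hypothetical 99.9%", 0.999),
    ]:
        identity = expected_pairwise_identity(acc)
        verdict = "clusters" if identity >= threshold else "splits"
        print(f"{label}: expected read-to-read identity {identity:.4f} -> {verdict}")
    print(f"accuracy needed for {threshold:.0%} clustering: "
          f"{min_accuracy_for_threshold(threshold):.4f}")
```

Under these assumptions, two 96%-accurate reads of the same amplicon are expected to share only about 92% identity, so a 97% threshold would tend to split reads from a single specimen into separate bins. This ignores homopolymer-biased and other systematic errors, which real ONT data has.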

@hyphaltip
Contributor

PeterKennedy showed those early ONT results that you had worked on, @nextgenusfs, at MSA22. They were definitely of limited value for the older amplicons, but we also talked with others doing PacBio on amplicons who are getting really good results. I think newer data is definitely worth a look.

Also, Ryan Wick's Twitter post on doing short-read assembly with ONT demonstrates how the accuracy is improved in R10.4: https://twitter.com/rrwick/status/1548926644085108738

It might be worth looking at a different clustering approach, since the error model in usearch/vsearch might not be able to model ONT errors well.
