-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for generalized UniFrac (again) #528
base: master
Are you sure you want to change the base?
Conversation
@fangly I haven't seen a lot of users of G-UniFrac, which might explain why the package is orphaned. That is not to say that distance itself isn't useful, and so I'm not opposed to supporting it in phyloseq, but there is a bandwidth issue. Ideally we would add the necessary abstraction/tweak to the planned fast-unifrac update. However, I don't want to make this an additional constraint, because the deadline for package updates for the next BioC release is in two weeks or so. Finally, the arbitrary format changes complicate versioning and my comprehension of what is being changed in the PR, so I can't simply merge the PR at the moment. Let's circle back after the fast-unifrac update, and see how we can work this in. Thanks for the contribution and enthusiasm for phyloseq. Hopefully we can get this supported soon. joey |
I sometimes use GUnifrac. True the package is orphaned, but only sort of, Jun Chen moved to Minnesota where he and a student wrote a new package (and paper about) including a faster version of GUnifrac. MiSPU is available on CRAN and GitHub (https://github.com/ChongWu-Biostat). Paper at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4872356/ |
Hi Joey,
Thanks for the constructing feedback. Here is my response:
1/ Many files were affected because I ran formatR on all of them to clean-up the uneven mixture of tabs and spaces in code indentations. I have now reverted these changes, except for the distance calculation file. It may be a good idea to run formatR yourself, though...
2/ I made GUniFrac a suggested dependency, as you suggested.
3/ You rightly point out that it would be better to implement the generalized UniFrac calculation in phyloseq instead of calling an external package. I do agree. However, I think that my commit adds functionality to phyloseq as-is. However, it seems that GUniFrac is an orphaned package, and I would suggest that you take its distance calculation code almost as-is (using appropriate licensing and acknowledgements, of course), inform the authors (out of courtesy), and ultimately try to integrate it nicely into phyloseq. As you see, this is both a political decision and technical. You're obviously in charge of the politics in this project. On the technical side, I do not have a good-enough understanding of the algorithm to do it myself.
Cheers,
Florent