wrappers for hap.py and dipcall #294
Comments
Hi @nate-d-olson,
It looks like version 0.3.12 of hap.py fixed the regex issue, but we only got that version into Bioconda a couple of months after its release and after the Snakemake wrapper was created. Cheers,
Another fix is needed upstream; see Illumina/hap.py#113.
OK, the first PR is merged, so hap.py should now work fine for you.
@mbargull is taking a look at dipcall now.
Thanks for your responsiveness and for working to update the hap.py bioconda package and hap.py/pre.py wrapper. We use the main hap.py function, which does the comparison and calculates performance metrics, in our pipeline. I tried to run the main hap.py command,
They should really have a test framework. Yes, the main hap.py command has to be adapted as well.
Here is my first attempt at making a wrapper for the main hap.py command: nate-d-olson@34e7de2. I had a few questions regarding recommendations and best practices for creating wrappers.
Thanks for asking, I've added comments on your commit.
@jafors and @johanneskoester, thank you for your help with the hap.py Bioconda package and snakemake-wrapper. Now that we have a dipcall Bioconda package, I wanted to start working on the dipcall snakemake-wrapper. Running dipcall is a two-step process; first
So I'd say: if the makefile generation is super quick, just put both steps in one wrapper. If it takes time and is less parallel than the second step, use two wrappers and, ideally, add a meta-wrapper that combines them.
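For illustration, here is a minimal Python sketch of the single-wrapper option. It assumes the two steps are the ones documented in the dipcall README (`run-dipcall` writes a makefile to stdout, then `make` executes it) and that `run-dipcall` is on the `PATH`, as it presumably would be with the Bioconda package. The function names and file names are hypothetical, and this is not the code of any existing wrapper.

```python
import shlex


def dipcall_commands(prefix, ref, pat, mat, jobs=2):
    """Build the two dipcall steps as argument lists (nothing is executed).

    Assumption: `run-dipcall <prefix> <ref> <pat> <mat>` prints a makefile
    to stdout, as described in the dipcall README.
    """
    makefile = f"{prefix}.mak"
    # Step 1: generate the makefile (stdout must be redirected to `makefile`).
    generate = ["run-dipcall", prefix, ref, pat, mat]
    # Step 2: execute the makefile; this is the step that can run in parallel.
    execute = ["make", f"-j{jobs}", "-f", makefile]
    return makefile, generate, execute


def as_shell(makefile, generate, execute):
    """Join both steps into one shell line, as a single wrapper might run it."""
    return (
        f"{shlex.join(generate)} > {shlex.quote(makefile)}"
        f" && {shlex.join(execute)}"
    )
```

In a single wrapper, the string from `as_shell` could be handed to the shell, with `jobs` taken from the rule's thread count. With two wrappers, `generate` and `execute` would each become their own rule, linked through the `.mak` file.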
This issue was marked as stale because it has been open for 6 months with no activity.
Is your feature request related to a problem? Please describe.
We are having difficulty developing an assembly benchmarking pipeline because of the dependencies of two key bioinformatics tools used in the pipeline, hap.py and dipcall. I know there is a Snakemake wrapper for the hap.py pre.py command, but we are running into a regex issue when running the hap.py Bioconda package in the snakemake/snakemake Docker container.
Describe the solution you'd like
We would like to have Snakemake wrappers for hap.py (https://github.com/Illumina/hap.py) and dipcall (https://github.com/lh3/dipcall).
Describe alternatives you've considered
We tried using Docker containers to handle the dependencies, but we were unable to get the pipeline to work on both macOS and Linux. We also tried the hap.py Bioconda package, but ran into a regex error when running it in the snakemake/snakemake container. There is an issue on the hap.py GitHub repo related to the error, Illumina/hap.py#66, which seems to be tied to specific gcc versions.
Additional context
As part of the Genome in a Bottle project (https://www.nist.gov/programs-projects/genome-bottle), we are working to improve the usability and interpretability of our methods for benchmarking variant calls. hap.py is used for small variant benchmarking and is the reference implementation of the GA4GH best practices for small variant benchmarking. dipcall is used to generate variant calls from a diploid assembly. We developed a proof-of-concept pipeline using Snakemake, https://github.com/usnistgov/giab-asm-benchmarking, but the pipeline only works on macOS because of a hack we used to handle the hap.py and dipcall dependencies.