Installation
Prerequisites include:
- python
- docker
- python packages
  - wheel
  - setuptools
  - PyYAML
  - cwlref-runner
  - cwltool
Install any prerequisites that are not already present on your system.
The instructions that follow use pip and virtualenv, which are often already present alongside Python, so first check for them:
$ pip --version
$ virtualenv --version
If pip is not installed, see https://pip.pypa.io/en/stable/installing/ for installation instructions.
If virtualenv is missing, install it with pip:
$ pip install virtualenv
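The checks above can be wrapped in a small shell function. This is a minimal sketch and not part of PGAP: the name `check_tool` is hypothetical, and it only tests whether a command is on the PATH, not whether its version is recent enough.

```shell
# Hypothetical helper (not part of PGAP): report whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Example use: check each prerequisite tool in turn.
# for tool in python3 pip virtualenv docker; do check_tool "$tool"; done
```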
To create a virtualenv for your installation of CWL and PGAP:
$ virtualenv --python=python3 cwl
$ source cwl/bin/activate
(cwl) $ pip install -U wheel setuptools
(cwl) $ pip install -U cwltool[deps] PyYAML cwlref-runner
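Before running the `pip install` commands, it can help to confirm that the virtualenv is really active, so the packages do not land in the system Python. The following is a minimal sketch that relies on virtualenv's `activate` script setting the `VIRTUAL_ENV` environment variable; the function name `in_virtualenv` is hypothetical.

```shell
# Hypothetical helper: succeed only when a virtualenv is active.
# virtualenv's activate script exports VIRTUAL_ENV with the environment's path.
in_virtualenv() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active virtualenv: $VIRTUAL_ENV"
    else
        echo "no virtualenv is active; run: source cwl/bin/activate" >&2
        return 1
    fi
}

# Example use: only upgrade packages inside the environment.
# in_virtualenv && pip install -U wheel setuptools
```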
Detailed installation instructions are available on the Docker website (Docker Install). Install the latest version of Docker, which is usually newer than the one packaged with your distribution. Installation requires root access, and the user who will run the software must be in the docker group.
The required Docker images download automatically the first time the pipeline runs. They are then cached, so subsequent runs execute much faster.
Verify that Docker is running and that your user belongs to the group with Docker permissions by running:
(cwl) $ docker run hello-world
You should see a message that starts with:
Hello from Docker!
This message shows that your installation appears to be working correctly.
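The Docker check can also be scripted, which is convenient in setup scripts. This is a rough sketch: the function name `docker_hello_ok` is hypothetical, and it only looks for the greeting line quoted above in whatever text it receives on standard input.

```shell
# Hypothetical helper: read command output on stdin and succeed if it
# contains the greeting line printed by the hello-world image.
docker_hello_ok() {
    grep -q '^Hello from Docker!'
}

# Example use (requires a running Docker daemon and docker group membership):
# docker run hello-world | docker_hello_ok && echo "Docker is working"
```

If the check fails, a common cause is that the current user is not in the docker group; `id -nG` lists the groups of the current user.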
The CWL software is available from GitHub at https://github.com/ncbi/pgap. Download the source code package for the latest release, which is located at https://github.com/ncbi/pgap/releases, and extract the code.
(cwl) $ wget -qO- https://github.com/ncbi/pgap/archive/2018-09-18.build3190.tar.gz | tar xvz
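To fetch a different release, only the tag in the URL changes. The sketch below assumes every release follows the same archive URL pattern as the command above; the function name `pgap_release_url` is hypothetical and the tag must be copied from the releases page.

```shell
# Hypothetical helper: build the source-archive URL for a given release tag,
# following the pattern of the download command shown above.
pgap_release_url() {
    echo "https://github.com/ncbi/pgap/archive/$1.tar.gz"
}

# Example use: download and extract a specific release.
# wget -qO- "$(pgap_release_url 2018-09-18.build3190)" | tar xvz
```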
The supplemental data is stored on S3. It is versioned and must match the CWL and Docker versions. The CWL source tree includes a script that downloads the matching version and extracts it to the input subdirectory. Run it from the top level of the extracted source tree:
(cwl) $ ./scripts/fetch_supplemental_data.sh
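After the download finishes, a quick sanity check is to confirm that the input subdirectory exists and is not empty. This is a minimal sketch; the function name `has_supplemental_data` is hypothetical, and it does not verify that the data version matches the CWL version.

```shell
# Hypothetical helper: succeed if the given directory (default: input)
# exists and contains at least one entry.
has_supplemental_data() {
    dir=${1:-input}
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example use: fetch the data only when it is not already present.
# has_supplemental_data input || ./scripts/fetch_supplemental_data.sh
```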
The input.yaml file provides most of the required input parameters for the data in the input subdirectory. The other parameters are specific to the genome being annotated, and must be provided by the user. An example MG37 genome is provided with the CWL source. To execute the example:
(cwl) $ cat input.yaml MG37/input.yaml > mg37_input.yaml
(cwl) $ ./wf_pgap_simple.cwl mg37_input.yaml
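The same two steps apply to any genome: concatenate the shared input.yaml with a genome-specific YAML file, then run the workflow on the result. The sketch below wraps the concatenation with a check that both files exist. The function name `combine_inputs` is hypothetical, and plain concatenation assumes the two files define different top-level keys.

```shell
# Hypothetical helper: concatenate the shared input.yaml with a
# genome-specific YAML file, refusing to continue if either is missing.
combine_inputs() {
    common=$1 genome=$2 out=$3
    for f in "$common" "$genome"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
    done
    cat "$common" "$genome" > "$out"
}

# Example use, mirroring the MG37 commands:
# combine_inputs input.yaml MG37/input.yaml mg37_input.yaml &&
#     ./wf_pgap_simple.cwl mg37_input.yaml
```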