If you like this project and want to make it better, help out. You could report a bug, or pitch in with some development work.
Please create issue descriptions on GitHub. Be as specific as possible. Which version are you using? What did you do? What did you expect to happen? Are you planning to submit your own fix in a pull request?
This page documents the installation steps to get MiCall running locally on your workstation. The steps are for Eclipse with PyDev on Ubuntu; adapt them as needed for your preferred IDE or operating system.
If you want to see what's currently being worked on, check out the waffle board.
-
Check that you are running a 64-bit operating system, or bowtie2 won't work. Check About this Computer under the gear menu.
-
If you want to edit Python code using PyDev and Eclipse, you will need to install Java. Check the version of Java you have installed:
java -version
-
If the java version is lower than 1.7, then install JDK7:
sudo apt-get install openjdk-7-source
-
Check that you are now using the new version. If not, configure it.
java -version
sudo update-alternatives --config java
java -version
-
Check the version of Python you have installed:
python --version
-
If the Python version is lower than 2.7, then install it:
sudo apt-get install python2.7
-
Install pip, the Python package manager, and some packages for Python:
sudo apt-get install python-pip
sudo pip install testfixtures
sudo pip install requests
sudo pip install python-Levenshtein
-
Install Eclipse, although you might prefer a more recent version from the Eclipse web site:
sudo apt-get install eclipse
-
Launch Eclipse. From the Help menu, choose either Eclipse Marketplace... or Install New Software....
-
In the marketplace, just type PyDev and search. In the install wizard, use the PyDev update site.
-
After installing PyDev, open Window: Preferences. Navigate down to PyDev: Interpreters: Python Interpreter.
-
Click the Quick Auto-Config button. Click OK.
-
From the File menu, choose Import.... Navigate down to Git: Projects from Git.
-
Choose Clone URI, and paste this URI: https://github.com/cfe-lab/MiCall.git
-
Take all the branches, and select master as your initial branch.
-
Select import existing projects, and finish the import.
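Before working through the steps above, it can help to check the workstation in one pass. The following is a minimal sketch, not part of MiCall: `is_64bit_machine` and `missing_tools` are hypothetical helpers, and the list of 64-bit machine names is an assumption that is not exhaustive.

```python
import platform
import shutil


def is_64bit_machine(machine=None):
    """Return True if the machine name looks like a 64-bit architecture.

    The set of names below is an assumption, not an exhaustive list.
    """
    machine = (machine or platform.machine()).lower()
    return machine in ("x86_64", "amd64", "aarch64", "arm64")


def missing_tools(names):
    """Return the tool names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]
```

For example, `missing_tools(["java", "python", "pip"])` returns whichever of those commands still need to be installed, and `is_64bit_machine()` reports on the current machine.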
Check that Python is already installed.
python --version
We have tested with Python 3.4.
On Windows, you can install Anaconda Python.
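If you prefer to check the version from inside Python, a small sketch like this may help. It assumes that the tested version, 3.4, can be treated as a minimum; the text above only says that 3.4 was tested, so newer versions are not guaranteed to work. `python_is_supported` is a hypothetical helper.

```python
import sys

# Assumption: the tested version is treated as a floor, not an exact match.
MINIMUM_VERSION = (3, 4)


def python_is_supported(version_info=None, minimum=MINIMUM_VERSION):
    """Compare the major and minor version numbers against the minimum."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= tuple(minimum)
```

Calling `python_is_supported()` with no arguments checks the interpreter that is running the script.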
-
Download the latest version of bowtie2's binaries for Linux.
-
Right click and choose Extract Here. Change the folder owner to root, move it to /opt, and add it to the path.
chmod g-w -R bowtie2-2.2.1
sudo chown root:root -R bowtie2-2.2.1
sudo mv bowtie2-2.2.1 /opt
# Do this for the default version of bowtie2
cd /opt/bowtie2-2.2.1
for f in bowtie2* ; do sudo ln -s /opt/bowtie2-2.2.1/$f /usr/local/bin/$f; done
# Do this for other versions that have to be called explicitly
cd /opt/bowtie2-2.2.8
for f in bowtie2* ; do sudo ln -s /opt/bowtie2-2.2.8/$f /usr/local/bin/$f-2.2.8; done
# Now try a smoke test.
cd ~
bowtie2 --version
-
HyPhy is not needed by the main pipeline, only some of the helper utilities, so you can probably skip it. Before you can build HyPhy, you will need these libraries:
sudo apt-get install build-essential python-dev libcurl4-openssl-dev libcrypto++-dev libssl-dev
On CentOS 6, the newest versions of HyPhy fail to compile. The newest version that works is v2.2.5. To compile this version, assuming you're using the Software Collections python27 package, you only need to add two packages, which you can do using yum:
sudo yum install libcurl-devel openssl-devel
-
There is a newer package for HyPhy in the hyphy-python project. Consider testing that before the next installation, but so far we've just downloaded the latest source (or v2.2.5 on CentOS 6).
-
Download the latest source for HyPhy. Right click the zip file and choose Expand Here. Then run the setup script:
cd ~/Downloads/hyphy-master/src/lib
sudo python setup.py install
You can test it out if you like.
cd Examples/Python
python BasicHyPhy.py # Just check that there are no obvious errors.
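The bowtie2 linking loops earlier in this section can be previewed before you run anything with sudo. This is a minimal sketch under stated assumptions: `plan_symlinks` is a hypothetical helper that only builds the list of link targets and link names, and it creates nothing on disk.

```python
import posixpath


def plan_symlinks(install_dir, file_names, bin_dir="/usr/local/bin", suffix=""):
    """Return (target, link) pairs for each file name.

    An empty suffix matches the default version; a suffix like "-2.2.8"
    matches a version that has to be called explicitly.
    """
    return [(posixpath.join(install_dir, name),
             posixpath.join(bin_dir, name + suffix))
            for name in file_names]
```

For example, `plan_symlinks("/opt/bowtie2-2.2.8", ["bowtie2"], suffix="-2.2.8")` pairs `/opt/bowtie2-2.2.8/bowtie2` with `/usr/local/bin/bowtie2-2.2.8`, so you can confirm that the suffix matches the installed version.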
To support more than one version of cutadapt installed in parallel, install it in a Python virtual environment, then put a symbolic link to it on the path.
sudo virtualenv /usr/local/share/vcutadapt-1.11
sudo /usr/local/share/vcutadapt-1.11/bin/pip install cutadapt==1.11
sudo ln -s /usr/local/share/vcutadapt-1.11/bin/cutadapt /usr/local/bin/cutadapt-1.11
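To repeat the three commands above for another version, you could generate them from the version number. This is only a sketch: `cutadapt_install_commands` is a hypothetical helper that follows the folder naming used above and returns the commands without running them.

```python
def cutadapt_install_commands(version,
                              share_dir="/usr/local/share",
                              bin_dir="/usr/local/bin"):
    """Build the virtualenv, pip, and symlink commands for one version."""
    env = "{}/vcutadapt-{}".format(share_dir, version)
    return [
        ["sudo", "virtualenv", env],
        ["sudo", env + "/bin/pip", "install", "cutadapt=={}".format(version)],
        ["sudo", "ln", "-s", env + "/bin/cutadapt",
         "{}/cutadapt-{}".format(bin_dir, version)],
    ]
```

Each inner list can be passed to `subprocess.check_call` once you have reviewed it.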
MiCall uses an implementation of a modified Gotoh algorithm for pairwise sequence alignment.
This is written in the C++ source file gotoh.cpp, so you will need the
Python 3 development tools. To compile this into a shared library
that can be accessed from Python, go to micall/alignment and enter the following:
sudo python setup.py install
This assumes that you have superuser permissions on your system. We have tested this installation on OS-X and Ubuntu. If you're installing on Windows, you will need to install Visual C++ for Python.
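After the install finishes, you can confirm that Python can find the compiled module without calling any of its functions. This sketch assumes the module is importable under the name `gotoh`; that name is an assumption based on the package name, so adjust it if your build differs. `is_installed` is a hypothetical helper.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None
```

For example, `is_installed("gotoh")` should return True once the shared library has been built and installed.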
To install matplotlib on Ubuntu, use pip:
sudo pip install matplotlib
Set up the native apps virtual machine, and configure a shared folder called MiCall that points to the source code. Make sure you have a developer account on illumina.com.
Use the docker_build.py script to build a Docker image and push it to
BaseSpace. If you add -t vX.Y, it will add a tag to the Docker image. If you
add -a <agent id>, it will launch the spacedock tool to process samples as a
local agent. You can also set the BASESPACE_AGENT_ID environment variable so
you don't have to supply it every time. You can get the agent id from the Form
Builder page on BaseSpace.
sudo /media/sf_MiCall/docker_build.py -a abcde12345
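The way a command-line option can fall back to an environment variable is sketched below. This is not the code from docker_build.py: the long option names `--tag` and `--agent_id` and the `parse_args` helper are assumptions for illustration, and only `-t`, `-a`, and `BASESPACE_AGENT_ID` come from the description above.

```python
import argparse
import os


def parse_args(argv=None, environ=None):
    """Parse the options, using BASESPACE_AGENT_ID as the default agent id."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(
        description="Sketch of the docker_build.py options")
    parser.add_argument("-t", "--tag",
                        help="tag for the Docker image, like vX.Y")
    parser.add_argument("-a", "--agent_id",
                        default=environ.get("BASESPACE_AGENT_ID"),
                        help="local agent id from the Form Builder page")
    return parser.parse_args(argv)
```

With this arrangement, an explicit `-a` value wins over the environment variable, and the agent id is None when neither is supplied.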
If you want to distribute a stand-alone Windows executable version, you need to use PyInstaller.
-
If you haven't done all the previous steps, install Python and PyWin32.
-
If you want to rebuild the bowtie2 binaries, install bowtie2 as described above. You might also want ActivePerl if you plan to play with bowtie2's Perl wrapper scripts. MiCall doesn't require Perl, and you can just build with the bowtie2 binaries included in the git repository.
-
Install git for Windows, and clone the MiCall repository.
-
Follow the instructions above to install the Gotoh package from the micall/alignment folder.
-
Copy settings_default.py to settings.py and edit the settings. Point bowtie2 at the copy in the bin folder.
-
Try running micall.py and processing the FASTQ files in micall/tests/microtest.
-
Use pip to install pyinstaller.
pip install pyinstaller
-
Run pyinstaller.
cd git\micall
pyinstaller micall.spec
The application is created as dist\micall.exe.
-
Copy settings_default.py to settings.py, and open it for editing.
-
Change counting_processes to match the number of processors on your computer, and set mapping_processes to be that number divided by four.
-
Copy hostfile_default to hostfile, and open it for editing.
-
You probably just want to uncomment the localhost line.
-
Try the launch configurations. They are saved in the micall/tests/working directory, but you should see them if you open the Run menu and choose Run configurations.... If you want to run all steps at once, skip to the next step; otherwise, go through the numbered launch configurations in order. If you are not running under Eclipse, just run each command to display the list of command-line parameters.
-
Copy or link all the files from the microtest folder to the working folder.
-
Run the sample_pipeline or run_processor launch configurations. They will process all the sample files in the working folder. If you are not running under Eclipse, both commands take the run folder as a command-line parameter.
-
Run the unit tests. Either run them from Eclipse, or run them from the command line like this:
cd ~/git/MiCall
python -m unittest discover -p '*_test.py'
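For the counting_processes and mapping_processes settings in the steps above, the arithmetic can be sketched as follows. Rounding down and keeping at least one mapping process are assumptions, because the steps only say to divide by four. `process_settings` is a hypothetical helper, not part of settings.py.

```python
import multiprocessing


def process_settings(cpu_count=None):
    """Suggest process counts from the number of processors."""
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    counting = max(1, cpu_count)
    # Assumption: round down, but never go below one mapping process.
    mapping = max(1, counting // 4)
    return {"counting_processes": counting, "mapping_processes": mapping}
```

On an eight-processor machine this suggests eight counting processes and two mapping processes; on a two-processor machine it suggests two and one.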
If you want to run MISEQ_MONITOR.py, you have to set up data folders for raw data and for the working folders. You'll also need to set up the QAI project and the MiseqQCReport so you can download QC data and upload results.
- Create a data folder somewhere on your workstation, like ~/data. Create subdirectories called miseq and RAW_DATA. Add folders RAW_DATA/MiSeq/runs.
- Connect to the shared drive using CIFS and mount smb://192.168.68.144/RAW_DATA as /media/RAW_DATA.
- Navigate down to /media/RAW_DATA/MiSeq/runs, pick a recent folder, and make sure it has a file named needsprocessing.
- Copy SampleSheet.csv to a sample run folder under your local RAW_DATA/MiSeq/runs folder.
- Navigate down to Data\Intensities\BaseCalls, and copy a few of the .fastq.gz files to your sample run folder under Data/Intensities/BaseCalls.
- Copy the Interop folder and the files RunInfo.xml and runParameters.xml.
- Open settings.py for editing.
- Point home at your local data/miseq folder.
- Point rawdata_mount at your local RAW_DATA folder.
- Set the Oracle connection information to a test database where you can upload sequence data.
- Run the Ruby console for QAI and run LabMiseqRun.import('01-Jan-2000') for the date of your sample run.
- Run the MiSeqQCReport script to upload the QC data from the sample run folder.
- Run MISEQ_MONITOR.py. It doesn't take any arguments.
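The local folder layout from the first step, and the check for a needsprocessing file, can be sketched in Python. This is a minimal illustration under stated assumptions: `make_data_folders` and `runs_needing_processing` are hypothetical helpers, not part of MISEQ_MONITOR.py.

```python
from pathlib import Path


def make_data_folders(root):
    """Create miseq and RAW_DATA/MiSeq/runs under root; return the runs folder."""
    root = Path(root)
    runs = root / "RAW_DATA" / "MiSeq" / "runs"
    (root / "miseq").mkdir(parents=True, exist_ok=True)
    runs.mkdir(parents=True, exist_ok=True)
    return runs


def runs_needing_processing(runs_dir):
    """List the run folder names that contain a needsprocessing file."""
    return sorted(path.name
                  for path in Path(runs_dir).iterdir()
                  if path.is_dir() and (path / "needsprocessing").exists())
```

For example, `make_data_folders(Path.home() / "data")` sets up the folders, and `runs_needing_processing("/media/RAW_DATA/MiSeq/runs")` lists candidate runs on the mounted share.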
When you don't understand the pipeline's output, it can be helpful to look at the raw reads in a sequence viewer like Tablet. Run the micall_basespace script on a run with a single sample, like this:
python micall_basespace.py --debug_remap --all_projects --link_run /path/to/run /working/path
The options tell it to write the debug files, use all projects, link to the run
with the sample you're interested in, and put all the working files in the
given folder. Look through the scratch folders under the working path to find
the one for the sample you're interested in. The remap step writes the mapping
results as debug_remapX_debug.sam and debug_remapX_debug_ref.fasta, where X
is the remapping iteration number. You should be able to open an assembly
in Tablet using those two files. If the SAM file contains multiple regions,
you'll probably have to sort it with the micall/utils/sort_sam.py script.
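To show what sorting a SAM file by region means, here is a minimal sketch. It is not the code from sort_sam.py, which may behave differently; `sort_sam_lines` is a hypothetical helper that keeps header lines first and orders the alignment records by reference name (the third SAM field) and then by position (the fourth field).

```python
def sort_sam_lines(lines):
    """Return header lines followed by records sorted by RNAME and POS."""
    header = [line for line in lines if line.startswith("@")]
    records = [line for line in lines
               if line.strip() and not line.startswith("@")]

    def region_and_position(line):
        fields = line.split("\t")
        return fields[2], int(fields[3])  # RNAME, POS

    return header + sorted(records, key=region_and_position)
```

You could read the lines from a debug SAM file with `open(path).readlines()` and write the sorted result to a new file before opening it in Tablet.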
This section assumes you already have a working server up and running, and you just want to publish a new release. If you're setting up a new server, follow similar steps to setting up a development workstation. Follow these steps:
-
Check that all the issues in the current milestone are closed, and make sure the code works in your development environment. Run all the unit tests as described above, process the microtest data set in your local copy of Kive, and process all the samples from test_samples.csv using the release_test_*.py scripts to compare the results of the new release with the previous version. Get the comparison signed off to begin the release process.
-
Check if the kiveapi package needs a new release by looking for new commits. Make sure you tested with the latest version.
-
Determine what version number should be used next. Update the version number in settings_default.py if it hasn't been updated already, commit, and push.
-
Copy the previous pipeline on QAI/lab_miseq_pipelines to make a new version. Use the projects_dump.py script and compare projects.json to check that the projects match.
-
Check the history of the micall.alignment folder. If it has changed since the last release, then update the version number in setup.py.
-
Create a release on GitHub. Use "vX.Y" as the tag, where X.Y matches the version you used in settings_default.py. If you have to redo a release, you can create additional releases with tags vX.Y.1, vX.Y.2, and so on. Mark the release as pre-release until you finish deploying it.
-
Upgrade the scripts in Kive, and record the id of the new pipeline. You might find the Kive project's dump_pipeline.py and upload_pipeline.py scripts helpful. They are in the utils folder. First upgrade them on the test server, run them on a few samples, then upgrade them on the production server and run them on a few samples.
-
Stop the MISEQ_MONITOR.py process after you check that it's not processing any important runs.
ssh user@server
tail /data/miseq/micall.log
ps aux|grep MISEQ_MONITOR.py
sudo kill -int <process id from grep output>
-
Get the code from GitHub into the server's environment.
ssh user@server
cd /usr/local/share/MiCall
git fetch
git checkout tags/vX.Y
-
Check if you need to set any new settings by running diff micall/settings_default.py micall/settings.py. You will probably need to modify the version number and pipeline id, at least. Make sure that production = True.
-
Check if the gotoh package is up to date. If not, install it.
cd /usr/local/share/MiCall/micall/alignment
pip3 show gotoh
cat setup.py # compare version numbers
sudo python3 setup.py install
-
Check that the kiveapi package is the same version you tested with. If not, do a Kive release first.
cd /usr/local/share/Kive
pip3 show kiveapi
cat api/setup.py
-
Start the monitor, and tail the log to see that it begins processing all the runs with the new version of the pipeline. Before you launch, change all the working folders to be owned by the pipeline group.
cd /usr/local/share/MiCall
sudo chgrp -R micall /data/miseq
./MISEQ_MONITOR.py &
tail -f /data/miseq/micall.log
-
Launch the basespace virtual machine, and build a new Docker image from GitHub. Tag it with the release number. See the bash scripts above for an easy way to do this.
sudo docker build -t docker.illumina.com/cfe_lab/micall:vX.Y https://github.com/cfe-lab/MiCall.git
-
Push the new image to the repository. You might have to log in to docker before running this.
sudo docker push docker.illumina.com/cfe_lab/micall:vX.Y
-
Edit the callbacks.js in the form builder, and add the :vX.Y tag to the containerImageId field.
-
Activate the new revisions in the form builder and the report builder.
-
Send an e-mail to users describing the major changes in the release.
-
Close the milestone for this release, create one for the next release, and decide which issues you will include in that milestone.
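The tag naming used in the release steps above can be sketched as a pair of small functions, which may help keep the GitHub tag and the Docker image tag consistent. `release_tag` and `docker_image` are hypothetical helpers, and the version number in the usage note is only an example.

```python
# Image repository taken from the docker build and push commands above.
IMAGE_REPOSITORY = "docker.illumina.com/cfe_lab/micall"


def release_tag(version, redo=0):
    """Return vX.Y, or vX.Y.N when a release has to be redone N times."""
    tag = "v" + version
    if redo:
        tag += ".{}".format(redo)
    return tag


def docker_image(tag):
    """Return the full image name for a release tag."""
    return "{}:{}".format(IMAGE_REPOSITORY, tag)
```

For example, `docker_image(release_tag("7.5"))` gives the image name to use with docker build and docker push, and `release_tag("7.5", redo=1)` gives the tag for the first redone release.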