ResFinder identifies acquired antimicrobial resistance genes in totally or partially sequenced isolates of bacteria.
It is no longer recommended to clone the ResFinder Bitbucket repository unless you plan to do development work on ResFinder.
Instead, we recommend installing ResFinder using pip as described below.
There are several good reasons why the recommended installation procedure has changed. The repository has grown to several hundred megabytes because of the long history of ResFinder. Installing with pip is easier for users. It also ensures that your installation is a tested release of the application.
ResFinder consists of an application and 1-3 databases. The databases can be used without the application, but not the other way around. In the steps below, the ResFinder application is installed first, and then the databases are installed and configured to work with the application.
ResFinder uses two external alignment tools that must be installed.
- BLAST
- KMA
If you don't want to specify the path of BLAST every time you run ResFinder, make sure that "blastn" is in your PATH or set the environment variable specified in the "Environment Variables Table" in this README.
Blastn can be obtained from:
https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
# Example of how to set the environment variable in the bash shell. This setting is only temporary. To set it every time you log in, add this line to your .bashrc or .zshrc file.
export CGE_BLASTN="/path/to/some/dir/blastn"
If you don't want to specify the path of KMA every time you run ResFinder, make sure that KMA is in your PATH or set the environment variable specified in the "Environment Variables Table" in this README.
KMA can be obtained from:
https://bitbucket.org/genomicepidemiology/kma.git
# Example of how to set the environment variable in the bash shell. This setting is only temporary. To set it every time you log in, add this line to your .bashrc or .zshrc file.
export CGE_KMA="/path/to/some/dir/kma/kma"
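Before running ResFinder, it can be useful to confirm that both alignment tools can actually be found. The following is a minimal sketch, not part of ResFinder itself: the helper name `resolve_tool` is hypothetical, and the lookup order (environment variable first, then PATH) is an assumption that mirrors the description above.

```python
import os
import shutil


def resolve_tool(env_var, default_name, environ=None):
    """Return the path of an external tool, or None if it cannot be found.

    Hypothetical helper: uses the value of env_var if it is set, otherwise
    searches PATH for default_name.
    """
    environ = os.environ if environ is None else environ
    candidate = environ.get(env_var) or default_name
    # shutil.which accepts a bare name (searched on PATH) or an explicit path,
    # and returns None when no executable file is found.
    return shutil.which(candidate, path=environ.get("PATH"))


if __name__ == "__main__":
    for var, name in (("CGE_BLASTN", "blastn"), ("CGE_KMA", "kma")):
        found = resolve_tool(var, name)
        print(f"{name}: {found if found else 'NOT FOUND'}")
```

If either tool is reported as not found, add it to your PATH or set the corresponding environment variable as shown above.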
Important: This will install ResFinder in the environment where you run pip and potentially update the Python modules ResFinder depends on. It is recommended to run ResFinder in its own environment, in order to avoid breaking existing installations and to prevent ResFinder from being broken by future unrelated pip installations. This is described in the optional step below.
Go to the location where you want to store your environment.
# Create environment
python3 -m venv resfinder_env
# Activate environment
source resfinder_env/bin/activate
# When you are finished using ResFinder deactivate the environment
deactivate
pip install resfinder
If you don't want to specify the path to the databases every time you run ResFinder, you need to set the environment variable specified in the "Environment Variables Table" in this README.
Go to the location where you want to store the databases. Clone the databases you need.
Note: We are currently working on hosting tarballed versions of the databases that can be downloaded, so that cloning can be avoided.
git clone https://bitbucket.org/genomicepidemiology/resfinder_db/
git clone https://bitbucket.org/genomicepidemiology/pointfinder_db/
git clone https://bitbucket.org/genomicepidemiology/disinfinder_db/
Set the appropriate environment variables.
# Example of how to set the environment variables in the bash shell. These settings are only temporary. To set them every time you log in, add these lines to, for example, your .bashrc file.
export CGE_RESFINDER_RESGENE_DB="/path/to/some/dir/resfinder_db"
export CGE_RESFINDER_RESPOINT_DB="/path/to/some/dir/pointfinder_db"
export CGE_DISINFINDER_DB="/path/to/some/dir/disinfinder_db"
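A quick way to catch a mistyped database path is to check that each variable points to an existing directory. The sketch below is only an illustration: the function name `check_db_paths` is hypothetical, and it checks only that the directory exists, not that the databases inside are valid or indexed.

```python
import os

# Variable names taken from the export lines above.
DB_ENV_VARS = (
    "CGE_RESFINDER_RESGENE_DB",
    "CGE_RESFINDER_RESPOINT_DB",
    "CGE_DISINFINDER_DB",
)


def check_db_paths(environ=None):
    """Map each database variable to 'ok', 'unset', or 'not a directory'."""
    environ = os.environ if environ is None else environ
    status = {}
    for var in DB_ENV_VARS:
        value = environ.get(var)
        if not value:
            status[var] = "unset"
        elif os.path.isdir(value):
            status[var] = "ok"
        else:
            status[var] = "not a directory"
    return status


if __name__ == "__main__":
    for var, state in check_db_paths().items():
        print(f"{var}: {state}")
```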
The ResFinder application and the 3 databases have been built into a single image on Docker Hub named "genomicepidemiology/resfinder". Below is an example run, where the current working directory is bound to the container path "/app", which is the container working directory.
docker run -v "$(pwd):/app" genomicepidemiology/resfinder -ifa data/test_isolate_01.fa -o test1 -s ecoli --acquired --point
Test data can be found in the sub-directory tests/data
You can run ResFinder from the command line using Python.
NOTE: Species should be entered with their full scientific names (e.g. "escherichia coli"), enclosed in quotation marks. Names are not case sensitive. An attempt has been made to capture some deviations like "ecoli" and "e.coli", but far from all deviations will be captured.
# Example of running resfinder
python -m resfinder -o path/to/outdir -s "Escherichia coli" -l 0.6 -t 0.8 --acquired --point -ifq test_isolate_01_*
# The program can be invoked with the -h option
usage: __main__.py [-h] [-ifa INPUTFASTA] [-ifq INPUTFASTQ [INPUTFASTQ ...]] [--nanopore] -o OUTPUTPATH [-j OUT_JSON] [-b BLASTPATH] [-k KMAPATH] [-s SPECIES] [--ignore_missing_species] [-db_res DB_PATH_RES]
[-db_res_kma DB_PATH_RES_KMA] [-acq] [-ao ACQ_OVERLAP] [-l MIN_COV] [-t THRESHOLD] [-d] [-db_disinf DB_PATH_DISINF] [-db_disinf_kma DB_PATH_DISINF_KMA] [-c] [-db_point DB_PATH_POINT]
[-db_point_kma DB_PATH_POINT_KMA] [-g SPECIFIC_GENE [SPECIFIC_GENE ...]] [-u] [-l_p MIN_COV_POINT] [-t_p THRESHOLD_POINT] [--ignore_indels] [--ignore_stop_codons] [-v] [--pickle]
options:
-h, --help show this help message and exit
-ifa INPUTFASTA, --inputfasta INPUTFASTA
Input fasta file.
-ifq INPUTFASTQ [INPUTFASTQ ...], --inputfastq INPUTFASTQ [INPUTFASTQ ...]
Input fastq file(s). Assumed to be single-end fastq if only one file is provided, and assumed to be paired-end data if two files are provided.
--nanopore If nanopore data is used
-o OUTPUTPATH, --outputPath OUTPUTPATH
Output directory. If it doesnt exist, it will be created.
-j OUT_JSON, --out_json OUT_JSON
Specify JSON filename and output directory. If the directory doesnt exist, it will be created.
-b BLASTPATH, --blastPath BLASTPATH
Path to blastn
-k KMAPATH, --kmaPath KMAPATH
Path to KMA
-s SPECIES, --species SPECIES
Species in the sample
--ignore_missing_species
If set, species is provided and --point flag is set, will not throw an error if no database is found for the provided species. If species is not found. Point mutations will silently be ignored.
-db_res DB_PATH_RES, --db_path_res DB_PATH_RES
Path to the databases for ResFinder.
-db_res_kma DB_PATH_RES_KMA, --db_path_res_kma DB_PATH_RES_KMA
Path to the ResFinder databases indexed with KMA. Defaults to the value of the --db_res flag.
-acq, --acquired Run resfinder for acquired resistance genes
-ao ACQ_OVERLAP, --acq_overlap ACQ_OVERLAP
Genes are allowed to overlap this number of nucleotides. Default: 30.
-l MIN_COV, --min_cov MIN_COV
Minimum (breadth-of) coverage of ResFinder within the range 0-1.
-t THRESHOLD, --threshold THRESHOLD
Threshold for identity of ResFinder within the range 0-1.
-d, --disinfectant Run resfinder for disinfectant resistance genes
-db_disinf DB_PATH_DISINF, --db_path_disinf DB_PATH_DISINF
Path to the databases for DisinFinder.
-db_disinf_kma DB_PATH_DISINF_KMA, --db_path_disinf_kma DB_PATH_DISINF_KMA
Path to the DisinFinder databases indexed with KMA. Defaults to the value of the --db_res flag.
-c, --point Run pointfinder for chromosomal mutations
-db_point DB_PATH_POINT, --db_path_point DB_PATH_POINT
Path to the databases for PointFinder
-db_point_kma DB_PATH_POINT_KMA, --db_path_point_kma DB_PATH_POINT_KMA
Path to the PointFinder databases indexed with KMA. Defaults to the value of the --db_path_point flag.
-g SPECIFIC_GENE [SPECIFIC_GENE ...], --specific_gene SPECIFIC_GENE [SPECIFIC_GENE ...]
Specify genes existing in the database to search for - if none is specified all genes are included in the search.
-u, --unknown_mut Show all mutations found even if in unknown to the resistance database
-l_p MIN_COV_POINT, --min_cov_point MIN_COV_POINT
Minimum (breadth-of) coverage of Pointfinder within the range 0-1. If None is selected, the minimum coverage of ResFinder will be used.
-t_p THRESHOLD_POINT, --threshold_point THRESHOLD_POINT
Threshold for identity of Pointfinder within the range 0-1. If None is selected, the minimum coverage of ResFinder will be used.
--ignore_indels Ignore frameshift-causing indels in Pointfinder.
--ignore_stop_codons Ignore premature stop codons in Pointfinder.
-v, --version Show programs version number and exit
--pickle Create a pickle dump of the Isolate object. Currently needed in the CGE webserver. Dependency and this option is being removed.
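When ResFinder is run from a script, for example over many isolates, the command line can be assembled as an argument list. The sketch below is an illustration only: the function `build_resfinder_cmd` and its parameter names are hypothetical, and it covers just a handful of the flags documented in the help text above (`-ifa`, `-ifq`, `-o`, `-s`, `-l`, `-t`, `--acquired`, `--point`).

```python
import sys


def build_resfinder_cmd(outdir, species=None, fasta=None, fastq=(),
                        min_cov=None, threshold=None,
                        acquired=False, point=False):
    """Build an argument list for 'python -m resfinder' (hypothetical helper)."""
    # Exactly one input type is expected here: a FASTA file or FASTQ file(s).
    if bool(fasta) == bool(fastq):
        raise ValueError("provide either fasta or fastq, but not both")

    cmd = [sys.executable, "-m", "resfinder", "-o", outdir]
    if fasta:
        cmd += ["-ifa", fasta]
    else:
        # One file is treated as single-end, two files as paired-end.
        cmd += ["-ifq", *fastq]
    if species is not None:
        # Passed as a single list element, so no shell quoting is needed.
        cmd += ["-s", species]
    if min_cov is not None:
        cmd += ["-l", str(min_cov)]
    if threshold is not None:
        cmd += ["-t", str(threshold)]
    if acquired:
        cmd.append("--acquired")
    if point:
        cmd.append("--point")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Because the arguments are passed as a list rather than a shell string, a species name containing a space does not need extra quotation marks.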
Environment variables recognized by ResFinder, the flags they replace, and the default values for those flags. Command-line flags that are provided always take precedence. Environment variables that are set take precedence over default flag values.
Additional Environment variables can be added by appending entries to the file named "environment_variables.md".
Environment Variable | Flag | Default Value |
---|---|---|
CGE_KMA | kmaPath | kma |
CGE_BLASTN | blastPath | blastn |
CGE_RESFINDER_RESGENE_DB | db_path_res | None |
CGE_RESFINDER_RESPOINT_DB | db_path_point | None |
CGE_RESFINDER_GENE_COV | min_cov | 0.60 |
CGE_RESFINDER_GENE_ID | threshold | 0.80 |
CGE_RESFINDER_POINT_COV | min_cov_point | 0.60 |
CGE_RESFINDER_POINT_ID | threshold_point | 0.80 |
CGE_DISINFINDER_DB | db_path_disinf | None |
CGE_DISINFINDER_DB_KMA | db_path_disinf_kma | kma |
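The precedence rule described above (command-line flag, then environment variable, then default) can be written as a small lookup. This is a sketch of the rule as stated in this README, not ResFinder's actual implementation; the function name `resolve_setting` and its `cast` parameter are hypothetical.

```python
import os


def resolve_setting(flag_value, env_var, default, environ=None, cast=str):
    """Pick a setting: flag first, then environment variable, then default."""
    environ = os.environ if environ is None else environ
    if flag_value is not None:
        # A flag given on the command line always wins.
        return flag_value
    if env_var in environ:
        # Environment values are strings, so convert them when needed.
        return cast(environ[env_var])
    return default


if __name__ == "__main__":
    # No flag given: uses CGE_RESFINDER_GENE_COV if set, otherwise 0.60.
    print(resolve_setting(None, "CGE_RESFINDER_GENE_COV", 0.60, cast=float))
```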
The species option for ResFinder can take any scientific species name or "other". A few species abbreviations are accepted, as listed in the "species_abbreviations.md" file. If the option "--ignore_missing_species" is set, the application will not complain if the given species is not recognized. All recognized species have associated point mutation databases, but not all of them have associated species panels. Species panels indicate the AMR relevant for a particular species. The table below indicates which species are recognized by ResFinder. If only the genus is given, all species within the genus are recognized.
Species | Point Mutation DB | Species Panels |
---|---|---|
Campylobacter | yes | yes |
Campylobacter coli | yes | yes |
Campylobacter jejuni | yes | yes |
Enterococcus faecalis | yes | yes |
Enterococcus faecium | yes | yes |
Escherichia coli | yes | yes |
Helicobacter pylori | yes | no |
Klebsiella | yes | no |
Mycobacterium tuberculosis | yes | yes |
Neisseria gonorrhoeae | yes | no |
Plasmodium falciparum | yes | no |
Salmonella | yes | yes |
Staphylococcus aureus | yes | yes |
ResFinder understands the species abbreviations listed in the Species Abbreviations Table. Additional species abbreviations can be added by appending entries to the file "species_abbreviations.md".
Species | Abbreviation |
---|---|
campylobacter jejuni | c. jejuni |
campylobacter jejuni | c.jejuni |
campylobacter jejuni | c jejuni |
campylobacter jejuni | cjejuni |
campylobacter coli | c. coli |
campylobacter coli | c.coli |
campylobacter coli | c coli |
campylobacter coli | ccoli |
escherichia coli | e. coli |
escherichia coli | e.coli |
escherichia coli | e coli |
escherichia coli | ecoli |
salmonella enterica | s. enterica |
salmonella enterica | s.enterica |
salmonella enterica | s enterica |
salmonella enterica | senterica |
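The table follows a regular pattern: each species has four abbreviated spellings built from the first letter of the genus. The sketch below reproduces that pattern to normalize a species name before passing it to ResFinder. It is an illustration under that assumption and not how ResFinder itself reads "species_abbreviations.md"; the names `ABBREVIATIONS` and `normalize_species` are hypothetical.

```python
# The four species listed in the Species Abbreviations Table.
SPECIES = (
    "campylobacter jejuni",
    "campylobacter coli",
    "escherichia coli",
    "salmonella enterica",
)


def _variants(full_name):
    """Yield the four abbreviation spellings used in the table."""
    genus, epithet = full_name.split()
    initial = genus[0]
    yield f"{initial}. {epithet}"  # e.g. "e. coli"
    yield f"{initial}.{epithet}"   # e.g. "e.coli"
    yield f"{initial} {epithet}"   # e.g. "e coli"
    yield f"{initial}{epithet}"    # e.g. "ecoli"


ABBREVIATIONS = {abbr: name for name in SPECIES for abbr in _variants(name)}


def normalize_species(name):
    """Lower-case, collapse whitespace, and expand a known abbreviation."""
    key = " ".join(name.lower().split())
    return ABBREVIATIONS.get(key, key)
```

Names that are not in the table are returned lower-cased and otherwise unchanged, so full scientific names pass through as they are.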
A webserver implementing the methods is available at the CGE website and can be found here: https://cge.food.dtu.dk/services/ResFinder/
ResFinder outputs several files. A brief description of these is given below.
- pheno_table_species.txt: table with species specific AMR phenotypes.
- pheno_table.txt: table with all AMR phenotypes.
- PointFinder_prediction.txt: tab-separated table. A value of 1 indicates predicted resistance against an antibiotic class, and 0 indicates that no resistance was detected.
- PointFinder_results.txt: tab-separated table with predicted point mutations leading to antibiotic resistance.
- PointFinder_table.txt: predicted point mutations grouped into genes to which they belong.
- ResFinder_Hit_in_genome_seq.fsa: fasta sequence of resistance gene hits found in the input data (query).
- ResFinder_Resistance_gene_seq.fsa: fasta sequence of resistance gene hits found in the database (reference).
- ResFinder_results_table.txt: predicted resistance genes grouped by antibiotic class.
- ResFinder_results_tab.txt: tab-separated table with predicted resistance genes.
- ResFinder_results.txt: predicted resistance genes grouped by antibiotic class and hit alignments to reference resistance genes.
- <input_filename>.json: Output written to a CGE standardized json file. All results can be derived from this file. The format is defined here: https://bitbucket.org/genomicepidemiology/cgelib/src/master/src/cgelib/output/templates_json/beone/
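The tab-separated and JSON outputs listed above can be loaded with the Python standard library. The sketch below makes two assumptions that are not stated in this README: that the tab-separated files begin with a header row, and that they are UTF-8 encoded. The function names are hypothetical, and no particular column names or JSON keys are assumed.

```python
import csv
import json


def read_tab_table(path):
    """Read a tab-separated file into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def read_json_result(path):
    """Load the JSON output file into a Python dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `read_tab_table("ResFinder_results_tab.txt")` would return one dictionary per row, and `sorted(read_json_result(path))` would list the top-level keys of the JSON output, assuming its top level is a JSON object.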
When using the method please cite:
ResFinder 4.0 for predictions of phenotypes from genotypes.
Bortolaia V, Kaas RS, Ruppe E, Roberts MC, Schwarz S, Cattoir V, Philippon A, Allesoe RL, Rebelo AR, Florensa AR, Fagelhauer L,
Chakraborty T, Neumann B, Werner G, Bender JK, Stingl K, Nguyen M, Coppens J, Xavier BB, Malhotra-Kumar S, Westh H, Pinholt M,
Anjum MF, Duggett NA, Kempf I, Nykasenoja S, Olkkola S, Wieczorek K, Amaro A, Clemente L, Mossong J, Losch S, Ragimbeau C, Lund O, Aarestrup FM.
Journal of Antimicrobial Chemotherapy. 2020 Aug 11.
PMID: 32780112 doi: 10.1093/jac/dkaa345
[Epub ahead of print]
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics 2009; 10:421.
- Clausen PTLC, Aarestrup FM, Lund O. Rapid and precise alignment of raw reads against redundant databases with KMA. BMC Bioinformatics 2018; 19:307.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.