From 3f924a0c0f8eb9d6c677323ffd6039825a0d0ce6 Mon Sep 17 00:00:00 2001
From: Peter Skewes-Cox
Date: Fri, 25 Aug 2017 17:07:31 -0700
Subject: [PATCH] Added taxon to RefSeq genome FASTA files example
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Takes in the scientific name of a taxon on the command line (should usually be species or strain level to keep number of results manageable), retrieves taxID using `esearch | efetch | xtract`, which is nested by process substitution into `elink | efilter | efetch`. Tested and works in bash on 8/25/2017 – could use independent confirmation.
---
 README.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/README.md b/README.md
index 1e5ef38..21899ce 100644
--- a/README.md
+++ b/README.md
@@ -387,3 +387,18 @@ efetch -format xml | \
 xtract -pattern PubmedArticle -element MedlineCitation/PMID \
 -block PubDate -sep " " -element Year,Month MedlineDate
 ```
+
+### Given a taxon name, retrieve all RefSeq genomes for that taxon in FASTA format
+
+Description (optional): Takes in the scientific name of a taxon on the command line (should usually be species or strain level to keep number of results manageable), retrieves taxID using `esearch | efetch | xtract`, which is nested by process substitution into `elink | efilter | efetch`.
+Written by: Peter Skewes-Cox (8/25/2017)
+Confirmed by:
+Databases: taxonomy, nuccore
+
+```
+elink -db taxonomy -id $( esearch -db taxonomy -query "Hepatitis C virus" | \
+efetch -format docsum | \
+xtract -pattern DocumentSummary -element TaxId ) -target nuccore | \
+efilter -query "refseq" | \
+efetch -format fasta
+```
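
The description says the taxon name is taken on the command line, while the example hardcodes "Hepatitis C virus". A minimal sketch of a parameterized variant follows. It assumes bash and an installed EDirect; the function name `taxon_refseq_fasta` is hypothetical and not part of this patch. It runs the same `esearch | efetch | xtract` lookup first, stops early if no taxID comes back, and then runs the same `elink | efilter | efetch` pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, not part of the patch: same pipeline as the README
# example, with the taxon name passed as the first argument.
taxon_refseq_fasta() {
  local taxon="$1"
  if [ -z "$taxon" ]; then
    echo 'usage: taxon_refseq_fasta "Scientific name"' >&2
    return 1
  fi

  # Step 1: resolve the scientific name to taxID(s), as in the original.
  local taxids
  taxids=$(esearch -db taxonomy -query "$taxon" |
    efetch -format docsum |
    xtract -pattern DocumentSummary -element TaxId)

  if [ -z "$taxids" ]; then
    echo "no taxID found for: $taxon" >&2
    return 1
  fi

  # Step 2: link taxonomy -> nuccore, keep RefSeq records, fetch FASTA.
  # $taxids is left unquoted on purpose, mirroring the original's
  # unquoted $( ... ) substitution.
  elink -db taxonomy -id $taxids -target nuccore |
    efilter -query "refseq" |
    efetch -format fasta
}
```

Under those assumptions, `taxon_refseq_fasta "Hepatitis C virus" > hcv_refseq.fasta` would write the FASTA records to a file. Splitting the lookup into its own step makes an empty taxID result visible instead of passing an empty `-id` to `elink`.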