GitHub - Kirk3gaard/AccessingGenbank: A repository for automated download and processing of GenBank files using Python

This is a collection of small Python functions that I have used to automatically:

download GenBank files based on a list of GenBank IDs
subsequently extract information from the GenBank files based on fields such as "source location"
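The two steps above can be sketched with the standard library alone. This is a minimal illustration, not the code in "Genbank_module.py": the function names (`read_ids`, `efetch_url`, `download_genbank`, `source_location`) are hypothetical, the sketch targets Python 3 rather than the Python 2.7 setup described here, it assumes the IDs belong to the `nuccore` database, and it reads "source location" as the location string on the `source` line of the FEATURES table, which is only one possible interpretation.

```python
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities EFetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_ids(lines):
    """Return non-empty, stripped accession IDs from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def efetch_url(accession, db="nuccore"):
    """Build an EFetch URL that returns one record as GenBank flat text."""
    params = {"db": db, "id": accession, "rettype": "gb", "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


def download_genbank(accession, path):
    """Fetch one record and save it to `path` (requires network access)."""
    with urlopen(efetch_url(accession)) as response:
        text = response.read().decode("utf-8")
    with open(path, "w") as handle:
        handle.write(text)
    return text


# Matches the location on the "source" line of the FEATURES table
# (feature keys are indented by five spaces in the flat-file format).
SOURCE_RE = re.compile(r"^ {5}source\s+(\S+)", re.MULTILINE)


def source_location(genbank_text):
    """Return the location string of the first source feature, or None."""
    match = SOURCE_RE.search(genbank_text)
    return match.group(1) if match else None
```

A typical loop would call `read_ids(open("ACCESSION_IDs.txt"))`, then `download_genbank` and `source_location` for each ID. For anything beyond a single field, a full GenBank parser is a safer choice than a regular expression.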

The functions are derived from a number of posts at http://stackexchange.com/ that I looked through while solving this task. I have tested the functionality in the Anaconda environment (http://continuum.io/downloads) on a Windows PC.

(Python 2.7.6 |Anaconda 1.9.2 (64-bit)| (default, Nov 11 2013, 10:49:15) [MSC v.1500 64 bit (AMD64)] on win32)

I have attached:

A demo for running the workflow, "script.py"
A module with functions for downloading and parsing GenBank files, "Genbank_module.py"
An example file with GenBank IDs, "ACCESSION_IDs.txt"

Running "script.py" with these files in the working directory should produce:

an output file, "source_list.txt"
a number of GenBank files

A reminder

"In order not to overload the E-utility servers, NCBI recommends that users post no more than three URL requests per second" - http://www.ncbi.nlm.nih.gov/books/NBK25497/
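One way to respect this limit is to enforce a minimum interval between requests. The sketch below is an assumption about how that could be done, not something taken from "script.py": the `Throttle` class and its parameters are hypothetical names, and the clock and sleep functions are injectable only so the behaviour can be checked without real waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, max_per_second=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Sleep just long enough to keep calls at or below the rate limit."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` immediately before each download keeps a loop over the accession IDs at or below three requests per second with the default setting.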
