
Downloading FASTA Records with GI Number via efetch? #58

Open
mapauley opened this issue Jun 13, 2022 · 2 comments

Comments

@mapauley

mapauley commented Jun 13, 2022

Is there a way to tell efetch to download FASTA records so that the record headers include the GI number? For example, a given header would look like this:
>gi|2248537881|ref|NM_001407571.1| Homo sapiens BRCA1 DNA repair associated (BRCA1), transcript variant 6, mRNA
instead of like this:
>NM_001407571.1 Homo sapiens BRCA1 DNA repair associated (BRCA1), transcript variant 6, mRNA

The command I'm using is efetch -db nuccore -input accNosRandom.txt -format fasta > seq.fna, where accNosRandom.txt contains a list of accession numbers. It produces headers in the format without the GI number.

@vkkodali

NCBI phased out GIs a few years ago. They still remain in the ASN.1 records and can be fetched using efetch with the option -format gi, but they are not exposed in the FASTA files, and efetch has no option to include the GI number in FASTA output.
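
Since efetch itself cannot emit the old header style, one possible workaround is to rewrite the headers after downloading. The following is only a minimal sketch, not an NCBI-provided tool: the function name add_gi_to_headers and the acc_to_gi mapping are hypothetical, and the sketch assumes you have already paired each accession with its GI yourself (for example from the -format gi output mentioned above). It also assumes the database tag is "ref", which matches the RefSeq example in the question; other sources used different tags.

```python
def add_gi_to_headers(fasta_text, acc_to_gi, db_tag="ref"):
    """Rewrite '>ACC description' headers as '>gi|GI|db_tag|ACC| description'.

    acc_to_gi is a caller-supplied dict mapping accession.version to GI
    (hypothetical input; building it is outside this sketch).
    Headers whose accession is not in the mapping are left unchanged.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # The accession is everything up to the first space.
            acc, _, desc = line[1:].partition(" ")
            gi = acc_to_gi.get(acc)
            if gi is not None:
                line = f">gi|{gi}|{db_tag}|{acc}|"
                if desc:
                    line += " " + desc
        out.append(line)
    return "".join(l + "\n" for l in out)


if __name__ == "__main__":
    # The sequence line is a placeholder, not the real BRCA1 transcript.
    example = (
        ">NM_001407571.1 Homo sapiens BRCA1 DNA repair associated (BRCA1), "
        "transcript variant 6, mRNA\n"
        "ACGT\n"
    )
    print(add_gi_to_headers(example, {"NM_001407571.1": "2248537881"}), end="")
```

Leaving unmapped headers untouched is a deliberate choice here, so that records without a known GI are never given a made-up one.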

@mapauley
Author

Thanks for letting me know.
