Unable to download kraken2 database #904
Comments
Also, during the download: [INFO - 2025-01-15 12:05:11,062]: Calculating MD5 sum for /data/mohitsharma/kraken2_db/library/bacteria/genomes/all/GCF/031/477/395/GCF_031477395.1_ASM3147739v1/GCF_031477395.1_ASM3147739v1_genomic.fna.gz
The commands I used:
Hello, this is a strange occurrence. Is the gzipped file present in the stated location?
After installation I have these files. Which gzipped file are you talking about, and where is it supposed to be? And why are some files not downloading during the database download?
(base) mohitsharma@deep:/data/mohitsharma/kraken2$ !835
NCBI will sometimes randomly drop connections, either when the number of concurrent connections (threads) is high or when the client is trying to retrieve a large amount of data. To make sure all files are downloaded, k2 appends any file that failed to download to the list of files still to be retrieved and tries it again later. The log for a file that failed to download should look like this:
If that file is not retrieved later, then it is a bug. I hope that makes sense.
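The requeue behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not k2's actual code: `download_all`, `fetch`, `max_attempts`, and the fake fetcher are all hypothetical names, and the fetcher stands in for a real connection to NCBI.

```python
from collections import deque


def download_all(urls, fetch, max_attempts=3):
    """Try every URL; push failures to the back of the queue and retry later.

    `fetch` is any callable that raises OSError (ConnectionError is a
    subclass) when a download fails. Returns a (done, failed) pair of lists.
    """
    queue = deque((url, 1) for url in urls)
    done, failed = [], []
    while queue:
        url, attempt = queue.popleft()
        try:
            fetch(url)
        except OSError:
            if attempt < max_attempts:
                # Append to the end so the remaining files are tried first.
                queue.append((url, attempt + 1))
            else:
                failed.append(url)
        else:
            done.append(url)
    return done, failed


if __name__ == "__main__":
    dropped_once = set()

    def flaky_fetch(url):
        # Hypothetical stand-in: drops the first request for "b.fna.gz".
        if url == "b.fna.gz" and url not in dropped_once:
            dropped_once.add(url)
            raise ConnectionError("connection dropped")

    print(download_all(["a.fna.gz", "b.fna.gz", "c.fna.gz"], flaky_fetch))
```

In this sketch a file that keeps failing ends up in the `failed` list after `max_attempts` tries, which is the situation worth reporting.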
Thanks for the response, but my download was terminated midway.
This seems to have happened while downloading the bacteria library. If all the other libraries that make up the standard database downloaded successfully, you can then download the bacteria library separately by running:
If there are other libraries that need downloading, try rerunning:
I hope this helps.
[INFO - 2025-01-15 12:09:25,528]: Assigning taxonomic IDs to sequences
5/52169 project(s), 6 sequence(s), 12.30 Mbp
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/mohitsharma/miniforge-pypy3/lib/python3.10/concurrent/futures/process.py", line 246, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/data/mohitsharma/kraken2/kraken2/k2", line 800, in assign_taxids
    with open(out_filepath, "w") as out_file:
FileNotFoundError: [Errno 2] No such file or directory: 'genomes/all/GCF/000/007/145/GCF_000007145.1_ASM714v1/GCF_000007145.1_ASM714v1_genomic.fna'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data/mohitsharma/kraken2/kraken2/k2", line 3706, in <module>
    k2_main()
  File "/data/mohitsharma/kraken2/kraken2/k2", line 3675, in k2_main
    build_standard_database(args)
  File "/data/mohitsharma/kraken2/kraken2/k2", line 2505, in build_standard_database
    download_genomic_library(args)
  File "/data/mohitsharma/kraken2/kraken2/k2", line 1899, in download_genomic_library
    sequence_to_url = assign_taxid_to_sequences(
  File "/data/mohitsharma/kraken2/kraken2/k2", line 1184, in assign_taxid_to_sequences
    result = future.result()
  File "/home/mohitsharma/miniforge-pypy3/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/home/mohitsharma/miniforge-pypy3/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
FileNotFoundError: [Errno 2] No such file or directory: 'genomes/all/GCF/000/007/145/GCF_000007145.1_ASM714v1/GCF_000007145.1_ASM714v1_genomic.fna'
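For context, Python's `open(path, "w")` creates the file itself but not any missing parent directories, so it raises this same `FileNotFoundError` whenever a directory in the path does not exist. The path in the error is also relative, so it is resolved against the worker process's current working directory. The sketch below reproduces the general behaviour in a temporary directory. Whether this is what actually goes wrong at line 800 of `k2` is an assumption, and `write_with_parents` is a hypothetical helper, not part of `k2`.

```python
import os
import tempfile


def write_with_parents(path, text):
    """Create any missing parent directories, then write `text` to `path`."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "w") as out_file:
        out_file.write(text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Similar shape to the path in the traceback; the directories
        # below `root` do not exist yet. The file name is made up.
        target = os.path.join(root, "genomes", "all", "GCF", "example_genomic.fna")
        try:
            open(target, "w").close()
        except FileNotFoundError as err:
            print("plain open() failed:", err.strerror)
        write_with_parents(target, ">seq1\nACGT\n")
        print("file written:", os.path.exists(target))
```

If the library directory tree under the database path is incomplete (for example after an interrupted download), checking that the `genomes/all/...` directories exist relative to the library directory may help narrow this down.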