Error while running DIAMOND_subsystems_analysis_counter.py #86
Hi @adrianestrada1405, can you share the command you're running? Based on your error, it looks like one of the hits in the m8 file has no match in the Subsystems database being used as the reference. Is the set of tildes the only error message you see?
Hi! Thank you so much for your quick reply. The command I am running in a Linux terminal (miniconda) is as follows:
python DIAMOND_subsystems_analysis_counter.py -I S1_diamond_subsys.m8 -D subsys_db.fa -O -P
The full error message reads as follows:
1M lines processed so far in 1.4564619064331055 seconds.
Analysis of S1_diamond_subsys.m8 complete.
Starting database analysis now.
Success!
The first error message came up when I ran the same command without the -P flag (python DIAMOND_subsystems_analysis_counter.py -I S1_diamond_subsys.m8 -D subsys_db.fa -O). I had originally run it with -P because I noticed that some of the downstream post-processing R scripts need the receipt this Python script generates when run with the -P flag.
Hi, thanks for posting the additional info! My guess is that there may be a header in the Subsystems database file that has 'sseqid' in it, which doesn't match any actual row of data. Could you do a grep search for 'sseqid' in the database file and see what comes up?
I did the grep search you suggested but could not find a match in the database. What else could we try?
Is the key in the S1_diamond_subsys.m8 file? Could you do a grep search there as well? That phrase is part of the BLAST m8 format headers, so I wonder if it's a header line in the m8 file.
I just did the grep search and this came up:
qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
It is the header line of the .m8 file.
Got it, that makes sense! I may look into updating the scripts so that they check whether there's a header line and skip it if so. To unblock you immediately, though, you could just trim the header line off the m8 file and then re-run.
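For anyone hitting the same thing, trimming the header can be sketched in a few lines of Python. This is only an illustration, not part of the SAMSA2 scripts: the function name `strip_m8_header` is hypothetical, and it assumes the header (if present) is the first line and starts with the `qseqid` and `sseqid` column names shown above.

```python
def strip_m8_header(in_path, out_path):
    """Copy in_path to out_path, dropping a leading m8 header line if present.

    Hypothetical helper: assumes a header, when present, is the first line
    and begins with the column names 'qseqid' and 'sseqid'.
    Returns True if a header line was dropped, False otherwise.
    """
    dropped = False
    with open(in_path) as src, open(out_path, "w") as dst:
        for line_number, line in enumerate(src):
            # Only the very first line is ever treated as a possible header.
            if line_number == 0 and line.split()[:2] == ["qseqid", "sseqid"]:
                dropped = True
                continue
            dst.write(line)
    return dropped
```

Writing to a new file rather than editing in place keeps the original DIAMOND output available if the check guesses wrong.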
I ran it again and got the same error message, just with the first protein sequence this time: Success!
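Since the error now names a real protein sequence, one way to narrow it down is to list which subject IDs in the m8 file have no counterpart in the database FASTA. The sketch below is a diagnostic only and does not reproduce how the analysis script builds its own dictionary. The function names are hypothetical, and it assumes that the second tab-separated m8 column is the subject ID and that this ID equals the first whitespace-delimited token of a FASTA header line.

```python
def fasta_ids(fasta_path):
    """Collect the first whitespace-delimited token of every FASTA header."""
    ids = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    ids.add(parts[0])
    return ids


def unmatched_subject_ids(m8_path, fasta_path):
    """Return the set of m8 subject IDs (column 2) absent from the FASTA headers."""
    known = fasta_ids(fasta_path)
    missing = set()
    with open(m8_path) as handle:
        for line in handle:
            cols = line.rstrip("\n").split("\t")
            # Skip blank or malformed lines that lack a second column.
            if len(cols) > 1 and cols[1] not in known:
                missing.add(cols[1])
    return missing
```

If nearly every ID comes back as unmatched, the m8 file was probably produced against a different database than the one passed with -D, or the IDs are being parsed differently from what this sketch assumes.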
I tried running the DIAMOND_subsystems_analysis_counter.py script with the diamond .m8 output I generated by annotating against the subsystems database but I keep getting this message:
File "/work/ae180/Metatranscriptomics_files/DIAMOND_subsystems_analysis_counter.py", line 166, in
org = db_hier_dictionary[entry]
~~~~~~~~~~~~~~~~~~^^^^^^^
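The traceback points at a plain dictionary lookup, which fails as soon as one key is absent. A guarded version of that lookup, sketched below, would tally the missing keys instead of stopping at the first one. The names `db_hier_dictionary`, `entry` and `org` come from the traceback; the function `tally_hits` and its surrounding loop are hypothetical and are not taken from the actual script.

```python
from collections import Counter


def tally_hits(subject_ids, db_hier_dictionary):
    """Count hits per hierarchy entry, recording unknown IDs separately.

    Hypothetical sketch: returns (counts, missing), where counts maps each
    looked-up value to its number of hits and missing maps each ID that was
    not found in db_hier_dictionary to how often it occurred.
    """
    counts = Counter()
    missing = Counter()
    for entry in subject_ids:
        # .get returns None instead of raising KeyError for unknown IDs.
        org = db_hier_dictionary.get(entry)
        if org is None:
            missing[entry] += 1
        else:
            counts[org] += 1
    return counts, missing
```

Reporting the contents of `missing` at the end would surface a stray header token such as `sseqid` right away, rather than through a traceback.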