@SilasK @Stefan-L Are you hoping to annotate genes using a large sequence database (e.g., UniProt) or a small collection of custom genes that are of interest to your project? The best method depends on the answer.
In the first case, I think Silas' answer is already pretty complete. EggNOG/DRAM annotations should be fairly comprehensive, and DRAM can support other databases (e.g., UniRef) if needed. There are also other bioinformatics tools designed specifically for annotation against other sequence databases.
In the second case, the database will vary a lot depending on the project. If you can find HMMs that target the genes you are interested in (e.g., from Pfam or FunGene), you can use hmmsearch to quickly scan the ATLAS Genecatalog for hits, then analyze those hits in more detail using multiple sequence alignments and such. If you want to use MMseqs2, you can use the seed sequences of the HMM to create an MMseqs2 profile for the search. In other cases, tools exist that specifically profile the type of genes you are interested in (e.g., FeGenie for iron-cycling genes), and it might be best to use those tools directly on the ATLAS data.
MetAnnotate is mainly designed for annotating short read data, i.e., peptides predicted directly from unassembled metagenome reads. It relies on the RefSeq database for taxonomic annotation of short read hits.
Hope this helps!
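To make the hmmsearch route a bit more concrete, here is a minimal sketch of how the hits could be filtered afterwards. It assumes hmmsearch was run with its `--tblout` option so that it wrote a whitespace-delimited per-sequence table; the file names in the comment, the gene and profile names in the example rows, and the `1e-5` cutoff are all made up for illustration and are not ATLAS defaults.

```python
from typing import Iterable, List, NamedTuple

# Assumed upstream command (file names are hypothetical):
#   hmmsearch --tblout hits.tbl my_gene.hmm gene_catalog.faa


class Hit(NamedTuple):
    gene: str      # target sequence name (a gene catalog entry)
    hmm: str       # query profile name
    evalue: float  # full-sequence E-value
    score: float   # full-sequence bit score


def parse_tblout(lines: Iterable[str], max_evalue: float = 1e-5) -> List[Hit]:
    """Parse hmmsearch --tblout lines and keep hits at or below max_evalue."""
    hits = []
    for line in lines:
        # Skip blank lines and the '#' header/footer comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # Columns used: 0 target name, 2 query name,
        # 4 full-sequence E-value, 5 full-sequence score.
        hit = Hit(fields[0], fields[2], float(fields[4]), float(fields[5]))
        if hit.evalue <= max_evalue:
            hits.append(hit)
    # Most significant hits first.
    return sorted(hits, key=lambda h: h.evalue)


# Invented example rows in the tblout layout (not real results).
example = """\
# target name  accession  query name  accession  E-value  score  bias ...
Gene0000123  -  demo_hmm  -  1.2e-80  265.3  0.1  1.5e-80  265.0  0.1  1.0  1  0  0  1  1  1  1  -
Gene0004567  -  demo_hmm  -  3.0e-02   12.1  0.0  4.1e-02   11.7  0.0  1.2  1  0  0  1  1  1  0  -
""".splitlines()

hits = parse_tblout(example)
for hit in hits:
    print(hit.gene, hit.hmm, hit.evalue, hit.score)
```

The surviving gene IDs could then be pulled out of the Genecatalog for the multiple sequence alignment step. A fixed E-value cutoff is only one option; profile-specific bit score thresholds may be a better choice when the HMM provides them.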
-
@Stefan-L asked how to annotate genes with custom genes.
My response would be that it's best to first check whether anything in the EggNOG gene annotations or in the genome annotation produced by DRAM (e.g., KEGG) corresponds to your gene of interest. It is also possible to annotate MAGs against UniRef with DRAM, but this requires big databases.
So what should you do if you are interested in a new custom gene annotation?
First, you need some genes that are known to have your function. Then you can search the Genecatalog against them. Some time ago I made a small pipeline that does this: https://github.com/SilasK/MMprofiler
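As a rough illustration of the post-processing side of such a search (this is not MMprofiler's actual code), the sketch below assumes the reference genes were searched against the gene catalog with a tool that writes BLAST-style 12-column tab-separated output, with the reference as query (column 1), the catalog gene as target (column 2), the E-value in column 11 and the bit score in column 12. The function name, the example rows, and the `1e-5` cutoff are hypothetical.

```python
import csv
import io
from typing import Dict, Tuple


def best_reference_per_gene(
    m8_text: str, max_evalue: float = 1e-5
) -> Dict[str, Tuple[str, float]]:
    """Map each catalog gene to its best-scoring reference gene.

    Returns {gene_id: (reference_id, bit_score)} for hits passing max_evalue.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for row in csv.reader(io.StringIO(m8_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip malformed or empty lines
        reference, gene = row[0], row[1]
        evalue, bits = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        # Keep only the highest bit score seen so far for this gene.
        if gene not in best or bits > best[gene][1]:
            best[gene] = (reference, bits)
    return best


# Invented example rows (not real results).
rows = [
    ["refA", "Gene1", "0.90", "200", "20", "0", "1", "200", "1", "200", "1e-60", "210"],
    ["refB", "Gene1", "0.95", "200", "10", "0", "1", "200", "1", "200", "1e-70", "240"],
    ["refA", "Gene2", "0.30", "80", "56", "2", "1", "80", "5", "84", "0.5", "25"],
]
m8_text = "\n".join("\t".join(r) for r in rows)

annotation = best_reference_per_gene(m8_text)
for gene, (reference, bits) in annotation.items():
    print(gene, reference, bits)
```

Keeping only the top-scoring reference per gene is a deliberate simplification; for closely related gene families it may be worth keeping all passing hits and resolving them with an alignment or tree.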
I'm also interested in your ideas on MetAnnotate, @jmtsuji @LeeBergstrand.