<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="UTF-8">
<link rel="SHORTCUT ICON" href="http://mltreemap.org/treemap_images/favicon.ico">
<title>MLTreeMap: mapping DNA fragments onto the Tree of Life</title>
<link href="/css/treemap_standard_styles.v7234.css" rel="stylesheet" type="text/css">
<script type="text/javascript" async="" src="/javascript/ga.js"></script>
</head>
<body>
<script type="text/javascript">var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-9803518-2']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script');
ga.type = 'text/javascript';
ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(ga);
})();</script>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td align="center">
<table border="0">
<tbody>
<tr>
<td align="left" colspan="4" style="white-space: nowrap;">
<img src="/images/additional_images/top_mltreemap_logo.png" width="151" height="39" alt="">
</td>
</tr>
<tr>
<td colspan="4" align="left" style="text-align: justify;">
<h1 style="color: #cc181e;font-size: xx-large">MLTreeMap is not being actively maintained!</h1>
<p>No one is available to maintain it, so it is no longer offered as a service.
If you still want to use it, you can download it here and run it yourself. The source is available on <a href="https://github.com/meringlab/mltreemap">GitHub</a>.</p>
<p>If you'd like to take it over, please send us an email.</p>
</td>
</tr>
<tr>
<td colspan="4" align="left">
<h1>Downloads</h1>
</td>
</tr>
<tr>
<td colspan="4" align="left"><b>A) Phylogenetic Marker Genes</b><br> <br>MLTreeMap uses a selected
set of 40 protein-coding marker genes, deemed to be the most phylogenetically informative. <br>For
these, hand-curated alignments are available:
</td>
</tr>
<tr>
<td colspan="4"> </td>
</tr>
<tr>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0012.fa">COG0012.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0016.fa">COG0016.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0018.fa">COG0018.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0048.fa">COG0048.fa</a></td>
</tr>
<tr>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0049.fa">COG0049.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0052.fa">COG0052.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0080.fa">COG0080.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0081.fa">COG0081.fa</a></td>
</tr>
<tr>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0085.fa">COG0085.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0087.fa">COG0087.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0088.fa">COG0088.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0090.fa">COG0090.fa</a></td>
</tr>
<tr>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0091.fa">COG0091.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0092.fa">COG0092.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0093.fa">COG0093.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0094.fa">COG0094.fa</a></td>
</tr>
<tr>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0096.fa">COG0096.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0097.fa">COG0097.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0098.fa">COG0098.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0099.fa">COG0099.fa</a></td>
</tr>
<tr>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0100.fa">COG0100.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0102.fa">COG0102.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0103.fa">COG0103.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0124.fa">COG0124.fa</a></td>
</tr>
<tr>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0172.fa">COG0172.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0184.fa">COG0184.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0185.fa">COG0185.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0186.fa">COG0186.fa</a></td>
</tr>
<tr>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0197.fa">COG0197.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0200.fa">COG0200.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0201.fa">COG0201.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0202.fa">COG0202.fa</a></td>
</tr>
<tr>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0215.fa">COG0215.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0256.fa">COG0256.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0495.fa">COG0495.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0522.fa">COG0522.fa</a></td>
</tr>
<tr>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0525.fa">COG0525.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0533.fa">COG0533.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0541.fa">COG0541.fa</a></td>
<td style="padding: 3px;"><a class="normal_reference"
href="http://mltreemap.org/treemap_download/COG0552.fa">COG0552.fa</a></td>
</tr>
<tr>
<td colspan="4"> </td>
</tr>
<tr>
<td colspan="4"> </td>
</tr>
<tr>
<td colspan="4" align="left"><b>B) Reference Phylogeny</b><br> <br>MLTreeMap can be based on any
reference phylogeny of completely sequenced genomes. <br>We currently use an extended tree-of-life
phylogeny based on the phylogeny shown below (Ciccarelli et al., Science 2006):
</td>
</tr>
<tr>
<td colspan="4"> </td>
</tr>
<tr>
<td colspan="4">
<table border="0">
<tbody>
<tr>
<td align="left" valign="top"><img src="/download/tree_Feb15_72dpi.gif" alt="" width="499"
height="469"></td>
<td align="left">tree-of-life reference phylogeny:<br>
<a class="normal_reference"
href="http://mltreemap.org/treemap_download/tree_of_life_circular.png">tree_of_life_circular.png</a>
<br> <br>rRNA based reference phylogeny:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/rRNA_circular.png">rRNA_circular.png</a>
<br> <br>fungi reference phylogeny:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/fungi_circular.png">fungi_circular.png</a>
<br> <br>RuBisCo family phylogeny:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/RuBisCo_circular.png">RuBisCo_circular.png</a>
<br> <br>Nitrogenase (nifH) family phylogeny:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/nifH_circular.png">nifH_circular.png</a>
<br> <br>Nitrogenase (nifD) family phylogeny:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/nifD_circular.png">nifD_circular.png</a>
<br> <br>Methane monooxygenase family phylogeny:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/MMO_circular.png">MMO_circular.png</a>
<br> <br>HZO/HAO family phylogeny:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/hzo_hao_circular.png">hzo_hao_circular.png</a>
<br> <br>dsrAB family phylogeny:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/dsrAB_circular.png">dsrAB_circular.png</a>
<br> <br>photolyase/cryptochrome family phylogeny:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/phocryp_circular.png">phocryp_circular.png</a>
<br> <br>pufM family phylogeny:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/pufM_circular.png">pufM_circular.png</a>
<br> <br>Supplementary info for the tree-of-life reference phylogeny:
<br> <br>tree data in Newick format:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/tree_of_life.txt">tree_of_life.txt</a>
<br> <br>underlying protein alignment (Phylip format):<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/tree_of_life.phy">tree_of_life.phy</a>
</td>
</tr>
<tr>
<td colspan="2" align="left">(C) Science Magazine, 2006</td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td colspan="4"> </td>
</tr>
<tr>
<td colspan="4"> </td>
</tr>
<tr>
<td colspan="4" align="left"><b>C) Stand-alone MLTreeMap</b><br> <br>The pipeline of MLTreeMap can
be downloaded and installed individually:
</td>
</tr>
<tr>
<td colspan="4" align="left"><br> <br>Version history:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/Version_history.pdf">Version_history.pdf</a>
<br> <br>The MLTreeMap stand-alone package:<br>
<a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2_061.tar.gz">MLTreeMap_package_2_061.tar.gz</a>
<br> <br>A guide to the installation and usage of MLTreeMap:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/MLTreeMap_guide_2_061.pdf">MLTreeMap_guide_2_061.pdf</a>
<br> <br>The MLTreeMap imagemaker:<br>
<a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_2_061.tar.gz">MLTreeMap_imagemaker_2_061.tar.gz</a>
<br> <br>A guide to the installation and usage of the MLTreeMap imagemaker:<br>
<a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_guide_2_061.pdf">MLTreeMap_imagemaker_guide_2_061.pdf</a>
<br> <br>A guide to adding new reference phylogenies to MLTreeMap:<br>
<a class="normal_reference" href="http://mltreemap.org/treemap_download/Addon_guide.pdf">Addon_guide.pdf</a>
<br> <br>Previous versions:
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2_06.tar.gz">MLTreeMap_package_2_06.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2_051.tar.gz">MLTreeMap_package_2_051.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2_05.tar.gz">MLTreeMap_package_2_05.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2_04.tar.gz">MLTreeMap_package_2_04.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2_034.tar.gz">MLTreeMap_package_2_034.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2_032.tar.gz">MLTreeMap_package_2_032.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2_031.tar.gz">MLTreeMap_package_2_031.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2_03.tar.gz">MLTreeMap_package_2_03.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2.011.tar.gz">MLTreeMap_package_2.011.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_package_2.01.tar.gz">MLTreeMap_package_2.01.tar.gz</a>
<br><a class="normal_reference" href="http://mltreemap.org/treemap_download/MLTreeMap_guide_2_06.pdf">MLTreeMap_guide_2_06.pdf</a>
<br><a class="normal_reference" href="http://mltreemap.org/treemap_download/MLTreeMap_guide_2_051.pdf">MLTreeMap_guide_2_051.pdf</a>
<br><a class="normal_reference" href="http://mltreemap.org/treemap_download/MLTreeMap_guide_2_05.pdf">MLTreeMap_guide_2_05.pdf</a>
<br><a class="normal_reference" href="http://mltreemap.org/treemap_download/MLTreeMap_guide_2_04.pdf">MLTreeMap_guide_2_04.pdf</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_guide_2_034.pdf">MLTreeMap_guide_2_034.pdf</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_guide_2_032.pdf">MLTreeMap_guide_2_032.pdf</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_guide_2_031.pdf">MLTreeMap_guide_2_031.pdf</a>
<br><a class="normal_reference" href="http://mltreemap.org/treemap_download/MLTreeMap_guide.pdf">MLTreeMap_guide.pdf</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_2_06.tar.gz">MLTreeMap_imagemaker_2_06.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_2_051.tar.gz">MLTreeMap_imagemaker_2_051.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_2_05.tar.gz">MLTreeMap_imagemaker_2_05.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_2_04.tar.gz">MLTreeMap_imagemaker_2_04.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_2_032.tar.gz">MLTreeMap_imagemaker_2_032.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_2_03.tar.gz">MLTreeMap_imagemaker_2_03.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker.tar.gz">MLTreeMap_imagemaker.tar.gz</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_guide_2_06.pdf">MLTreeMap_imagemaker_guide_2_06.pdf</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_guide_2_051.pdf">MLTreeMap_imagemaker_guide_2_051.pdf</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_guide_2_05.pdf">MLTreeMap_imagemaker_guide_2_05.pdf</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_guide_2_04.pdf">MLTreeMap_imagemaker_guide_2_04.pdf</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_guide_2_032.pdf">MLTreeMap_imagemaker_guide_2_032.pdf</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_guide_2_03.pdf">MLTreeMap_imagemaker_guide_2_03.pdf</a>
<br><a class="normal_reference"
href="http://mltreemap.org/treemap_download/MLTreeMap_imagemaker_guide.pdf">MLTreeMap_imagemaker_guide.pdf</a>
</td>
</tr>
<tr>
<td colspan="4" align="left" style="text-align: justify;"><h1>Documentation</h1>
<p>MLTreeMap analyzes DNA sequences and determines their most likely phylogenetic origin. Its main use is in
metagenomics projects, where DNA is isolated directly from natural environments and sequenced (the
organisms from which the DNA originates are often entirely undescribed). MLTreeMap searches such
sequences for suitable marker genes and uses maximum likelihood analysis to place them in the 'Tree
of Life'. This placement is more reliable than simply identifying the closest relative of a sequence using
BLAST. More importantly, MLTreeMap determines not only which organism is the closest relative of your query
sequence, but also how deep in the tree of life the sequence probably branched off.<br>Additionally, MLTreeMap
searches the sequences for genes that code for key enzymes of important functional pathways, such as RuBisCO,
methane monooxygenase or nitrogenase. In case of a positive hit, MLTreeMap uses maximum likelihood
analysis to place them in the respective 'gene-family tree'.</p>
<h2>Phylogenetic markers</h2>
<p>A set of 40 protein-coding, universally occurring <a class="normal_reference"
href="/treemap_html/marker_genes.txt">marker
genes</a> is used to phylogenetically assess environmental sequencing data. This set of genes has been
described previously [<a class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Retrieve&dopt=Abstract&list_uids=16513982">ref</a>],
and has been chosen based on systematic searches of fully sequenced genomes: the genes were required to be
universally present in all genomes known to date (including archaea and eukaryotes), and were selected
such that the average number of paralogous copies in each genome was as low as possible. The rationale
behind this choice is that such genes are apparently under strong selection against both gene loss, and
against copy number variations. This should make them least likely to tolerate horizontal gene transfer
(since horizontal transfers presumably entail episodes of either gene-absence or multiple gene copies);
such genes should therefore be most likely to represent species phylogeny. Some remaining cases of
horizontal transfer have been detected manually; these have been neutralized by artificially pruning
marker genes from the putative acceptor organisms (such that in these organisms, these genes are
considered 'missing data' in subsequent analyses [<a class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Retrieve&dopt=Abstract&list_uids=16513982">ref</a>]).
Likewise, paralogs and additional gene copies derived from organelles were removed, until each gene family
was represented by no more than a single, full-length sequence in each reference organism.<br>In addition
to the phylogenetic analysis described above, MLTreeMap performs a second one, which relies on 16S and 18S
rRNA sequences.</p>
<h2>Functional markers</h2>
<p>The following gene families have been chosen to assess functional properties of environmental
communities, because their gene products are key enzymes in the respective metabolic pathways: <a
class="normal_reference" href="http://www.ncbi.nlm.nih.gov/pubmed/15950120">RuBisCO</a> (<a
class="normal_reference"
href="http://string-db.org/newstring_cgi/show_network_section.pl?targetmode=cogs&identifier=COG1850">COG1850</a>)
is the key enzyme of the Calvin cycle and thus essential for photosynthetic activity. <a
class="normal_reference" href="http://pubs.acs.org/doi/abs/10.1021/bi0497603">Methane monooxygenase</a>
(sequences obtained from the <a class="normal_reference" href="http://fungene.cme.msu.edu/">FunGene
database</a>) is essential for methane oxidation. <a class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/pubmed/18973625">HZO
and HAO</a> are among the key enzymes of the nitrification reaction. <a class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/pubmed/14694078">Nitrogenase</a>
(nifD: <a class="normal_reference" href="http://www.genome.jp/dbget-bin/www_bget?ko+K02586">K02586</a>,
nifH: <a class="normal_reference" href="http://www.genome.jp/dbget-bin/www_bget?ko+K02588">K02588</a>) is
essential for nitrogen fixation. The <a class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/pubmed/18826437">dsrAB</a> gene
is a marker for sulfur-oxidizing and sulfate-reducing prokaryotes. <a class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1175950/">Cryptochromes
and photolyases</a> are a family of photoreceptors and DNA repair enzymes, respectively, and have been
used to detect functional novelty by <a class="normal_reference"
href="http://jb.asm.org/cgi/content/full/191/1/32">Singh et al.
2009</a>.</p>
<h2>Detection of marker genes</h2>
<p>Marker genes are detected within your input query sequence using <a class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/BLAST/">BLAST</a>,
by searching the DNA against clusters of orthologous groups (COGs). These COGs are maintained at the
extended COG database on the STRING server [<a class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/pubmed/18940858?ordinalpos=1&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_DefaultReportPanel.Pubmed_RVDocSum">ref</a>]
for all marker genes except the methane monooxygenase and the nitrogenase genes where the COGs have been
derived from the KEGG database [<a class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/pubmed/16381885?ordinalpos=2&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_DefaultReportPanel.Pubmed_RVDocSum">ref</a>].
COG matches are called for any sequence section whose first hit is a protein already assigned to a COG,
as long as the BLAST score is better than 60 bits (multiple COG mappings are allowed, unless they overlap
by more than 50% of the length of the shorter assignment). Each open reading frame that is found to map
to one of the marker gene COGs is then re-aligned to all members of that COG using <a
class="normal_reference" href="http://hmmer.janelia.org/">HMMALIGN</a>. In cases where a single DNA
fragment maps to more than one marker gene, the alignments are concatenated. Finally, gaps in the
alignments are removed using GBLOCKS, with the following settings: Maximum Number Of Contiguous
Nonconserved Positions: 15; Minimum Length Of A Block: 3; Allowed Gap Positions: with half; Minimum Number
Of Sequences For A Flank Position: 55% of the Sequences.</p>
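<p>The match-calling rule above can be sketched in a few lines of Python. This is a minimal illustration, not the MLTreeMap implementation: the <code>Hit</code> record, the function names and the greedy best-score-first order for resolving overlaps are assumptions; only the 60-bit threshold and the 50% overlap limit come from the text.</p>

```python
from dataclasses import dataclass

MIN_BIT_SCORE = 60.0         # threshold stated in the text
MAX_OVERLAP_FRACTION = 0.5   # fraction of the shorter assignment (from the text)


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one BLAST hit on the query sequence."""
    cog: str
    start: int        # 1-based, inclusive query coordinates
    end: int
    bit_score: float


def overlap_fraction(a: Hit, b: Hit) -> float:
    """Shared query positions divided by the length of the shorter hit."""
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    shorter = min(a.end - a.start + 1, b.end - b.start + 1)
    return shared / shorter


def call_cog_matches(hits):
    """Keep hits above the score threshold that do not overlap a kept hit too much.

    Assumption: conflicts are resolved greedily, best bit score first.
    """
    accepted = []
    for hit in sorted(hits, key=lambda h: h.bit_score, reverse=True):
        if hit.bit_score <= MIN_BIT_SCORE:
            continue
        if all(overlap_fraction(hit, kept) <= MAX_OVERLAP_FRACTION for kept in accepted):
            accepted.append(hit)
    return accepted
```

<p>With this sketch, a fragment carrying two marker genes that overlap by only a few positions keeps both assignments, while a weaker hit nested inside a stronger one is dropped.</p>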
<h2>Maximum likelihood scoring</h2>
<p>After the above step, each query DNA fragment with at least one marker gene is represented by a multiple
sequence alignment (this alignment contains the known sequences from this marker gene family, plus a
single stretch of novel sequence). For all the known sequences in the alignment, their phylogenetic
relations are assumed to be that of the externally provided reference phylogeny of complete genomes [<a
class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Retrieve&dopt=Abstract&list_uids=16513982">ref</a>]
or the one of the GEBA phylogeny [<a class="normal_reference"
href="http://www.ncbi.nlm.nih.gov/pubmed/20033048">ref</a>]. The
novel sequence (your query) could in principle be branching anywhere in that tree. The possible branching
positions effectively define an ensemble of trees, which are all identical except for the position of the
query sequence. We analyze these ensembles using <a class="normal_reference"
href="http://icwww.epfl.ch/~stamatak/index-Dateien/Page443.htm">RAxML</a>,
in the context of the above alignment, employing the same maximum likelihood model (and settings) as were
used to generate the reference phylogeny itself. This procedure results in a maximum likelihood score for
each tree in the ensemble, and the most likely tree then defines the most probable placement of the query
sequence.</p>
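<p>The idea of an ensemble of trees that differ only in the position of the query can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the reference tree is written as a rooted tree of nested tuples, the names <code>placements</code> and <code>most_likely_placement</code> are hypothetical, and the likelihood function is supplied by the caller (in MLTreeMap the scores come from RAxML, which is not invoked here).</p>

```python
from typing import Callable, Iterator, Union

Tree = Union[str, tuple]  # a leaf name, or a tuple of child subtrees


def placements(tree: Tree, query: str = "QUERY") -> Iterator[Tree]:
    """Yield every tree obtained by attaching the query to one branch.

    The reference topology itself is never changed; the query is only
    added as a sister of one existing subtree.
    """
    # Attach the query on the branch leading to this subtree.
    yield (tree, query)
    # Recurse into the children, rebuilding this node around each result.
    if isinstance(tree, tuple):
        for i, child in enumerate(tree):
            for placed in placements(child, query):
                yield tree[:i] + (placed,) + tree[i + 1:]


def most_likely_placement(tree: Tree, log_likelihood: Callable[[Tree], float]) -> Tree:
    """Return the ensemble member with the highest score.

    `log_likelihood` is a caller-supplied stand-in for the real ML scoring.
    """
    return max(placements(tree), key=log_likelihood)
```

<p>For a rooted binary reference tree with three taxa, such as <code>(("A", "B"), "C")</code>, this enumeration yields five candidate trees, one per branch including the branch above the root.</p>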
<p>Often, however, more than one placement in the reference tree is possible, and these can be almost
equally likely - especially in the case of short (or partial) query sequences, which may not contain
enough phylogenetic information. We employ two measures here to avoid unjustified precision when assigning
such sequences. Firstly, we require a minimum length of informative sequence in each query: this cut-off
is set at 80 columns of blocked alignment (shorter queries are not considered). Secondly, we assign
queries to more than one position in the reference tree if necessary (giving them a fractional weight at
each position). To do this, we use bootstrap values calculated by <a class="normal_reference"
href="http://icwww.epfl.ch/~stamatak/index-Dateien/Page443.htm">RAxML</a>.
</p>
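<p>The two measures can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, counting non-gap characters of the query row is an assumption about what "informative" means, and treating each bootstrap replicate as one vote for a branch is an assumed way of turning bootstrap values into fractional weights. Only the 80-column cut-off is taken from the text.</p>

```python
from collections import Counter

MIN_BLOCKED_COLUMNS = 80  # cut-off stated in the text


def passes_length_cutoff(blocked_query: str) -> bool:
    """Check the query row of the blocked alignment against the cut-off.

    Assumption: only non-gap characters count as informative columns.
    """
    informative = sum(1 for residue in blocked_query if residue != "-")
    return informative >= MIN_BLOCKED_COLUMNS


def placement_weights(bootstrap_placements):
    """Turn per-replicate placements into fractional weights per branch.

    `bootstrap_placements` holds one branch label per bootstrap replicate
    (assumed input format); the returned weights sum to 1.
    """
    counts = Counter(bootstrap_placements)
    total = sum(counts.values())
    return {branch: n / total for branch, n in counts.items()}
```

<p>For example, if 60 of 100 replicates place the query on one branch, 30 on a second and 10 on a third, the query is assigned to those branches with weights 0.6, 0.3 and 0.1.</p>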
<p>The final result of the above step is a likely placement of the query sequence in the reference tree
(broken down into a weighted distribution of placements if necessary). Note that the branching pattern of
the reference phylogeny itself is never altered - only the novel sequence is assessed, relative to the
fixed reference phylogeny. </p>
<h2>Visualization</h2>
<p>In the last step, we visualize the placement of the query sequence in the context of the reference tree
(using in-house tree drawing software). The position of the blue bubble in the tree illustrates the most
likely branching position of the unknown environmental organism from which your DNA presumably originated.
If several bubbles are visible, then the placement was not possible with 100% confidence. In that case,
the relative sizes of the bubbles show the relative weights of the placements. The placements are
additionally 'projected' onto the reference taxa (as bar-charts, merely for illustration): each placement
is distributed among the reference taxa that are descendants of the placement's branching position,
dividing the weight evenly at each bifurcation in the tree while proceeding from the actual placement up
to the tips of the tree.</p>
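<p>The projection onto the reference taxa can be sketched in Python as below. The nested-tuple tree representation and the function name <code>project_onto_taxa</code> are assumptions made for illustration; the in-house drawing software is not shown here.</p>

```python
def project_onto_taxa(subtree, weight, totals=None):
    """Distribute a placement weight over the taxa below its branching position.

    `subtree` is the part of the reference tree below the placement, given as
    nested tuples with taxon names as leaves (assumed representation). The
    weight is divided evenly among the children at every split.
    """
    if totals is None:
        totals = {}
    if isinstance(subtree, tuple):
        share = weight / len(subtree)
        for child in subtree:
            project_onto_taxa(child, share, totals)
    else:
        # Reached a tip: accumulate the weight for this reference taxon.
        totals[subtree] = totals.get(subtree, 0.0) + weight
    return totals
```

<p>A placement with weight 1.0 above the subtree <code>(("A", "B"), "C")</code> gives taxon C a share of 0.5, and taxa A and B a share of 0.25 each. Passing the same <code>totals</code> dictionary to several calls accumulates the bars for a weighted distribution of placements.</p>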
<h2>Contact</h2>
<p>In case you have any further questions, comments or suggestions, please email the authors: <a
class="normal_reference" href="mailto:mering@imls.uzh.ch">mering[at]imls.uzh.ch</a>.</p></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</body>
</html>