-
Notifications
You must be signed in to change notification settings - Fork 8
/
Copy pathdsf_manual.html
254 lines (215 loc) · 10 KB
/
dsf_manual.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
<head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Dense Subgraph Finder 1.0 Manual</title>
<style type="text/css">
.code {
background-color: lightgray;
}
</style>
<style>
</style>
</head>
<body>
<h1>DSF (Dense Subgraph Finder) 1.0 manual</h1>
1. <a href="#intro">What is Dense Subgraph Finder?</a><br>
2. <a href="#install">Installation</a><br>
3. <a href="#dsf_usage">DSF usage</a><br>
3.1. <a href="#dsf_basic">Basic options</a><br>
3.2. <a href="#dsf_advanced">Advanced options</a><br>
3.3. <a href="#dsf_examples">Examples</a><br>
3.4. <a href="#dsf_output">Output files</a><br>
4. <a href="#files_format">GRAPH file format</a><br>
5. <a href="#feedback">Feedback and bug reports</a><br>
5.1. <a href="#citation">Citation</a><br>
<!-- -------- --->
<h2 id = "intro">1. What is Dense Subgraph Finder?</h2>
<p>
Problem of finding corrupted cliques is formulated as follows:
<i>to find minimal number of edge additions and removals to convert graph into a set of cliques.</i>
Search for corrupted cliques (or <i>dense subgraphs</i>) of some graph is a very common problem arising in bioinformatics
(e.g., co-expression of genes) and studies of social interactions (e.g., recommendation services).
DSF tool takes as an input indirected graph in <a href = "#files_format">GRAPH format</a> and
outputs decomposition where the identical ids are assigned for vertices from the same dense subgraph.
Density of subgraph on N vertices and M edges is computed as ratio of M to maximal possible number of edges (N * (N - 1) / 2) and
can be tuned by user.
</p>
<!-- -------- --->
<h2 id = "install">2. Installation</h2>
DSF comes as a part of IgRepertoireConstructor package.</br>
See <a href = manual.html#install>IgRepertoireConstructor manual</a> for installation instructions.</br>
Please verify your DSF installation prior to initiate the DSF:
<pre class="code">
<code>
./dense_subgraph_finder.py --test
</code>
</pre>
If the installation is successful, you will find the following information at the end of the log:
<pre class="code">
<code>
Thank you for using Dense Subgraph Finder!
Log was written to <dsf_installation_dir>/dsf_test/dense_subgraph_finder.log
</code>
</pre>
<!-- -------- --->
<h2 id = "dsf_usage">3. DSF usage</h2>
<p>
DSF takes as an input indirected (weighted or unweighted) graph in GRAPH format and
constructs dense subgraph decomposition.
</p>
To run DSF, type:
<pre class="code">
<code>
./dense_subgraph_finder.py [options] -g <input.graph> -o <output_dir>
</code>
</pre>
<!-- --->
<h3 id = "dsf_basic">3.1. Basic options</h3>
<code>-g <input.graph></code><br>
indirected graph in GRAPH format (required).
<br><br>
<code>-o / --output <output_dir></code><br>
output directory (required).
<br><br>
<code>-t/--threads <int></code><br>
The number of parallel threads. The default value is <code>16</code>.
<br><br>
<code>--test</code><br>
Running on the toy test dataset. Command line corresponding to the test run is equivalent to the following:
<pre class = "code">
<code>
./dense_subgraph_finder.py -g test_dataset/test.graph -o dsf_test
</code>
</pre>
<code>--help</code><br>
Printing help.
<br><br>
<!-- --->
<h3 id = "dsf_advanced">3.2. Advanced options</h3>
<code>-f / --min-fillin <float></code><br>
Expected minimal edge fill-in of the constructed dense subgraphs.
Default value is <code>0.6</code>.
<br><br>
<code>-s / --min-size <int></code><br>
Minimal size of vertices in connected components of input graph where dense subgraphs will be computed.
Connected components which size do not exceed <code>min-size</code> will be decomposed trivially (all vertices are in the same dense subgraph).
Default value is <code>5</code>.
<br><br>
<code>-n / --min-snode-size <int></code><br>
Minimum node weight that is required to consider node heavy.
DSF algorithm does not glue two heavy vertices into the same dense subgraph.
If graph is edge weighted only, vertex weights are equal to 1.
Default value is <code>5</code>.
<br><br>
<code>--save-aux-files</code><br>
Saving dense subgraph decompositions for connected components.
<br><br>
<!-- --->
<h3 id = "dsf_examples">3.3. Examples</h3>
To construct dense subgraph decomposition for graph <code>input.graph</code> with
minimal edge fill-in value <code>0.9</code>, type:
<pre class="code">
<code>
./dense_subgraph_finder.py -g input.graph -f 0.9 -o dsf_test
</code>
</pre>
<!-- --->
<h3 id = "dsf_output">3.4. Output files</h3>
DSF creates working directory (which name was specified using option <code>-o</code>)
and outputs the following files there:
<ul>
<li><b>dense_subgraphs.txt</b> — dense subgraph decomposition in TXT format,
<i>i</i>-th line of <b>dense_subgraphs.txt</b> file corresponds to <i>i</i>-th vertex and contains index (zero-based) of dense subgraph contaning this vertex.
</li>
<li>
<b>dense_subgraph_finder.log</b> — full log of DSF.
</li>
</ul>
<!-- -------- --->
<h2 id = "files_format">4. GRAPH file format</h2>
DSF takes undirected edge weighted or unweigthed graphs in
<a href = "http://people.sc.fsu.edu/~jburkardt/data/metis_graph/metis_graph.html">GRAPH format</a>.
Below are examples representation of unweigthed (left) and edge weighted (middle) and edge & vertex weighted (right) graphs in GRAPH format:
<table width = 80% align = center>
<tr>
<td width="30%" align = center>
<img src = docs/manual_figs/unweighted_graph.png width = 80%>
</td>
<td width = 3%></td>
<td width="30%" align = center>
<img src = docs/manual_figs/edge_weighted_graph.png width = 80%>
</td>
<td width = 3%></td>
<td width="30%" align = center>
<img src = docs/manual_figs/edge_vertex_weighted_graph.png width = 80%>
</td>
</tr>
<tr>
<td align = center>
7 9 <br>
2 3 <br>
1 3 <br>
1 2 4 6 7 <br>
3 6 7 <br>
<br>
3 4 7 <br>
3 4 6
</td>
<td width = 3%></td>
<td align = center>
7 9 001<br>
2 <font color="blue">20</font> 3 <font color="blue">-3</font> <br>
1 <font color="blue">20</font> 3 <font color="blue">4.5</font><br>
1 <font color="blue">-3</font> 2 <font color="blue">4.5</font> 4 <font color="blue">1</font>
6 <font color="blue">0.7</font> 7 <font color="blue">0.8</font><br>
3 <font color="blue">1</font> 6 <font color="blue">-1.5</font> 7 <font color="blue">43</font><br>
<br>
3 <font color="blue">0.7</font> 4 <font color="blue">-1.5</font> 7 <font color="blue">7.8</font><br>
3 <font color="blue">0.8</font> 4 <font color="blue">43</font> 6 <font color="blue">7.8</font>
</td>
<td width = 3%></td>
<td align = center>
7 9 011<br>
<font color="red">3</font> 2 <font color="blue">20</font> 3 <font color="blue">-3</font> <br>
<font color="red">7</font> 1 <font color="blue">20</font> 3 <font color="blue">4.5</font><br>
<font color="red">2</font> 1 <font color="blue">-3</font> 2 <font color="blue">4.5</font> 4 <font color="blue">1</font> 6 <font color="blue">0.7</font> 7 <font color="blue">0.8</font><br>
<font color="red">0.6</font> 3 <font color="blue">1</font> 6 <font color="blue">-1.5</font> 7 <font color="blue">43</font><br>
<font color="red">1</font><br>
<font color="red">1</font> 3 <font color="blue">0.7</font> 4 <font color="blue">-1.5</font> 7 <font color="blue">7.8</font><br>
<font color="red">11.5</font> 3 <font color="blue">0.8</font> 4 <font color="blue">43</font> 6 <font color="blue">7.8</font>
</td>
</tr>
<tr>
<td align = justify>
The first line shows number of vertices (7) and number of edges (9).
Each of the next lines contains vertex neighbours.
E.g., the second line notes that vertex 1 is connected with vertices 2 and 3.
</td>
<td width = 3%></td>
<td align = justify>
The first line shows number of vertices (7), number of edges (9) and description <b>001</b> that denotes edge weighted graph.
Each of the next lines contains vertex neighbours and weights of corresponding edges (highlighted in blue) for graph vertices.
E.g., the second line notes that vertex 1 is connected with vertex 2 by edge of weigth <i>20</i> and
vertex 3 by the edge of weigth <i>-3</i>.
</td>
<td width = 3%></td>
<td align = justify>
The first line shows number of vertices (7), number of edges (9) and description <b>011</b> that denotes edge & vertex weighted graph.
Each of the next lines contains vertex weight (highlighed in red), its neighbours and weights of corresponding edges (highlighted in blue).
E.g., the second line notes that vertex 1 of weight 3 is connected with vertex 2 by edge of weigth <i>20</i> and
vertex 3 by the edge of weigth <i>-3</i>.
</td>
</tr>
</table><br>
Please note that numeration of vertices is 1-based.
Isolated vertices (vertices without neighbours) are denoted by empty lines for unweighted and edge weighted graphs.
<!-- -------- --->
<h2 id = "feedback">5. Feedback and bug reports</h2>
Your comments, bug reports, and suggestions are very welcome.
They will help us to further improve DSF.
<br><br>
If you have any trouble running DSF, please send us the log file from the output directory.
<br><br>
Address for communications: <a href="mailto:[email protected]">[email protected]</a>.
<h3 id = "citation">5.1. Citation</h3>
If you use DSF in your research, please refer to
<a href="http://bioinformatics.oxfordjournals.org/content/31/12/i53.long" target="_blank">Safonova et al., 2015</a>.
</body>