What does the --largest mean? #39

hui-liu · 2020-10-16T08:15:46Z

Hi,

When I left --largest at the default (100), I got more than 2000 modules, each with at most 100 nodes. When I set --largest to the total number of genes, I got only ~30 modules, and the biggest one can have 20000 nodes. All other parameters were the same in both runs.

Based on the manual, I thought --largest meant that the output modules should be smaller than the value I set, and that any larger ones would be discarded. However, as I mentioned above, it does more than just limit the size of the output modules.

I am wondering why the default is 100, and how I should set the parameters in MONET if I don't want to limit the size of the output modules.

Thank you so much!

sergio-gomez · 2020-10-16T15:45:47Z

Hi,

The default value of 100 was a requirement of the Disease Module Identification (DMI) DREAM Challenge: modules with sizes below 3 or above 100 were discarded in the evaluation process, since they were assumed to contain no valuable information for identifying disease-related modules. Each algorithm uses a different strategy to find the modules and to fulfil these size restrictions. The behaviour you describe is perfectly normal in community detection when module sizes are very heterogeneous: without an upper bound on the size, you can get a few very large modules and many more small ones. The upper bound breaks the large modules into smaller ones (probably submodules of the primary modules).
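To make the difference concrete, here is a minimal sketch of what a pure post-hoc size filter would do, which is the behaviour the question assumed. It is not MONET code: `filter_by_size` and the example partition are hypothetical, and only the bounds 3 and 100 come from the challenge rules described above.

```python
def filter_by_size(modules, smallest=3, largest=100):
    """Keep only modules whose size lies in [smallest, largest]."""
    return [m for m in modules if smallest <= len(m) <= largest]


# Hypothetical unbounded partition: one huge module plus two small ones.
modules = [set(range(0, 20000)), set(range(20000, 20050)), {20050, 20051}]

kept = filter_by_size(modules)
print([len(m) for m in kept])  # [50]
```

With a filter like this, almost all nodes would end up in no module at all, which is why the methods split large modules instead of simply dropping them.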

I can explain a little more about my algorithm, M1 (the modularity-based one). It is based on adjusting the resolution at which you want to find the communities. The resolution parameter lets you tune the resistance of nodes to form communities. With a high value, you get many small communities; with a low value, the modules are few and large. To generate modules within the challenge bounds, we search for a reasonable value of the resistance parameter, such that most of the nodes are contained in modules with sizes inside the desired range. To avoid excessive fragmentation, we first allow modules larger than desired (e.g., about 5*100=500 nodes), and then we refine those modules by fragmenting them further into smaller modules.
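The two-stage idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual M1 implementation: `two_stage`, `refine`, `coverage`, the `slack` factor and the candidate list of resistances are hypothetical names, and `toy_detect` is a stand-in that just cuts a sorted node list into chunks so the example runs without a real community detection routine.

```python
def coverage(modules, smallest, largest):
    """Fraction of nodes lying in modules with size in [smallest, largest]."""
    total = sum(len(m) for m in modules)
    inside = sum(len(m) for m in modules if smallest <= len(m) <= largest)
    return inside / total if total else 0.0


def refine(module, detect, resistances, smallest, largest):
    """Re-partition one oversized module, now aiming at the strict bound."""
    return max(
        (detect(module, r) for r in resistances),
        key=lambda mods: coverage(mods, smallest, largest),
    )


def two_stage(nodes, detect, resistances, smallest=3, largest=100, slack=5):
    # Stage 1: pick the partition with the best coverage under a relaxed
    # upper bound (slack * largest). max() keeps the first of tied
    # candidates, so with ascending resistances the least fragmented wins.
    best = max(
        (detect(nodes, r) for r in resistances),
        key=lambda mods: coverage(mods, smallest, slack * largest),
    )
    # Stage 2: fragment further only the modules that exceed `largest`.
    result = []
    for module in best:
        if len(module) > largest:
            result.extend(refine(module, detect, resistances, smallest, largest))
        else:
            result.append(module)
    return result


def toy_detect(nodes, resistance):
    """Stand-in detector (assumption): higher resistance, smaller chunks."""
    ordered = sorted(nodes)
    size = max(1, round(1000 / resistance))
    return [set(ordered[i:i + size]) for i in range(0, len(ordered), size)]


result = two_stage(range(2000), toy_detect, resistances=[1, 2, 5, 10, 50])
print(len(result), {len(m) for m in result})
```

In this toy run, stage 1 settles on modules of 500 nodes and stage 2 splits each of them into modules of 100. A real detector would give far less regular sizes, but the control flow is the same.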

I hope this helps you understand the behaviour of MONET.

All the best.

mattiat added the user_FAQ label Nov 18, 2020
