Phys Rev. E 87, 012812 (2013)

DOI: 10.1103/PhysRevE.87.012812

bibtex

Results of a recent comparative experimental assessment of methods for network community detection applied to benchmark graphs indicate that the two best methods use different objective functions but a similar local search-based optimization (LSO) procedure. This observation motivates the following research question: Given the LSO optimization procedure, how much does the choice of the objective function influence the results and in what way? We address this question empirically in a broad graph clustering context, that is, when graphs are either given as such or are k-nearest-neighbor graphs generated from a given data set. We consider normalized cut, modularity, and infomap, as well as two new objective functions. We show that all these objectives have a resolution bias, that is, they tend to prefer either small or large clusters. When removing this bias, by forcing the objective to generate a given number of clusters, LSO achieves similar performance across the considered objective functions on benchmark networks with built-in community structure. These results indicate that the resolution bias is the most important difference between objective functions in graph clustering with LSO. Spectral clustering is an alternative to LSO, which has been used to optimize the popular normalized cut and modularity objectives. We show experimentally that LSO often achieves superior performance than spectral clustering on various benchmark, real-life, and *k*-nearest-neighbor graphs. These results, the flexibility of LSO and its efficiency, provide arguments in favor of this optimization method.

- The implementation of the LSO algorithm that was used in the experiments is available from github. Documentation can also be found on that page.
- Code to calculate theoretical resolution limits (Matlab / Octave, zip).

Contact the authors at tvanlaarhoven@cs.ru.nl for any questions or comments.