Clustering
Algorithms

jgrapht.algorithms.clustering.k_spanning_tree(graph, k)
The k spanning tree clustering algorithm.
The algorithm first finds a minimum spanning tree T using Prim’s algorithm, then runs Kruskal’s algorithm on the edges of T only, stopping as soon as k trees have been formed. The resulting trees are the final clusters. The total running time is \(\mathcal{O}(m + n \log n)\), where \(n\) is the number of vertices and \(m\) is the number of edges.
The algorithm is closely related to single linkage cluster analysis, also known as single-link clustering. For more information see: J. C. Gower and G. J. S. Ross. Minimum Spanning Trees and Single Linkage Cluster Analysis. Journal of the Royal Statistical Society. Series C (Applied Statistics), 18(1):54–64, 1969.
 Parameters
graph – the graph; must be undirected
k – the number of clusters, as an integer
 Returns
a clustering as an instance of Clustering
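The two phases described above can be illustrated with a small pure-Python sketch. This is not the library's implementation: the function name `k_spanning_tree_sketch`, the `(u, v, weight)` edge-list input, and the plain list-of-sets return value are all assumptions made for illustration, and the sketch assumes a connected graph with mutually comparable vertices.

```python
import heapq


def k_spanning_tree_sketch(vertices, edges, k):
    """Illustrative only. edges is a list of (u, v, weight) tuples
    describing a connected, undirected graph."""
    # Build an adjacency list of (weight, source, target) entries.
    adj = {v: [] for v in vertices}
    for u, v, w in edges:
        adj[u].append((w, u, v))
        adj[v].append((w, v, u))

    # Phase 1: Prim's algorithm, growing a minimum spanning tree T.
    start = vertices[0]
    visited = {start}
    heap = list(adj[start])
    heapq.heapify(heap)
    tree_edges = []
    while heap and len(visited) < len(vertices):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue
        visited.add(v)
        tree_edges.append((w, u, v))
        for entry in adj[v]:
            if entry[2] not in visited:
                heapq.heappush(heap, entry)

    # Phase 2: Kruskal's algorithm restricted to the edges of T,
    # stopping once only k components remain.
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = len(vertices)
    for w, u, v in sorted(tree_edges):
        if components <= k:
            break
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1

    # Group vertices by their component representative.
    clusters = {}
    for v in vertices:
        clusters.setdefault(find(v), set()).add(v)
    return list(clusters.values())


# Two tight triangles joined by one heavy edge.
example_edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 2.0),
    (2, 3, 10.0),
]
example_clusters = k_spanning_tree_sketch(list(range(6)), example_edges, k=2)
```

Because the tree edges are merged in order of increasing weight, stopping at k components has the same effect as deleting the k - 1 heaviest edges of T, which is why the heavy bridge between the two triangles ends up separating the clusters in this example.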

jgrapht.algorithms.clustering.label_propagation(graph, max_iterations=None, seed=None)
Label propagation clustering.
The algorithm is a near linear time algorithm capable of discovering communities in large graphs. It is described in detail in the following paper:
Raghavan, U. N., Albert, R., and Kumara, S. (2007). Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76(3), 036106.
As the paper title suggests, the running time is close to linear. The algorithm runs in iterations, each of which takes \(\mathcal{O}(n + m)\) time, where \(n\) is the number of vertices and \(m\) is the number of edges. The authors found experimentally that in most cases 95% or more of the nodes are classified correctly by the end of the fifth iteration. See the paper for more details.
The algorithm is randomized, meaning that two runs on the same graph may return different results. If the user requires deterministic behavior, a random generator seed can be provided as a parameter.
 Parameters
graph – the graph; must be undirected
max_iterations – maximum number of iterations (None means no limit)
seed – seed for the random number generator; if None, the system time is used
 Returns
a clustering as an instance of Clustering
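The iteration scheme and the role of the seed can be illustrated with a minimal pure-Python sketch of label propagation. This is not the library's implementation: the function name `label_propagation_sketch`, the `{vertex: set_of_neighbors}` input, and the rule of keeping the current label when it is already among the most frequent neighbor labels are assumptions made for illustration.

```python
import random
from collections import Counter


def label_propagation_sketch(adj, max_iterations=None, seed=None):
    """Illustrative only. adj maps each vertex of an undirected graph
    to the set of its neighbors."""
    rng = random.Random(seed)  # a fixed seed makes runs reproducible
    labels = {v: v for v in adj}  # every vertex starts in its own community
    order = list(adj)
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        iteration += 1
        rng.shuffle(order)  # visit vertices in a random order each round
        changed = False
        for v in order:
            if not adj[v]:
                continue  # isolated vertices keep their own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            if labels[v] in candidates:
                continue  # assumed rule: keep the current label on ties
            labels[v] = rng.choice(candidates)  # break other ties at random
            changed = True
        if not changed:
            break  # every vertex already holds a most frequent neighbor label

    # Group vertices by their final label.
    clusters = {}
    for v, lab in labels.items():
        clusters.setdefault(lab, set()).add(v)
    return list(clusters.values())


# Two disjoint triangles; labels can only spread along edges.
two_triangles = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1},
    3: {4, 5}, 4: {3, 5}, 5: {3, 4},
}
lp_clusters = label_propagation_sketch(two_triangles, seed=17)
```

Each pass touches every vertex and every edge a constant number of times, which matches the per-iteration cost quoted above. Passing the same seed twice yields the same visiting order and the same tie-breaking choices, and therefore the same clustering.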