class MarkovClusterer extends Clusterer with Error
The MarkovClusterer class implements the Markov Clustering Algorithm (MCL) and is used to cluster the nodes of a graph. The graph is represented as an edge-weighted adjacency matrix, where a non-zero cell (i, j) indicates that nodes i and j are connected.
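To make the matrix representation concrete, here is a minimal sketch that builds an edge-weighted adjacency matrix from a list of edges. It uses plain nested arrays rather than the library's MatrixD type, and the names AdjacencySketch and fromEdges are hypothetical, introduced only for illustration. It also assumes an undirected graph, which the text above does not state.

```scala
object AdjacencySketch {
  /** Build a symmetric n-by-n adjacency matrix from (i, j, weight) triples.
    * Assumption: the graph is undirected, so each edge is stored in both directions.
    */
  def fromEdges(n: Int, edges: Seq[(Int, Int, Double)]): Array[Array[Double]] = {
    val a = Array.ofDim[Double](n, n)   // every cell starts at 0.0, meaning "no edge"
    for ((i, j, w) <- edges) {
      a(i)(j) = w
      a(j)(i) = w
    }
    a
  }
}
```

A non-zero cell a(i)(j) then marks nodes i and j as connected, with the cell value as the edge weight.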
The primary constructor takes either a graph (adjacency matrix) or a
Markov transition matrix as input. If a graph is passed in, the normalize
method must be called to convert it into a Markov transition matrix.
Before normalizing, it may be helpful to add self loops to the graph.
The matrix (graph or transition) may be either dense or sparse.
See the MarkovClustererTest object at the bottom of the file for examples.
Instance Constructors
-
new
MarkovClusterer(t: MatrixD, k: Int = 2, r: Double = 2.0)
- t
either an adjacency matrix of a graph or a Markov transition matrix
- k
the strength of expansion (the matrix is raised to the k-th power)
- r
the strength of inflation (each cell is raised to the r-th power)
Value Members
-
def
addSelfLoops(weight: Double = 1.0): Unit
Add self-loops by setting the main diagonal to the weight parameter.
- weight
the edge weight on self-loops to be added.
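As a rough illustration of what setting the main diagonal means, the sketch below works on a plain nested array. The names SelfLoopSketch and the array-based signature are assumptions made for this example. Unlike the documented method, which returns Unit and presumably updates the internal matrix, this version returns a modified copy.

```scala
object SelfLoopSketch {
  /** Return a copy of the square matrix `a` with every diagonal cell set to `weight`. */
  def addSelfLoops(a: Array[Array[Double]], weight: Double = 1.0): Array[Array[Double]] = {
    val b = a.map(_.clone)               // copy the rows so the input is left untouched
    for (i <- b.indices) b(i)(i) = weight
    b
  }
}
```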
-
def
centroids(): MatrixD
Return the centroids. Should only be called after cluster().
- Definition Classes
- MarkovClusterer → Clusterer
-
def
classify(y: VectorD): Int
This clustering method is not applicable to graph clustering.
- y
unused parameter
- Definition Classes
- MarkovClusterer → Clusterer
-
def
cluster(): Array[Int]
Cluster the nodes in the graph by interpreting the processed matrix t. Nodes that are not clustered are placed in group 0; all other nodes are grouped with their strongest positive attractor.
- Definition Classes
- MarkovClusterer → Clusterer
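One possible reading of "strongest positive attractor" is sketched below, assuming a column-stochastic matrix in which cell (i, j) holds the flow from node j to node i. Each node j is labeled with the row index of the largest positive cell in its column, and falls back to group 0 if the column has no positive cell. The names AttractorSketch and labels are hypothetical, and the library's actual group numbering may differ.

```scala
object AttractorSketch {
  /** Label each node (column) with the row index of its largest positive cell.
    * Assumption: a column with no positive cell keeps the fallback label 0.
    */
  def labels(t: Array[Array[Double]]): Array[Int] = {
    val n = t.length
    Array.tabulate(n) { j =>
      var best = 0        // fallback group for unclustered nodes
      var bestVal = 0.0   // only strictly positive cells can win
      for (i <- 0 until n) {
        if (t(i)(j) > bestVal) { best = i; bestVal = t(i)(j) }
      }
      best
    }
  }
}
```

Note that in this sketch the fallback label 0 coincides with the label of any cluster whose attractor is node 0, so a real implementation would need a numbering scheme that keeps the two apart.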
-
def
csize(): VectorI
Return the sizes of the clusters (the number of points assigned to each centroid). Should only be called after cluster().
- Definition Classes
- MarkovClusterer → Clusterer
-
def
distance(u: VectorD, v: VectorD): Double
Compute a distance metric (e.g., distance squared) between vectors/points u and v. Override this method to use a different metric, e.g., norm (the Euclidean distance, 2-norm) or norm1 (the Manhattan distance, 1-norm).
- u
the first vector/point
- v
the second vector/point
- Definition Classes
- Clusterer
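The three metrics named above can be sketched on plain arrays as follows. The object name DistanceSketch and the function names are illustrative only and are not the library's VectorD API.

```scala
object DistanceSketch {
  /** Squared Euclidean distance: sum of squared coordinate differences. */
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.zip(v).map { case (a, b) => (a - b) * (a - b) }.sum

  /** Euclidean distance (2-norm of u - v). */
  def norm2(u: Array[Double], v: Array[Double]): Double =
    math.sqrt(distSq(u, v))

  /** Manhattan distance (1-norm of u - v). */
  def norm1(u: Array[Double], v: Array[Double]): Double =
    u.zip(v).map { case (a, b) => math.abs(a - b) }.sum
}
```

Squared distance avoids the square root and preserves the ordering of Euclidean distances, which is usually all a clustering comparison needs.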
-
def
expand(): Unit
Expansion tends to grow clusters (flow along paths in the graph). Expand by raising the matrix t to the k-th power.
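A minimal sketch of expansion on a plain nested array, assuming a square matrix and k >= 1, might look like this. The names ExpandSketch, multiply, and expand are chosen for illustration, and the power is computed by naive repeated multiplication, which is not necessarily how the library does it.

```scala
object ExpandSketch {
  type Matrix = Array[Array[Double]]

  /** Standard matrix product of a (n-by-p) and b (p-by-m). */
  def multiply(a: Matrix, b: Matrix): Matrix = {
    val n = a.length
    val p = b.length
    val m = b(0).length
    Array.tabulate(n, m)((i, j) => (0 until p).map(l => a(i)(l) * b(l)(j)).sum)
  }

  /** Raise the square matrix t to the k-th power (k >= 1) by repeated multiplication. */
  def expand(t: Matrix, k: Int = 2): Matrix = {
    require(k >= 1, "k must be at least 1")
    (1 until k).foldLeft(t)((acc, _) => multiply(acc, t))
  }
}
```

If every column of t sums to 1, every column of the k-th power does too, so expansion keeps the matrix a valid transition matrix.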
-
final
def
flaw(method: String, message: String): Unit
- Definition Classes
- Error
-
def
getName(i: Int): String
Get the name of the i-th cluster.
- Definition Classes
- Clusterer
-
def
inflate(): Boolean
Inflation tends to strengthen strong connections and weaken weak ones. Inflate by raising each cell to the r-th power and then normalizing column by column. If a cell is close to zero, set it to zero (prune). Also detect convergence by checking that the variance in each column is small enough.
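The inflation step can be sketched roughly as below. Several details are assumptions, since the description does not pin them down: the pruning threshold eps, the use of the same eps as the variance threshold, and the choice to measure variance over the non-zero cells of each column only. The names InflateSketch and inflate are hypothetical, and the sketch returns the new matrix together with the convergence flag instead of updating internal state.

```scala
object InflateSketch {
  type Matrix = Array[Array[Double]]

  /** Inflate the columns of t and report whether every column looks converged.
    * Assumptions: `eps` is both the pruning and the variance threshold, and
    * variance is taken over the non-zero cells of each column.
    */
  def inflate(t: Matrix, r: Double = 2.0, eps: Double = 1e-5): (Matrix, Boolean) = {
    val n = t.length
    val m = t(0).length
    val out = Array.ofDim[Double](n, m)
    var converged = true
    for (j <- 0 until m) {
      val powered = Array.tabulate(n)(i => math.pow(t(i)(j), r))   // cell-wise r-th power
      val sum = powered.sum
      val col = powered
        .map(v => if (sum > 0.0) v / sum else 0.0)                 // renormalize the column
        .map(v => if (v < eps) 0.0 else v)                         // prune near-zero cells
      val nz = col.filter(_ > 0.0)
      if (nz.nonEmpty) {
        val mean = nz.sum / nz.length
        val variance = nz.map(v => (v - mean) * (v - mean)).sum / nz.length
        if (variance > eps) converged = false                      // column still uneven
      }
      for (i <- 0 until n) out(i)(j) = col(i)
    }
    (out, converged)
  }
}
```

With r = 2, a column (0.75, 0.25) becomes roughly (0.9, 0.1): the strong connection gains weight at the expense of the weak one.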
-
def
name_(n: Array[String]): Unit
Set the names for the clusters.
-
def
normalize(): Unit
Normalize the matrix t so that each column sums to 1, i.e., convert the adjacency matrix of a graph into a Markov transition matrix.
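Column normalization can be sketched on a plain nested array as follows. The names NormalizeSketch and normalizeColumns are illustrative, and leaving all-zero columns unchanged is an assumption made here to avoid division by zero. The sketch returns a new matrix, whereas the documented method returns Unit.

```scala
object NormalizeSketch {
  /** Divide each cell by its column sum so that every non-zero column sums to 1.
    * Assumption: all-zero columns are left unchanged.
    */
  def normalizeColumns(a: Array[Array[Double]]): Array[Array[Double]] = {
    val n = a.length
    val m = if (n == 0) 0 else a(0).length
    val colSums = Array.tabulate(m)(j => (0 until n).map(i => a(i)(j)).sum)
    Array.tabulate(n, m) { (i, j) =>
      if (colSums(j) == 0.0) a(i)(j) else a(i)(j) / colSums(j)
    }
  }
}
```

After this step, cell (i, j) can be read as the probability of moving from node j to node i in one step of a random walk.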
-
def
processMatrix(): MatrixD
Return the processed matrix t. The matrix is processed by repeated steps of expansion and inflation until convergence is detected.
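The alternation of expansion and inflation can be sketched as a simple loop. This is a simplified stand-in, not the library's implementation: it uses plain nested arrays, omits pruning, and stops when the matrix no longer changes between iterations (or after maxIter rounds) rather than using the column-variance test described for inflate. All names here (MclLoopSketch, process, eps, maxIter) are hypothetical.

```scala
object MclLoopSketch {
  type Matrix = Array[Array[Double]]

  /** Scale every column to sum to 1 (all-zero columns are left as they are). */
  def normalize(a: Matrix): Matrix = {
    val n = a.length
    val sums = Array.tabulate(n)(j => (0 until n).map(i => a(i)(j)).sum)
    Array.tabulate(n, n)((i, j) => if (sums(j) == 0.0) a(i)(j) else a(i)(j) / sums(j))
  }

  /** Expansion: raise the matrix to the k-th power by repeated multiplication. */
  def expand(t: Matrix, k: Int): Matrix = {
    val n = t.length
    def mul(a: Matrix, b: Matrix): Matrix =
      Array.tabulate(n, n)((i, j) => (0 until n).map(l => a(i)(l) * b(l)(j)).sum)
    (1 until k).foldLeft(t)((acc, _) => mul(acc, t))
  }

  /** Inflation: raise each cell to the r-th power, then renormalize the columns. */
  def inflate(t: Matrix, r: Double): Matrix =
    normalize(t.map(_.map(v => math.pow(v, r))))

  /** Largest absolute cell-wise difference between two matrices of equal size. */
  def maxDiff(a: Matrix, b: Matrix): Double = {
    val n = a.length
    (for (i <- 0 until n; j <- 0 until n) yield math.abs(a(i)(j) - b(i)(j))).max
  }

  /** Repeat expansion and inflation until the matrix stops changing. */
  def process(t0: Matrix, k: Int = 2, r: Double = 2.0,
              eps: Double = 1e-9, maxIter: Int = 100): Matrix = {
    require(t0.nonEmpty, "matrix must not be empty")
    var t = t0
    var done = false
    var iter = 0
    while (!done && iter < maxIter) {
      val next = inflate(expand(t, k), r)
      done = maxDiff(next, t) < eps   // stand-in convergence test
      t = next
      iter += 1
    }
    t
  }
}
```

For a graph made of two disconnected pairs of nodes with self-loops, the normalized matrix is already a fixed point of this loop, and the zero cells between the two pairs remain exactly zero, which is what lets the clusters be read off the result.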
-
def
sse(x: MatrixD): Double
Compute the sum of squared errors within the clusters, where the error is measured by, e.g., the distance from a point to its centroid.
- Definition Classes
- Clusterer
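As an illustration of the quantity being computed, the sketch below sums the squared distance from each point to the centroid of its assigned cluster. Unlike the documented method, which takes only x and presumably relies on the clusterer's internal state, this version takes the cluster labels and centroids as explicit arguments. The names SseSketch, labels, and centroids are assumptions for the example.

```scala
object SseSketch {
  /** Squared Euclidean distance between two points. */
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.zip(v).map { case (a, b) => (a - b) * (a - b) }.sum

  /** Sum over all points (rows of x) of the squared distance to the assigned centroid. */
  def sse(x: Array[Array[Double]], labels: Array[Int],
          centroids: Array[Array[Double]]): Double =
    x.indices.map(i => distSq(x(i), centroids(labels(i)))).sum
}
```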