Learnt a new thing today!!!! Fuzzy c-means Clustering

Thiruthuvaraj Rajasekhar
2 min readMar 23, 2022

--

Whenever the topic of Clustering arises, K-means is the first algorithm that text books teach us. To give a new brief intro to K-means, it is a clustering technique, which takes 2 inputs,

  1. K — no of clusters
  2. Distance measure like Euclidean, Manhattan distance

Algorithm is as follow,

  1. Initialize k number of instances as cluster centroids to startwith.
  2. For remaning n-k instances, compute the distance from the k centroids and assign the instance to the cluster with least distance.
  3. Update the cluster centroids by taking the mean of instances belonging to a particular cluster.
  4. Repeat 2, 3 until there is no change in cluster centroids.

Algorithm is quite simple and very straight forward. Then why do we need an updated Algorithm?

K-Means is a Hard clustering algorithm, which means one instance can belong to only one cluster.

Fuzzy c-means is similar to K-means clustering algorithm with a difference that every instance may belong to one or more clusters.

Algorithm works as follows:

  1. Initialize utility matrix U which has `c` number of columns where c is the number of clusters.
  2. At every step, compute the cluster centroids with utility matrix U

3. Update utility matrix U as

where m is any real number greater than 1, uij is the degree of membership of xi in the cluster j, xi is the ith of d-dimensional measured data, cj is the d-dimension center of the cluster.

4. repeat 2,3 till we run p number of iterations or any other stopping criteria.

--

--

Thiruthuvaraj Rajasekhar

Mining Data For Insights | Passionate to write | Data Scientist