K-Median Clustering Algorithm in Machine Learning and its Python Implementation

Dr. Soumen Atta, Ph.D.
3 min readFeb 19, 2024
K-Median Clustering Algorithm in Machine Learning and its Python Implementation

The k-median algorithm is a clustering algorithm that is used to partition a dataset into k clusters where each cluster is represented by the median of its data points. Unlike the k-means algorithm, which uses the mean as the centroid, the k-median algorithm uses the median, making it more robust to outliers.

1. Introduction to k-median algorithm

1.1 What is k-median?

The k-median algorithm is a partitional clustering algorithm that aims to partition a dataset into k clusters in a way that minimizes the total distance between data points and their respective cluster medians.

1.2 Use Cases

K-median is particularly useful when dealing with datasets where the mean may be sensitive to outliers, and you want a more robust measure of central tendency for each cluster.

1.3 Key Concepts

  • Median: The median is the middle value of a dataset when it is ordered. It is less sensitive to extreme values compared to the mean.
  • Objective Function: The algorithm minimizes an objective function, which is the sum of distances between data points and their respective cluster medians.

--

--

Dr. Soumen Atta, Ph.D.
Dr. Soumen Atta, Ph.D.

Written by Dr. Soumen Atta, Ph.D.

I am a Postdoctoral Researcher at the Faculty of IT, University of Jyväskylä, Finland. You can find more about me on my homepage: https://www.soumenatta.com/

No responses yet