K-Median Clustering Algorithm in Machine Learning and its Python Implementation
The k-median algorithm is a clustering algorithm that is used to partition a dataset into k clusters where each cluster is represented by the median of its data points. Unlike the k-means algorithm, which uses the mean as the centroid, the k-median algorithm uses the median, making it more robust to outliers.
1. Introduction to k-median algorithm
1.1 What is k-median?
The k-median algorithm is a partitional clustering algorithm that aims to partition a dataset into k clusters in a way that minimizes the total distance between data points and their respective cluster medians.
1.2 Use Cases
K-median is particularly useful when dealing with datasets where the mean may be sensitive to outliers, and you want a more robust measure of central tendency for each cluster.
1.3 Key Concepts
- Median: The median is the middle value of a dataset when it is ordered. It is less sensitive to extreme values compared to the mean.
- Objective Function: The algorithm minimizes an objective function, which is the sum of distances between data points and their respective cluster medians.