Hierarchical Clustering in Python: A Step-by-Step Tutorial

Dr. Soumen Atta, Ph.D.
9 min read · Apr 3, 2023

Hierarchical clustering is a widely used technique that organizes data points into a nested hierarchy of clusters according to how similar or dissimilar they are. It is particularly useful in exploratory data analysis, where the goal is to identify underlying patterns or structures within the data.

Hierarchical clustering is divided into two categories, agglomerative and divisive.

  • In agglomerative (bottom-up) clustering, each data point is initially treated as a separate cluster, and the algorithm then repeatedly merges the closest pair of clusters until all data points belong to a single cluster.
  • In divisive (top-down) clustering, the opposite approach is used: the algorithm starts with a single cluster containing all data points and recursively splits it into smaller clusters.
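The agglomerative loop described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a production implementation: it assumes Euclidean distance and single linkage (the distance between two clusters is the distance between their closest members), and the names single_linkage, agglomerate, and toy_points are made up for this example.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def single_linkage(a, b, points):
    """Distance between clusters a and b: the closest pair of members."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points):
    """Merge the two closest clusters until one remains; return the merges."""
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    history = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        d, i, j = min(
            (single_linkage(clusters[i], clusters[j], points), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return history


# Hypothetical toy data: two tight pairs of points that are far apart.
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
merges = agglomerate(toy_points)
for left, right, d in merges:
    print(left, "+", right, "at distance", round(d, 2))
```

With n points the loop always performs n - 1 merges. Recomputing every pairwise cluster distance at each step is slow, so this brute-force version is only suitable for small inputs; library implementations use far more efficient update rules.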

In this tutorial, we will focus on agglomerative hierarchical clustering, which is the most common type of hierarchical clustering used in practice. We will start by explaining the basic concepts of hierarchical clustering, including linkage criteria, distance measures, and dendrograms. We will then proceed to the step-by-step implementation of hierarchical clustering in Python, using the popular scikit-learn library. We will also discuss how to visualize the results of hierarchical clustering using dendrograms and heatmaps, and how to evaluate the quality of the resulting clusters using metrics such as the silhouette score and inertia (the within-cluster sum of squared distances).
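To make the terms linkage criterion, distance measure, dendrogram, and silhouette score concrete, here is a small sketch. It is an illustration under stated assumptions, not the tutorial's own code: it uses SciPy's scipy.cluster.hierarchy module for the merge tree (a choice made here, alongside scikit-learn for the metric), and the data is a synthetic pair of well-separated blobs invented for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.metrics import silhouette_score

# Synthetic data (an assumption for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(20, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(20, 2)),
])

scores = {}
for method in ("single", "complete", "average", "ward"):
    # Linkage criterion + distance measure -> merge tree of shape (n - 1, 4).
    Z = linkage(X, method=method, metric="euclidean")
    # Cut the tree into at most two flat clusters.
    labels = fcluster(Z, t=2, criterion="maxclust")
    # Silhouette score lies in [-1, 1]; higher means better-separated clusters.
    scores[method] = silhouette_score(X, labels)

# Dendrogram layout for the last tree (Ward). no_plot=True only computes the
# layout, so matplotlib is not required; drop it to draw the figure.
tree = dendrogram(Z, no_plot=True)

for method, score in scores.items():
    print(f"{method:>8}: silhouette = {score:.3f}")
```

On data this cleanly separated, all four linkage criteria should recover the same two groups. The criteria tend to disagree on noisier or elongated clusters, which is where comparing silhouette scores becomes informative.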

Overall, this tutorial will provide you with a solid foundation in hierarchical clustering and how it can be used for clustering analysis in data science. By the end of the tutorial, you will be able to perform hierarchical clustering on your own datasets and interpret the results.

We will use the Iris dataset as our running example. It contains information on the sepal length, sepal width, petal length, and petal width of three different types…
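As a rough preview of what such an implementation might look like, the sketch below runs scikit-learn's AgglomerativeClustering on the Iris data. The parameter choices (three clusters, Ward linkage) and the variable names are assumptions made for this illustration and are not taken from the text above.

```python
from collections import Counter

from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score

# Load the four measurements for each of the 150 flowers.
iris = load_iris()
X_iris = iris.data  # shape (150, 4)

# Assumed settings: 3 clusters and Ward linkage (which uses Euclidean distance).
model = AgglomerativeClustering(n_clusters=3, linkage="ward")
iris_labels = model.fit_predict(X_iris)

# Summarize the result: cluster sizes and overall silhouette score.
cluster_sizes = Counter(iris_labels.tolist())
iris_silhouette = silhouette_score(X_iris, iris_labels)

print("Cluster sizes:", dict(cluster_sizes))
print("Silhouette score:", round(iris_silhouette, 3))
```

The cluster labels are arbitrary integers, so they will not necessarily line up with the ordering of the targets in iris.target. Comparing the two requires a permutation-invariant measure such as the adjusted Rand index.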


Written by Dr. Soumen Atta, Ph.D.

I am a Postdoctoral Researcher at the Faculty of IT, University of Jyväskylä, Finland. You can find more about me on my homepage: https://www.soumenatta.com/
