Exploratory Data Analysis using Python Pandas: A Tutorial

Dr. Soumen Atta, Ph.D.
7 min read · Mar 14, 2021

In this tutorial, we will learn about exploratory data analysis using Python Pandas. In exploratory data analysis, we analyze the input dataset to summarize its main characteristics. Sometimes, we examine the main features of the input dataset visually using different standard plots.
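As a rough illustration of what "summarizing the main characteristics" can look like in practice, the sketch below builds a tiny DataFrame and asks Pandas for its descriptive statistics. The column names and values are made up for this example and are not taken from the tutorial's dataset.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
toy = pd.DataFrame(
    {
        "Glucose": [120, 160, 95],
        "Age": [35, 50, 24],
    }
)

# describe() reports count, mean, std, min, quartiles, and max
# for every numeric column.
summary = toy.describe()
print(summary)
```

Here `describe()` gives a quick numeric overview of each column; visual summaries such as histograms would need a plotting library on top of Pandas.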

This is a beginner-friendly tutorial. We assume that readers are familiar with basic Python programming. If you are new to Python programming and have never used Pandas before, you can read the following beginner-friendly tutorial on Python Pandas.

In this tutorial, we will use a well-known dataset, the Pima Indians Diabetes dataset. This dataset can be downloaded from this link. It consists of several medical predictor (independent) variables and one target (dependent) variable. The input data is available as a CSV (comma-separated values) file. In this tutorial, we will use only the Python Pandas package.

Load the input data
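
A minimal sketch of this step, assuming the data is read with Pandas' `read_csv()` function. So that the example runs on its own, it reads a few made-up rows from an in-memory string instead of the downloaded file; the column names, the values, and the file name `diabetes.csv` mentioned in the comment are all assumptions, not details from the original dataset description.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded CSV file: a few invented rows
# with assumed column names, used only to keep this sketch self-contained.
sample_csv = io.StringIO(
    "Pregnancies,Glucose,BMI,Age,Outcome\n"
    "2,120,28.5,35,0\n"
    "5,160,33.1,50,1\n"
    "1,95,22.4,24,0\n"
)

# With the real file, pass its path instead, for example
# pd.read_csv("diabetes.csv") (the file name is an assumption).
df = pd.read_csv(sample_csv)

# Inspect the size of the data and its first rows.
print(df.shape)
print(df.head())
```

`read_csv()` returns a DataFrame, so `df.shape` and `df.head()` are a quick way to check that the file was parsed as expected before any further analysis.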


Written by Dr. Soumen Atta, Ph.D.

I am a Postdoctoral Researcher at the Faculty of IT, University of Jyväskylä, Finland. You can find more about me on my homepage: https://www.soumenatta.com/
