Member-only story

How to Encode Categorical Variables with Target Encoding in Python for Machine Learning

3 min readMay 2, 2023

Categorical variables are commonly found in datasets and are used to represent data in categories. These variables need to be encoded into numerical values for machine learning algorithms to process them. There are various encoding techniques that can be used to encode categorical variables, and one of the popular techniques is Target Encoding. Target Encoding involves encoding categorical variables based on the mean value of the target variable for each category. In this tutorial, we will learn how to encode categorical variables with Target Encoding in Python for machine learning.

Prerequisites

Before we start, make sure you have the following Python packages installed:

pandas
scikit-learn
category_encoders

You can install these packages using pip install command.

Dataset

For this tutorial, we will use the Titanic dataset which is available on Kaggle. This dataset contains information about passengers on the Titanic and whether they survived or not. We will use the “Survived” column as the target variable, and the “Pclass” and “Sex” columns as the categorical variables to encode. Here’s the link to the…

How to Encode Categorical Variables with Target Encoding in Python for Machine Learning

Prerequisites

Dataset

Written by Dr. Soumen Atta, Ph.D.

No responses yet