Simple and multiple linear regression analysis for rainwater quality checking
In this tutorial, we will provide a step-by-step guide on how to perform Simple Linear Regression (SLR) and Multiple Linear Regression (MLR) for rainwater quality analysis using Python.
Introduction
Rainwater is an important natural resource, and its quality can have significant impacts on human health and the environment. To analyze rainwater quality, it is often useful to fit statistical models that describe the relationship between measured variables. Simple linear regression (SLR), which uses a single predictor, and multiple linear regression (MLR), which uses several predictors, are two commonly used techniques for this purpose.
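To make the distinction between the two techniques concrete, here is a minimal sketch of both model forms. It is only an illustration under stated assumptions: the five readings are made up and noise-free, TDS is assumed to be the response variable, and NumPy's `polyfit` and `linalg.lstsq` are used for convenience; none of these choices comes from the rest of this tutorial.

```python
import numpy as np

# Hypothetical, noise-free readings chosen only to illustrate the two model forms.
conductivity = np.array([100.0, 200.0, 300.0, 400.0, 500.0])
temperature = np.array([20.0, 25.0, 22.0, 28.0, 24.0])

# Assumed relationship used to build the response (not a real-world law).
tds = 10.0 + 0.2 * conductivity + 1.5 * temperature

# SLR: one predictor, tds = b0 + b1 * conductivity + error
slr_slope, slr_intercept = np.polyfit(conductivity, tds, deg=1)

# MLR: several predictors, tds = b0 + b1 * conductivity + b2 * temperature + error
X = np.column_stack([np.ones_like(conductivity), conductivity, temperature])
mlr_coefs, *_ = np.linalg.lstsq(X, tds, rcond=None)

print("SLR intercept and slope:", slr_intercept, slr_slope)
print("MLR coefficients (b0, b1, b2):", mlr_coefs)
```

Because the response here was built from both predictors, the MLR fit recovers the assumed coefficients, while the SLR fit has to absorb the temperature effect into its intercept, slope, and residuals.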
Dataset
Here, we will use an artificial dataset created for this tutorial. Its values are randomly generated, so they do not reflect real measurements. The Python code to generate the dataset is given below:
import pandas as pd
import random

# create an example dataset with 250 entries
data = {
    'pH': [random.uniform(6, 8) for _ in range(250)],
    'Conductivity': [random.randint(100, 1000) for _ in range(250)],
    'Temperature': [random.randint(20, 30) for _ in range(250)],
    'TDS': [random.randint(100, 200) for _ in range(250)]
}

# create a pandas DataFrame from the dictionary
df = pd.DataFrame(data)
The program creates an example dataset with 250 entries using the Python random module and the pandas library. The dataset has four columns: pH, Conductivity, Temperature, and TDS. Each column holds 250 random values, generated as follows:
- The pH column is generated with the uniform function from the random module, which returns random floating-point numbers between 6 and 8.
- The Conductivity column is generated with the randint function from the random module, which returns random integers between 100 and 1000 (inclusive).
- The Temperature column is generated with the randint function from the random module, which returns random integers between 20 and 30 (inclusive).