Nonparametric Statistical Tests using Python: An Introductory Tutorial
--
This is an introductory tutorial on nonparametric statistical tests using Python. Nonparametric tests are methods of statistical analysis that do not assume the data follow a particular distribution, such as the normal distribution. For this reason, they are sometimes called distribution-free tests. In this tutorial we do not discuss the theoretical details of these tests. Rather, we discuss when and how to use them in Python. The tutorial assumes that readers have a working knowledge of the Python programming language. We will use SciPy (pronounced “Sigh Pie”), an open-source Python package for mathematics, science, and engineering applications.
We discuss four nonparametric statistical tests:
- Mann-Whitney U Test,
- Wilcoxon Signed-Rank Test,
- Kruskal-Wallis H Test and
- Friedman Test.
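As a quick orientation, each of the four tests has a corresponding function in the scipy.stats module. The short sketch below only confirms that these functions are available in the installed SciPy version; the dictionary name `tests` is chosen here for illustration and is not part of SciPy.

```python
from scipy import stats

# Tests covered in this tutorial, mapped to the name of the
# scipy.stats function that implements each one.
tests = {
    "Mann-Whitney U Test": "mannwhitneyu",
    "Wilcoxon Signed-Rank Test": "wilcoxon",
    "Kruskal-Wallis H Test": "kruskal",
    "Friedman Test": "friedmanchisquare",
}

for test_name, func_name in tests.items():
    # Look the function up by name and report whether it is callable.
    func = getattr(stats, func_name)
    print(f"{test_name}: scipy.stats.{func_name}() available = {callable(func)}")
```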
Mann-Whitney U Test
This test, also known as the Mann-Whitney U rank test, checks whether two independent samples come from the same distribution.
We can apply the Mann-Whitney U Test when the sample data satisfy the following assumptions:
- observations in each sample are independent and identically distributed (iid),
- observations in each sample can be ranked.
The test evaluates the null hypothesis (H0) against the alternative hypothesis (H1):
- H0: the distributions of both samples are equal,
- H1: the distributions of both samples are not equal.
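As a minimal sketch of how these two hypotheses can be tested, the example below calls mannwhitneyu() from scipy.stats on two small samples. The sample values, the variable names, and the significance level of 0.05 are assumptions made up for illustration; they do not come from any real dataset.

```python
from scipy.stats import mannwhitneyu

# Hypothetical measurements from two independent groups (illustrative values only).
sample_a = [12.1, 14.3, 11.8, 15.0, 13.2, 12.7, 14.9, 13.5]
sample_b = [15.4, 16.8, 14.7, 17.2, 16.1, 15.9, 18.0, 16.5]

# Two-sided test: H0 states that both samples come from the same distribution.
statistic, p_value = mannwhitneyu(sample_a, sample_b, alternative="two-sided")
print(f"U statistic = {statistic:.1f}, p-value = {p_value:.4f}")

alpha = 0.05  # assumed significance level
if p_value < alpha:
    print("Reject H0: the distributions of the two samples differ.")
else:
    print("Fail to reject H0: no evidence that the distributions differ.")
```

A p-value below the chosen significance level is evidence against H0; a larger p-value means the data do not provide enough evidence to reject it.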
In Python, we run the Mann-Whitney U test with SciPy's mannwhitneyu() function. The function takes the two data…