Guide: Wilcoxon Signed-Rank Test

The Wilcoxon signed-rank test is a non-parametric test we can run in Julius. It evaluates whether the median of the differences between paired observations (for example, before and after) differs from zero. Because both measurements in a pair come from the same subject, they are related rather than independent. This makes the test well suited to before-and-after studies, repeated measurements on the same individuals, and matched pairs of data.

The Wilcoxon signed-rank test does not assume that the data are normally distributed, an assumption parametric tests often require, so it is particularly useful for smaller datasets where normality is hard to verify. It is also less sensitive to outliers, because it ranks the absolute differences instead of using their raw magnitudes. Here are the assumptions our dataset needs to meet:

  1. Paired Data: The test requires that the data consists of pairs of observations (before and after).
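  2. Continuous or Ordinal Data: The data should be continuous (able to take any value within a range) or at least ordinal, so that the differences between pairs can be meaningfully ranked.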
  3. Symmetry of the Distribution Differences: The differences between the paired measurements should be symmetrically distributed around the median.
  4. Independence Between Pairs: Each pair of observations should be independent of every other pair, so one test subject's difference should not influence another's.

In addition to these, your data should also follow the general non-parametric assumptions. Please check my other guide – Guide: Parametric or Non-parametric – for more information on that! Now that we know a little more about this test, let’s see an example of how to run it in Julius!

Prompt:
You’re a coach for the local 18U superhuman cross-country team. You want to assess if there is an improvement in the athletes’ sprint times before and after they enroll in this new training program. You record the time (in seconds) it takes for each athlete to run a 50m dash. The results are displayed below:

id before after
1 0.4693 1.4196
2 3.0101 0.2254
3 1.3167 0.5183
4 0.9129 0.6844
5 0.1696 0.9134
6 0.1696 2.3069
7 0.0598 0.3341
8 2.0112 1.083
9 0.9191 1.3463
10 1.2313 0.0713
11 0.0208 1.403
12 3.5036 0.2804
13 1.7864 0.1009
14 0.2387 4.4605
15 0.2007 5.0559
16 0.2026 2.4785
17 0.3628 0.5449
18 0.7439 0.1542
19 0.5655 1.7291
20 0.3442 0.8701

This dataset is considered a “paired” dataset because we are looking at the same individual over a period of time (paired results). It is also continuous data, meaning that it can take on any value within a range. So, let’s get to it!

Steps to Running a Wilcoxon Signed-Rank Test

Step 1: Run Descriptive Statistics
As always, this should be the first step in any data analysis process. So let’s run some descriptive stats on Julius!

Great! Now we know more about our dataset and its characteristics. We can use this information to chat about the results if needed.
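If you want to double-check the descriptive statistics outside Julius, here is a minimal Python sketch using only the standard library. The variable names and the `describe` helper are my own illustration, not something Julius produces; with pandas installed, `DataFrame.describe()` gives a similar summary.

```python
import statistics

# Sprint times in seconds, copied from the table above (athlete ids 1-20).
before = [0.4693, 3.0101, 1.3167, 0.9129, 0.1696, 0.1696, 0.0598, 2.0112, 0.9191, 1.2313,
          0.0208, 3.5036, 1.7864, 0.2387, 0.2007, 0.2026, 0.3628, 0.7439, 0.5655, 0.3442]
after = [1.4196, 0.2254, 0.5183, 0.6844, 0.9134, 2.3069, 0.3341, 1.083, 1.3463, 0.0713,
         1.403, 0.2804, 0.1009, 4.4605, 5.0559, 2.4785, 0.5449, 0.1542, 1.7291, 0.8701]


def describe(values):
    """Return a small dictionary of summary statistics for one column."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }


summary = {"before": describe(before), "after": describe(after)}
for name, stats in summary.items():
    print(name, {key: round(value, 4) for key, value in stats.items()})
```

In both columns the mean sits well above the median, which is an early hint of right skew.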

Step 2: Check Data for Normal Distribution
Next, we need to see whether our dataset follows a normal or non-normal distribution. We can perform a Shapiro-Wilk test to determine the distribution type:


The Shapiro-Wilk test came back significant for both the before and after data, so we can reject the null hypothesis of normality and conclude that the data are not normally distributed. Additionally, the histograms show the data is skewed to the right. We can also check for outliers in this dataset by prompting Julius!
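For readers who prefer code, the normality check can be sketched with SciPy's `scipy.stats.shapiro`. This assumes SciPy is installed; the 0.05 cutoff and the variable names are my assumptions, and Julius may run the check differently behind the scenes.

```python
from scipy import stats

# Sprint times in seconds, copied from the table above (athlete ids 1-20).
before = [0.4693, 3.0101, 1.3167, 0.9129, 0.1696, 0.1696, 0.0598, 2.0112, 0.9191, 1.2313,
          0.0208, 3.5036, 1.7864, 0.2387, 0.2007, 0.2026, 0.3628, 0.7439, 0.5655, 0.3442]
after = [1.4196, 0.2254, 0.5183, 0.6844, 0.9134, 2.3069, 0.3341, 1.083, 1.3463, 0.0713,
         1.403, 0.2804, 0.1009, 4.4605, 5.0559, 2.4785, 0.5449, 0.1542, 1.7291, 0.8701]

alpha = 0.05  # assumed significance level
shapiro_results = {}
for name, values in (("before", before), ("after", after)):
    # Null hypothesis: the sample was drawn from a normal distribution.
    statistic, p_value = stats.shapiro(values)
    shapiro_results[name] = (statistic, p_value)
    verdict = "reject normality" if p_value < alpha else "no evidence against normality"
    print(f"{name}: W = {statistic:.3f}, p = {p_value:.4f} -> {verdict}")
```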

We have about four outliers in this dataset, which is fine because the Wilcoxon signed-rank test can handle them!
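The guide does not say which rule Julius used to flag outliers. As one common option, here is a sketch of Tukey's 1.5 × IQR rule using only the standard library; the `iqr_outliers` helper and the choice of quartile method are my assumptions.

```python
import statistics

# Sprint times in seconds, copied from the table above (athlete ids 1-20).
before = [0.4693, 3.0101, 1.3167, 0.9129, 0.1696, 0.1696, 0.0598, 2.0112, 0.9191, 1.2313,
          0.0208, 3.5036, 1.7864, 0.2387, 0.2007, 0.2026, 0.3628, 0.7439, 0.5655, 0.3442]
after = [1.4196, 0.2254, 0.5183, 0.6844, 0.9134, 2.3069, 0.3341, 1.083, 1.3463, 0.0713,
         1.403, 0.2804, 0.1009, 4.4605, 5.0559, 2.4785, 0.5449, 0.1542, 1.7291, 0.8701]


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    # "inclusive" interpolates linearly between data points for the quartiles.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


outliers = {"before": iqr_outliers(before), "after": iqr_outliers(after)}
for name, flagged in outliers.items():
    print(f"{name}: {len(flagged)} outlier(s) -> {flagged}")
```

Under this rule the two largest times in each column are flagged, which lines up with the roughly four outliers mentioned above. A different rule (for example, z-scores) could flag a different set.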

Step 3: Check for Symmetry of the Distribution of Differences

One of the assumptions of the Wilcoxon signed-rank test is that the differences between the paired measurements are symmetrically distributed around the median. To check this manually, you would subtract the after score from the before score for each pair of observations and plot the differences on a histogram. However, we can prompt Julius to create a visualization for us:

There is slight skewness in the differences, indicated by the small peaks away from the median on either side and the gap between values 2 and 4. However, this skewness is not too detrimental to the analysis. Therefore, we can continue to the final step!
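Alongside the histogram, the symmetry check can be backed up with a couple of numbers. The sketch below computes the paired differences (before minus after, as described above) and their skewness coefficient using only the standard library. Treating a skewness between -1 and 1 as "mild" is a rule of thumb I am assuming, not a formal test.

```python
import statistics

# Sprint times in seconds, copied from the table above (athlete ids 1-20).
before = [0.4693, 3.0101, 1.3167, 0.9129, 0.1696, 0.1696, 0.0598, 2.0112, 0.9191, 1.2313,
          0.0208, 3.5036, 1.7864, 0.2387, 0.2007, 0.2026, 0.3628, 0.7439, 0.5655, 0.3442]
after = [1.4196, 0.2254, 0.5183, 0.6844, 0.9134, 2.3069, 0.3341, 1.083, 1.3463, 0.0713,
         1.403, 0.2804, 0.1009, 4.4605, 5.0559, 2.4785, 0.5449, 0.1542, 1.7291, 0.8701]

# Paired differences: before minus after for each athlete.
differences = [b - a for b, a in zip(before, after)]

mean_diff = statistics.mean(differences)
median_diff = statistics.median(differences)

# Fisher-Pearson skewness coefficient: 0 for a perfectly symmetric sample.
n = len(differences)
m2 = sum((d - mean_diff) ** 2 for d in differences) / n
m3 = sum((d - mean_diff) ** 3 for d in differences) / n
skewness = m3 / m2 ** 1.5

print(f"mean difference   = {mean_diff:.4f}")
print(f"median difference = {median_diff:.4f}")
print(f"skewness          = {skewness:.3f}")
```

Flipping the sign convention (after minus before) mirrors the histogram and flips the sign of the skewness, but it does not change how symmetric the distribution is.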

Step 4: Perform the Wilcoxon Signed-Rank Test

So now we know the following about our dataset:

  1. It is paired data because it follows the same individual over a period of time, and each pair is independent of the other pairs (meaning that one runner’s score doesn’t affect another runner’s score in any way).
  2. The data is continuous.
  3. We have (mostly) symmetrical distribution of differences.
  4. The data is not normally distributed.
  5. There are some outliers in the dataset (~4)
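With the assumptions checked, the test itself can also be run in Python. This sketch assumes SciPy is installed and uses `scipy.stats.wilcoxon` with its default two-sided alternative; the variable names and the 0.05 cutoff are mine.

```python
from scipy import stats

# Sprint times in seconds, copied from the table above (athlete ids 1-20).
before = [0.4693, 3.0101, 1.3167, 0.9129, 0.1696, 0.1696, 0.0598, 2.0112, 0.9191, 1.2313,
          0.0208, 3.5036, 1.7864, 0.2387, 0.2007, 0.2026, 0.3628, 0.7439, 0.5655, 0.3442]
after = [1.4196, 0.2254, 0.5183, 0.6844, 0.9134, 2.3069, 0.3341, 1.083, 1.3463, 0.0713,
         1.403, 0.2804, 0.1009, 4.4605, 5.0559, 2.4785, 0.5449, 0.1542, 1.7291, 0.8701]

# Two-sided test on the paired differences (before - after).
# For a two-sided test the statistic is the smaller of the two signed-rank sums.
result = stats.wilcoxon(before, after)

alpha = 0.05  # assumed significance level
print(f"statistic = {result.statistic}, p-value = {result.pvalue:.3f}")
if result.pvalue < alpha:
    print("Significant change in sprint times.")
else:
    print("No significant change in sprint times.")
```

The p-value SciPy prints can differ slightly depending on whether it uses an exact or a normal-approximation calculation, so small differences from the figure Julius reports are not a cause for concern.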

Running the Wilcoxon signed-rank test in Julius gives a non-significant result, so according to the findings our superhuman runners are too good for the training regime. We would report the results as follows:

“A Wilcoxon signed-rank test was used to assess the effectiveness of a new training regime on the performance of superhuman cross-country runners. The test revealed a statistic of 85.0, with a p-value of 0.475, indicating there was no significant change in sprint performance after implementing the training regime (p > 0.05).”
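To see where the reported statistic comes from, here is a hand-rolled sketch of the signed-rank calculation in plain Python: rank the absolute differences, then sum the ranks separately for positive and negative differences. Both helper functions are my own illustration, and the exact p-value routine assumes no tied or zero differences, which holds for this dataset.

```python
# Sprint times in seconds, copied from the table above (athlete ids 1-20).
before = [0.4693, 3.0101, 1.3167, 0.9129, 0.1696, 0.1696, 0.0598, 2.0112, 0.9191, 1.2313,
          0.0208, 3.5036, 1.7864, 0.2387, 0.2007, 0.2026, 0.3628, 0.7439, 0.5655, 0.3442]
after = [1.4196, 0.2254, 0.5183, 0.6844, 0.9134, 2.3069, 0.3341, 1.083, 1.3463, 0.0713,
         1.403, 0.2804, 0.1009, 4.4605, 5.0559, 2.4785, 0.5449, 0.1542, 1.7291, 0.8701]


def signed_rank_sums(x, y):
    """Return (W_plus, W_minus, n) for paired samples, dropping zero differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    # Rank absolute differences from smallest (rank 1) to largest;
    # tied values share the average of the ranks they span.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


def exact_two_sided_p(w, n):
    """Exact two-sided p-value for w = min(W+, W-) with n untied, non-zero pairs."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of the ranks {1, ..., n} that sum to s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        for total in range(max_sum, rank - 1, -1):
            counts[total] += counts[total - rank]
    lower_tail = sum(counts[: int(w) + 1])
    return min(1.0, 2 * lower_tail / 2 ** n)


w_plus, w_minus, n_pairs = signed_rank_sums(before, after)
statistic = min(w_plus, w_minus)
p_value = exact_two_sided_p(statistic, n_pairs)
print(f"W+ = {w_plus}, W- = {w_minus}, statistic = {statistic}, p = {p_value:.3f}")
```

The smaller of the two rank sums is the statistic quoted in the report. The two sums always add up to n(n + 1)/2, which is 210 for 20 pairs, so one is a quick check on the other.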

After the analysis
For fun, let’s say that we did find statistically significant differences between before and after sprint times. What would we do next? I asked Julius to come up with a list of potential post hoc tests that we could use to further analyze the data:


Julius has provided us with a nice list of potential post hoc tests suitable for our dataset. You would choose one (or several) depending on your specific question and what you are trying to see. But it is nice to have options!

Keywords: AI Statistics, GPT, Wilcoxon signed-rank test, paired data, data distribution, non-normal distribution, Shapiro-Wilk test, post hoc test, statistical analysis

Reference & Further Reading

Durango, Ana & Refugio, Craig. (2018). An Empirical Study on Wilcoxon Signed Rank Test. 10.13140/RG.2.2.13996.51840.


TLDR let me know if I missed anything.