Guide: Two-way ANOVA

Do pollution and temperature affect a newly discovered (fictional!) aquatic species, Pollutionia vertia? How can we test whether the two factors interact? The answer: a two-way ANOVA!

A two-way ANOVA is a statistical method that is used to analyze the influence of two categorical independent variables on a continuous dependent variable. It assesses whether there are significant differences between the means of the dependent variable across the different groups or categories being studied. It also evaluates if there is an interaction effect between the two independent variables. How does it work? Let’s take a look!

Prompt: Fish4Life, a local fishermen's group in your area, has approached your company to conduct a study on aquatic vertebrates and the impact of pollution and temperature levels on their growth. They are particularly interested in the species Pollutionia vertia, a new species of fish that is near-threatened.

To control for various outside variables, you decide to run an experiment in a laboratory setting. You randomly select 50 individuals to use in your study, recording temperature (high, low), pollution level (high, moderate, low), and the growth of each fish. After the study, you administer a miracle supplement to cleanse the fish of any negative effects of pollution and temperature before releasing them into the wild (by law this is prohibited, but this is fictional). Below is a snippet of the dataset you collected:

temperature   pollution_level   species_growth
low           high              10.17
high          high              9.40
high          high              10.18
high          high              6.02
low           moderate          9.56
low           high              10.71
low           low               12.96
high          low               8.96
high          low               8.38

Determine if there is a significant interaction between temperature and pollution on the growth of Pollutionia vertia.

Independent categorical variables: temperature and pollution_level
Dependent continuous variable: species_growth
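If you would like to follow along outside of Julius, here is a minimal sketch of how the snippet above could be loaded in Python with pandas. It contains only the nine rows shown in this guide, not the full 50-fish dataset, and the variable names (snippet, cell_counts) are my own choices for illustration.

```python
import pandas as pd

# The nine rows shown in the snippet above (not the full 50-fish dataset).
snippet = pd.DataFrame(
    {
        "temperature": ["low", "high", "high", "high", "low", "low", "low", "high", "high"],
        "pollution_level": ["high", "high", "high", "high", "moderate", "high", "low", "low", "low"],
        "species_growth": [10.17, 9.40, 10.18, 6.02, 9.56, 10.71, 12.96, 8.96, 8.38],
    }
)

# Mark the two independent variables as categorical so they are treated as groups.
for factor in ["temperature", "pollution_level"]:
    snippet[factor] = snippet[factor].astype("category")

# Count how many snippet rows fall in each temperature x pollution combination.
cell_counts = snippet.groupby(["temperature", "pollution_level"], observed=True).size()
print(cell_counts)
```

Counting rows per combination is a quick way to see how the design is filled in before running any test.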

Assumptions of Two-way ANOVA
The assumptions of a two-way ANOVA are similar to those of the one-way ANOVA. Let’s recap:

1. Independence
Observations must be independent of one another. In this example, we know the study was conducted under laboratory conditions, and for simplicity’s sake, we can assume that the species were all kept in separate tanks and fed the same amount of food. In theory, they should have uniform growth (but genetics usually interferes).
Additionally, your errors or residuals should be independent of each other. This is also related to your experimental setup, and we can assume that this is true for this example.

2. Normality
The two-way ANOVA requires the residuals (the differences between the observed and predicted values) to be approximately normally distributed. Let’s test it:

Prompt: ‘Run a normality test on each combination of categorical variables and then a levenes test please on this dataset (levene’s test results in 3).’


It passes all normality tests!
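For readers who want to see what such a check might look like in code, here is a rough sketch using the Shapiro-Wilk test from SciPy. The choice of Shapiro-Wilk is my assumption about the normality test, since the Julius output does not appear in this text. The data is SYNTHETIC (the full dataset is not reproduced in this guide), and the cell sizes, means, and spread are assumptions picked only so the code runs.

```python
import numpy as np
import pandas as pd
from scipy import stats

# SYNTHETIC stand-in data: 50 fish spread over the 2 x 3 design.
# Cell sizes, means, and spread are assumptions, not the study's values.
rng = np.random.default_rng(42)
cells = [("low", "low", 9), ("low", "moderate", 8), ("low", "high", 8),
         ("high", "low", 9), ("high", "moderate", 8), ("high", "high", 8)]
means = {"low": 10.5, "high": 9.5}
rows = [(temp, poll, g) for temp, poll, n in cells
        for g in rng.normal(means[temp], 1.5, size=n)]
df = pd.DataFrame(rows, columns=["temperature", "pollution_level", "species_growth"])

# Shapiro-Wilk test within each temperature x pollution combination.
cell_pvalues = {}
for cell, group in df.groupby(["temperature", "pollution_level"]):
    _, p_value = stats.shapiro(group["species_growth"])
    cell_pvalues[cell] = p_value

# Residuals of the full two-way model are deviations from the cell means.
cell_means = df.groupby(["temperature", "pollution_level"])["species_growth"].transform("mean")
residuals = df["species_growth"] - cell_means
_, residual_pvalue = stats.shapiro(residuals)

for cell, p_value in cell_pvalues.items():
    print(cell, round(p_value, 3))
print("pooled residuals:", round(residual_pvalue, 3))
```

A p-value above 0.05 means the test found no evidence against normality for that group. With only eight or nine fish per combination, the per-cell tests have little power, which is why the pooled residuals are checked as well.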

3. Homogeneity of Variance (Homoscedasticity)
The variance of the dependent variable should be equal across all combinations of levels of the independent variables. This is important to test for because unequal variances can inflate the Type I error rate (false-positive results) and can also affect the power of the analysis. Levene’s test, requested in the prompt above, checks this.
Our dataset passes this test as well!
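Levene’s test is available in SciPy as scipy.stats.levene, so a check like this one can be sketched as follows. The data is again SYNTHETIC, with assumed cell sizes, means, and spread, because the full dataset is not reproduced here. Centering on the median is my choice for the sketch, not something the guide specifies.

```python
import numpy as np
import pandas as pd
from scipy import stats

# SYNTHETIC stand-in data: 50 fish spread over the 2 x 3 design.
# Cell sizes, means, and spread are assumptions, not the study's values.
rng = np.random.default_rng(42)
cells = [("low", "low", 9), ("low", "moderate", 8), ("low", "high", 8),
         ("high", "low", 9), ("high", "moderate", 8), ("high", "high", 8)]
means = {"low": 10.5, "high": 9.5}
rows = [(temp, poll, g) for temp, poll, n in cells
        for g in rng.normal(means[temp], 1.5, size=n)]
df = pd.DataFrame(rows, columns=["temperature", "pollution_level", "species_growth"])

# One array of growth values per temperature x pollution combination.
groups = [group["species_growth"].to_numpy()
          for _, group in df.groupby(["temperature", "pollution_level"])]

# Levene's test for equal variances across the six groups (median-centered).
levene_stat, levene_p = stats.levene(*groups, center="median")
print("Levene W =", round(levene_stat, 3), "p =", round(levene_p, 3))
```

A p-value above 0.05 here means there is no evidence that the group variances differ, which is the outcome you want before running the ANOVA.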

4. No significant interaction
This point relates to the interaction effect between the two independent variables: whether the effect of one independent variable on the dependent variable is consistent across all levels of the other independent variable. Unlike the first three assumptions, it is not checked beforehand, because the two-way ANOVA itself tests it. If the interaction p-value is less than 0.05 (the significance level, also referred to as alpha or α), the interaction is significant, and the main effects should then be interpreted with caution.

Let’s perform the test!

Question 1: Do temperature and/or pollution level affect fish growth?

Prompt 1: ‘Can you run a two-way ANOVA to test to see if temperature and pollution_level has a significant effect on species_growth?’

Watch this magic:

NICE! Julius has run a lovely two-way ANOVA for us. Let’s break down the results:

Temperature alone has a significant effect on the growth of our species (F(1,44) = 4.820, p = 0.033). However, neither pollution level nor the interaction between temperature and pollution level is statistically significant (Pollution: F(2,44) = 1.257, p = 0.295; Interaction: F(2,44) = 2.105, p = 0.134).

What now? Julius suggested exploring the relationship between temperature and species growth in more detail. As a curious scientist, I used that prompt to examine this finding in more detail:

Prompt: ‘Explore the relationship between temperature and species growth in more detail.’

The first thing Julius provides me with is a graph (who would have thought!). This bar graph highlights the differences detected via the two-way ANOVA, comparing species growth between two different temperature conditions. Pollutionia vertia appears to grow larger at lower temperatures compared to the individuals exposed to higher temperatures.

Julius then performs a t-test to compare species growth between the two temperature groups. The t-test indicated a statistically significant difference (t(48) = 2.323, p = 0.024).
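A follow-up comparison like this can be sketched with scipy.stats.ttest_ind. The 48 degrees of freedom reported above are consistent with a pooled-variance (Student's) t-test on two temperature groups totalling 50 fish, so that is what I assume here. The data is SYNTHETIC, with assumed group sizes, means, and spread, so the t and p values will differ from the reported ones.

```python
import numpy as np
import pandas as pd
from scipy import stats

# SYNTHETIC stand-in data: 50 fish spread over the 2 x 3 design.
# Cell sizes, means, and spread are assumptions, not the study's values.
rng = np.random.default_rng(42)
cells = [("low", "low", 9), ("low", "moderate", 8), ("low", "high", 8),
         ("high", "low", 9), ("high", "moderate", 8), ("high", "high", 8)]
means = {"low": 10.5, "high": 9.5}
rows = [(temp, poll, g) for temp, poll, n in cells
        for g in rng.normal(means[temp], 1.5, size=n)]
df = pd.DataFrame(rows, columns=["temperature", "pollution_level", "species_growth"])

# Split growth by temperature, ignoring pollution level.
low = df.loc[df["temperature"] == "low", "species_growth"]
high = df.loc[df["temperature"] == "high", "species_growth"]

# Pooled-variance (Student's) independent-samples t-test.
t_stat, p_value = stats.ttest_ind(low, high, equal_var=True)
dof = len(low) + len(high) - 2  # degrees of freedom for the pooled test

print(f"t({dof}) = {t_stat:.3f}, p = {p_value:.3f}")
print("mean growth, low vs high:", round(low.mean(), 2), round(high.mean(), 2))
```

A positive t statistic with the low-temperature group listed first means the low-temperature fish grew more on average.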

Recap of Findings
We found that only temperature influenced the growth of Pollutionia vertia, with fish growing larger at cooler temperatures (F(1,44) = 4.820, p = 0.033). We also found that neither pollution level nor the interaction between the two factors was statistically significant (Pollution: F(2,44) = 1.257, p = 0.295; Interaction: F(2,44) = 2.105, p = 0.134).

We then used a t-test to confirm the relationship between temperature and species growth. The t-test further confirmed our two-way ANOVA results.
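As a small sanity check, the reported p-values can be recomputed from the reported test statistics and degrees of freedom alone, without the dataset. The sketch below uses the survival functions of the F and t distributions in SciPy. The dictionary layout and variable names are mine, and the numbers are copied from the results above.

```python
from scipy import stats

# (F statistic, numerator df, denominator df, reported p) from the results above.
reported = {
    "temperature": (4.820, 1, 44, 0.033),
    "pollution_level": (1.257, 2, 44, 0.295),
    "interaction": (2.105, 2, 44, 0.134),
}

# Upper-tail probability of the F distribution gives the ANOVA p-value.
recomputed = {
    name: stats.f.sf(f_value, df_num, df_den)
    for name, (f_value, df_num, df_den, _) in reported.items()
}

# Two-sided p-value for the follow-up t-test, t(48) = 2.323.
t_test_p = 2 * stats.t.sf(2.323, 48)

for name, p_value in recomputed.items():
    print(name, "reported:", reported[name][3], "recomputed:", round(p_value, 3))
print("t-test reported: 0.024 recomputed:", round(t_test_p, 3))
```

The residual degrees of freedom of 44 also follow from the design: 50 fish minus the 2 x 3 = 6 group means being estimated.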

Thanks for joining me on another statistical analysis journey!

Keywords: AI statistics, AI statistical analysis, two-way anova, interaction effect, normality test, homoscedasticity

Reference

  1. Mishra, P., Singh, U., Pandey, C. M., Mishra, P., & Pandey, G. (2019). Application of student’s t-test, analysis of variance, and covariance. Annals of Cardiac Anaesthesia, 22(4), 407–411.