The ANOVAs: A Breakdown

Welcome to the ANOVAs! This is a group of parametric tests that look at multiple groups. If you have a complex dataset and are thinking “where do I even start?”, look no further! This information session can help you determine if your data is suitable for ANOVA analysis.

The ANOVAs
ANOVA stands for Analysis of Variance, and this statistical analysis is used to analyze the differences among the means of two or more groups or treatments. ANOVAs are widely used in many fields of research, but they are particularly useful in experimental observation studies (like my thesis!).

The nitty-gritty of how ANOVAs work is that they examine the variance WITHIN each group as well as the variance BETWEEN groups. This was hard for me to wrap my head around when I had to perform a repeated-measures ANOVA the first time, but I can assure you that it is not that scary once you understand it. Let’s break it down a bit:

Let’s say you are conducting a study to compare the exam scores of students from three different schools: School A, School B and School C. You have collected the exam scores of students from each school and want to determine if there are significant differences in the average exam scores amongst the three schools. When you test for BETWEEN-GROUP variation, you’re testing the variance in the mean exam scores across ALL THREE schools. When you test for WITHIN-GROUP variation, you are testing the variation in exam scores only in a specific group. In other words, within-group variance involves examining the variance or spread of exam scores within School A, then separately in School B and then C.

Calculating the within-group variance for each school individually gives us insight into how consistent or variable the scores are within each specific school. A higher within-group variance suggests that exam scores within that particular school are more spread out, meaning students’ performance on the exam varies more within that school.

ANOVA compares the magnitude of the between-group variation to the within-group variation. If the between-group variation is significantly larger than the within-group variation, it suggests that there are genuine differences in the means of the groups. If the between-group variation is small compared to the within-group variation, it just means that individuals within each group vary in their scores or performance, but there is no evidence that the overall averages differ across the three groups. Keep in mind that the within-group variation isn’t tested for significance on its own; it is the yardstick that the between-group variation gets measured against.
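
For anyone who likes to see it written out, the comparison described above is usually expressed as an F-ratio. This is the standard textbook form, with symbols I am introducing here: k is the number of groups, n_j is the size of group j, N is the total number of observations, \bar{x}_j is the mean of group j, and \bar{x} is the grand mean.

```latex
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{\sum_{j=1}^{k} n_j \,(\bar{x}_j - \bar{x})^2 \,/\, (k - 1)}
         {\sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2 \,/\, (N - k)}
```

An F close to 1 means the group means are about as far apart as the spread inside the groups would predict. An F well above 1 is what points toward genuine differences between the groups.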

Types of ANOVAs
Unsurprisingly enough, statisticians have created an ARRAY of ANOVAs for different experimental setups. This was SUPER INTIMIDATING when I had to first run statistical analyses for my thesis because there were so many options to choose from. Lucky for you, I’m here to help, so you don’t have to struggle bus as hard as I did. Here I’ll list the common types of ANOVA analyses, and when you should use them:

1. One-Way ANOVA (The OG)
This is also known as a single-factor ANOVA, because you use it when you have one categorical independent variable (usually with three or more levels, since with only two levels it gives the same answer as an independent t-test) and one continuous dependent variable. It tests for differences in means among the levels of the independent variable.

Example:
*When running statistical analyses, you should change your data to be more statistically “friendly”. This means assigning a number to each level of your categorical variables. This will be done for all examples. See below:

Independent variable: Treatment (with 3 levels: control = 1; low dose = 2; high dose = 3)
Dependent variable: Blood pressure (continuous variable)
treatment blood_pressure
1 120
1 118
2 128
2 122
3 135
3 140
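
Here is a minimal sketch of what a one-way ANOVA computes on the little table above, written in plain Python so you can see every step. The function name `one_way_f` is my own invention for illustration. It only gets you the F statistic; for a p-value you would lean on a statistics library (SciPy's `scipy.stats.f_oneway` is one option).

```python
from statistics import mean

# Blood pressure readings from the example table, keyed by treatment code
groups = {
    1: [120, 118],  # control
    2: [128, 122],  # low dose
    3: [135, 140],  # high dose
}


def one_way_f(groups):
    """Return (F, df_between, df_within) for a dict of group -> list of values.

    Hypothetical helper for illustration, not part of any library.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)

    # Between-group sum of squares: how far each group mean sits from the grand mean
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values()
    )
    # Within-group sum of squares: spread of readings around their own group mean
    ss_within = sum(
        (v - mean(values)) ** 2 for values in groups.values() for v in values
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_b, df_w = one_way_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With only two readings per treatment this is a toy example, so treat the number as a demonstration of the arithmetic rather than a real result.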

2. Two-Way ANOVA
Also known as factorial ANOVA, this is used when you have two categorical independent variables (factors) and one continuous dependent variable. It examines the main effect of each independent variable as well as the interaction between them.

Example:
Independent variables: Treatment (control = 1; experimental = 2) and Gender (male = 1, female = 2)
Dependent variable: Weight (continuous)

treatment weight gender
1 120 1
1 118 1
1 128 2
2 122 2
2 135 2
2 140 1
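
To get a feel for what "main effect" and "interaction" mean before running the actual test, here is a rough descriptive sketch using the table above. It only compares cell means and is not the two-way ANOVA itself; the variable names are mine, and the real test would come from your stats software.

```python
from collections import defaultdict
from statistics import mean

# Rows from the example table: (treatment, weight, gender)
rows = [
    (1, 120, 1),
    (1, 118, 1),
    (1, 128, 2),
    (2, 122, 2),
    (2, 135, 2),
    (2, 140, 1),
]

# Collect the weights that fall into each treatment x gender cell
cells = defaultdict(list)
for treatment, weight, gender in rows:
    cells[(treatment, gender)].append(weight)

cell_means = {cell: mean(values) for cell, values in cells.items()}

# Effect of treatment (experimental minus control) within each gender
effect_male = cell_means[(2, 1)] - cell_means[(1, 1)]
effect_female = cell_means[(2, 2)] - cell_means[(1, 2)]

# If the treatment effect looks very different for males and females,
# that is a hint (not proof) of an interaction
interaction_contrast = effect_female - effect_male

for cell in sorted(cell_means):
    print(f"treatment={cell[0]} gender={cell[1]} mean={cell_means[cell]}")
print("treatment effect (male):", effect_male)
print("treatment effect (female):", effect_female)
print("difference between them:", interaction_contrast)
```

Notice that the example table is unbalanced (some cells have one observation, others have two), which is something a proper two-way ANOVA has to account for and this quick look does not.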

3. Repeated Measures ANOVA (The Bane of my Existence)
This pesky lil’ fella’ is used when you have one group of participants measured multiple times under different conditions or at different time points… or both. This test handles autocorrelation (the tendency of data points taken close together in time to be related to one another) by not treating each measurement as completely independent. What does this mean?

Story time: Imagine I’m looking at plant heights over several weeks, and instead of measuring each plant once, I measure it every week (like a good biologist does). Since I’m measuring the same plant repeatedly, each measurement might be related to the previous ones (autocorrelation). A repeated measures ANOVA takes this into account: it looks at how the measurements within each plant change over time and compares these changes between different plants/conditions. It adjusts the analysis to handle the fact that the measurements are related to each other over time.

Example:
Participant ID: Replicate (plant number within each treatment)
Conditions: treatments 1, 2, 3
Dependent variable: Heights at 3 times

treatment replicate height_1 height_2 height_3
1 1 2 4 6
1 2 3 6 9
1 3 1 2 3
2 1 5 10 15
2 2 7 14 21
2 3 8 16 24
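
One practical note: the table above is in "wide" format (one row per plant, one column per time point). Many tools want repeated-measures data in "long" format instead, with one row per measurement. That is an assumption about your software, so check what yours expects. A minimal sketch of the reshaping in plain Python, where the `plant` ID is a label I made up by combining treatment and replicate:

```python
# Rows from the example table: (treatment, replicate, height_1, height_2, height_3)
wide_rows = [
    (1, 1, 2, 4, 6),
    (1, 2, 3, 6, 9),
    (1, 3, 1, 2, 3),
    (2, 1, 5, 10, 15),
    (2, 2, 7, 14, 21),
    (2, 3, 8, 16, 24),
]

long_rows = []
for treatment, replicate, *heights in wide_rows:
    # Hypothetical plant ID: replicate numbers restart in each treatment,
    # so both are needed to identify a unique plant
    plant_id = f"T{treatment}R{replicate}"
    for time_point, height in enumerate(heights, start=1):
        long_rows.append(
            {
                "plant": plant_id,
                "treatment": treatment,
                "time": time_point,
                "height": height,
            }
        )

# Show the three measurements belonging to the first plant
for row in long_rows[:3]:
    print(row)
```

The important part is that every measurement keeps its plant ID, because that ID is how the analysis knows which measurements belong together.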

4. Mixed ANOVA
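This is used when you have at least one between-subjects factor (different groups of participants, like control vs. experimental) and at least one within-subjects factor (the same participants measured multiple times under different conditions or at different time points). It assesses if there are significant differences in the dependent variable between the groups, across the conditions or time points, and whether the change over time differs between the groups (the interaction).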

Example:
Independent Variables: Groups (Control = 1; Experimental = 2), and Time (Pre-test = 1; Post-test = 2)
Dependent Variable: Anxiety level (continuous)

group time anxiety_level
1 1 50
1 2 45
1 1 55
2 2 40
2 1 55
2 2 45

5. Multivariate ANOVA (MANOVA)
This is used when you have two or more dependent variables and one or more categorical independent variables. This assesses the overall group differences across dependent variables at the same time.

Example:
Independent Variable: Treatment (control = 1; experimental = 2)
Dependent Variables: Blood Pressure, Heart Rate

treatment blood_pressure heart_rate
1 120 70
1 118 68
1 122 72
2 130 75
2 128 73
2 132 78
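
If you are curious what "across dependent variables at the same time" looks like in numbers, here is a sketch of Wilks' lambda, one of the common MANOVA test statistics, computed by hand for the table above. It is written for exactly two dependent variables so the matrices stay 2x2, the helper names `sscp` and `det2` are my own, and it stops short of the significance test.

```python
from statistics import mean

# Rows from the example table: (treatment, blood_pressure, heart_rate)
rows = [
    (1, 120, 70), (1, 118, 68), (1, 122, 72),
    (2, 130, 75), (2, 128, 73), (2, 132, 78),
]


def sscp(points):
    """Sums of squares and cross-products of (x, y) points around their mean.

    Returns (sxx, sxy, syy), i.e. the symmetric matrix [[sxx, sxy], [sxy, syy]].
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, sxy, syy


def det2(sxx, sxy, syy):
    """Determinant of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]]."""
    return sxx * syy - sxy ** 2


# Total SSCP: every observation around the grand mean
total = sscp([(bp, hr) for _, bp, hr in rows])

# Within SSCP: each group around its own mean, summed over groups
within = (0.0, 0.0, 0.0)
for g in sorted({t for t, _, _ in rows}):
    group_points = [(bp, hr) for t, bp, hr in rows if t == g]
    within = tuple(w + s for w, s in zip(within, sscp(group_points)))

# Wilks' lambda: share of the total (generalized) variance left unexplained by group
wilks_lambda = det2(*within) / det2(*total)
print(f"Wilks' lambda = {wilks_lambda:.4f}")
```

Lambda runs from 0 to 1. Values near 0 mean the groups explain most of the combined variation in blood pressure and heart rate, and values near 1 mean they explain very little.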

6. Analysis of Covariance (ANCOVA)
This combines some features of ANOVA and regression analysis. You can use this when you have one or more continuous covariates (predictor variables) in addition to one or more categorical independent variables. It helps control for the effects of the covariates on the dependent variable while you look at the group differences.

Example:
Independent Variable: Treatment (control = 1; experimental = 2)
Covariate: Age (continuous)
Dependent Variable: Weight loss (continuous)

treatment age weight_loss
1 35 2.5
1 40 3.0
1 45 2.0
2 35 4.0
2 40 3.5
2 45 4.5
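
To show what "controlling for" a covariate does, here is a rough sketch that computes covariate-adjusted group means for the table above. It assumes the relationship between age and weight loss has the same slope in every group (the usual ANCOVA assumption), and it only produces the adjusted means, not the significance test. Variable names are mine.

```python
from statistics import mean

# Rows from the example table: (treatment, age, weight_loss)
rows = [
    (1, 35, 2.5), (1, 40, 3.0), (1, 45, 2.0),
    (2, 35, 4.0), (2, 40, 3.5), (2, 45, 4.5),
]

grand_age = mean(age for _, age, _ in rows)
groups = sorted({t for t, _, _ in rows})

# Pooled within-group slope of weight loss on age
# (assumes the same slope in every group)
sxy = 0.0
sxx = 0.0
group_stats = {}
for g in groups:
    ages = [age for t, age, _ in rows if t == g]
    losses = [loss for t, _, loss in rows if t == g]
    age_mean, loss_mean = mean(ages), mean(losses)
    sxy += sum((a - age_mean) * (y - loss_mean) for a, y in zip(ages, losses))
    sxx += sum((a - age_mean) ** 2 for a in ages)
    group_stats[g] = (age_mean, loss_mean)

slope = sxy / sxx

# Adjusted mean: what each group's mean would be at the overall average age
adjusted = {
    g: loss_mean - slope * (age_mean - grand_age)
    for g, (age_mean, loss_mean) in group_stats.items()
}
print("pooled slope:", slope)
print("adjusted means:", adjusted)
```

In this particular table both groups have exactly the same ages, so the adjustment leaves the group means where they were. The adjustment starts to matter when one group is older or younger than the other on average.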

7. Multivariate Analysis of Covariance (MANCOVA)
This is an extension of the MANOVA test that includes covariates (variables that are thought to have an effect on the dependent variables, but are not something we are inherently testing for). It allows for the assessment of group differences across multiple dependent variables while simultaneously controlling for the effects of the covariates.

Example:
Independent Variable: Treatment (control = 1; experimental = 2)
Covariate: Age (continuous)
Dependent Variables: Blood Pressure, Heart Rate

treatment age blood_pressure heart_rate
1 35 120 70
1 40 118 68
1 45 122 72
2 35 130 75
2 40 128 73
2 45 132 78

Thanks for joining me on this wild ride! I hope this little information session has helped you figure out what test you can use on your dataset!

Keywords: AI statistics, AI statistical analysis, GPT, Statistical Analysis, Parametric Tests, ANOVA, Analysis of Variance, one-way ANOVA, two-way ANOVA, repeated measures ANOVA, mixed ANOVA, multivariate ANOVA, analysis of covariance

This is a very helpful guide for anyone looking to start using ANOVAs in Julius - thank you for putting it together. I love the anecdotes from your own thesis and personal experience.

Thank you! I appreciate the compliment and I’m glad it is helpful! :grin:

Thank you for putting this together! It is a really nice summary and I love how you used the example datasets to highlight what each ANOVA can handle. I’m a visual learner, so this is super helpful!

Hi @Alysha
Regarding the one-way ANOVA example: I think each treatment is repeated, perhaps because of the timing? Do you think this would be more appropriate for understanding the dataset better? I couldn’t find a reason why it is repeated, so I just made these changes. Would love to hear from you:

Hi Mahmed,

Looks like you’re on the right track! I’ll explain some different tests you can use for this dataset based on what you have.

As you pointed out already, your dataset involves time-specific measurements (morning and night). The analysis you choose for this could involve paired t-tests, or an ANOVA if there are more than two time points in the full dataset. If the morning and night readings are from the same individual, I would suggest a paired t-test (paired observations here). If they are not from the same individual, I suggest using an independent t-test or a two-way ANOVA. Based on how your dataset is set up, I would suggest the following:

  1. Independent t-test: use this to compare blood pressure readings between morning and night for each treatment group. This test is suitable when two samples do not come from the same person.
  2. Two-way ANOVA: you can use this to analyze the effects of the treatment and time of day on blood pressure. This would allow you to examine both main effects for each factor and their interaction without assuming they are paired.

You could technically use a one-way ANOVA if your primary interest is to just compare the means of blood pressure across different groups, and you’re not interested in time. However, if both treatment type and time point are of interest and you want to know how these factors interact in their effect on blood pressure, a two-way ANOVA would be recommended.

Hope this helps clear some things up for you!