ANOVA (Analysis of Variance)

ANOVA is a statistical technique that assesses potential differences in a scale-level dependent variable by a nominal-level variable having two or more categories. For example, an ANOVA can examine potential differences in IQ scores by country (US vs. Canada vs. Italy vs. Spain). Developed by Ronald Fisher in 1918, ANOVA extends the t and z tests, which are limited to nominal-level variables with only two categories. This test is also called the Fisher analysis of variance.


General Purpose of ANOVA

Researchers and students use ANOVA in many ways. The use of ANOVA depends on the research design. Commonly, ANOVAs are used in three ways: one-way ANOVA, two-way ANOVA, and n-way ANOVA.


"One-way" refers to the number of independent variables, not the number of categories within each variable. A one-way ANOVA has just one independent variable. For example, differences in IQ can be assessed by country, and that country variable can contain two, twenty, or more categories.
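
A one-way ANOVA like this can be run in a few lines. Below is a minimal sketch in Python using `scipy.stats.f_oneway`; the IQ scores are invented for illustration, not real data:

```python
# One-way ANOVA: does mean IQ differ across three country groups?
from scipy import stats

# Hypothetical IQ scores sampled in three countries (illustrative only)
us = [102, 98, 110, 105, 95, 101]
canada = [99, 104, 97, 108, 100, 103]
italy = [96, 101, 94, 99, 105, 98]

# f_oneway takes one sample per group; any number of groups is allowed
f_stat, p_value = stats.f_oneway(us, canada, italy)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
```

Adding a fourth or fifth country is just a matter of passing another sample to the same call.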


A two-way ANOVA refers to an ANOVA using two independent variables. Expanding the example above, a two-way ANOVA can examine differences in IQ scores (the dependent variable) by country (independent variable 1) and gender (independent variable 2). Two-way ANOVAs can also be used to examine the interaction between the two independent variables. Interactions indicate that differences are not uniform across all categories of the independent variables. For example, females may have higher IQ scores overall than males, but the gap may be much larger in European countries than in North American countries.

Two-way ANOVAs are also called factorial ANOVAs. Factorial ANOVAs can be balanced (the same number of participants in each group) or unbalanced (different numbers of participants in each group). Unequal group sizes can make it appear that there is an effect when there may not be one. There are several things a researcher can do to address this problem:

  • Discard cases (undesirable)
  • Conduct a special kind of ANOVA which can deal with the unbalanced design

There are three types of ANOVA that can handle an unbalanced design: the Classical Experimental design (Type 2 analysis), the Hierarchical approach (Type 1 analysis), and the Full Regression approach (Type 3 analysis). Which approach to use depends on whether the unbalanced data occurred on purpose.

Rules of Thumb

-If the data was not intended to be unbalanced but you can argue some type of hierarchy among the factors, use the Hierarchical approach (Type 1).

-If the data was not intended to be unbalanced and you cannot find any hierarchy, use the Classical Experimental approach (Type 2).

-If the data is unbalanced on purpose, reflecting the population itself, use the Full Regression approach (Type 3).


A researcher can also use many independent variables at once; this is an n-way ANOVA. For example, potential differences in IQ scores can be examined by country, gender, age group, ethnicity, etc., simultaneously.

General Purpose - Procedure

Omnibus ANOVA test:

In an ANOVA, a researcher first sets up the null and alternative hypotheses. The null hypothesis assumes that there is no significant difference among the groups; the alternative hypothesis assumes that there is. After cleaning the data, the researcher must test the assumptions and examine whether the data meet or violate them. The researcher then calculates (really, the software calculates) the F-ratio and the probability associated with it. Next, the researcher compares the p-value of the F-ratio with the established alpha. In general terms, if the p-value associated with F is smaller than .05, the null hypothesis is rejected and the alternative hypothesis is accepted. Rejecting the null hypothesis, one concludes that the group means are not all equal. Post-hoc tests then tell the researcher which groups differ from which others.
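
The F-ratio in this procedure is simply the between-group variance divided by the within-group variance. The sketch below computes it by hand with numpy and cross-checks the result against scipy (the scores are illustrative):

```python
# Omnibus ANOVA by hand: F = mean square between / mean square within
import numpy as np
from scipy import stats

groups = [np.array([102, 98, 110, 105]),
          np.array([99, 104, 97, 108]),
          np.array([96, 101, 94, 99])]

k = len(groups)                       # number of groups
n = sum(len(g) for g in groups)       # total observations
grand_mean = np.concatenate(groups).mean()

# Sum of squares between groups and within groups
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

f_ratio = (ss_between / (k - 1)) / (ss_within / (n - k))
p = stats.f.sf(f_ratio, k - 1, n - k)  # P(F this large or larger under H0)

# Same numbers straight from scipy
f_check, p_check = stats.f_oneway(*groups)

print(f"F = {f_ratio:.3f}, p = {p:.3f}")
print("reject H0 at alpha = .05" if p < .05 else "fail to reject H0")
```

The manual F and p match `f_oneway` exactly, since both use the F distribution with k-1 and n-k degrees of freedom.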

So what if you find statistical significance?  Multiple comparison tests

When you conduct an ANOVA, you are attempting to determine whether there is a statistically significant difference among the groups that is not due to sampling error. If you find a difference, you will then need to examine where the group differences lie.

At this point you could run post-hoc tests, which are t-tests examining mean differences between the groups. Several multiple comparison tests can be conducted that control the Type I error rate.

  • If you are concerned about violations of the assumptions, use Scheffé's test.
  • If you are not concerned about violations of the assumptions and are testing both compound and pairwise comparisons, use Dunn's test or the modified Bonferroni test.
  • If you are not concerned with violations of the assumptions and are just comparing each treatment to the control, use Dunnett's test.
  • If you are not concerned with violations of the assumptions and are comparing all possible pairwise comparisons, use Tukey's test or the modified Tukey's test.
  • If you are not concerned with violations of the assumptions and are testing more than half of the possible pairwise comparisons, again use Tukey's or the modified Tukey's test.
  • If you are not concerned with violations of the assumptions and are testing fewer than half of the possible pairwise comparisons, use Dunn's test or the modified Bonferroni test.

All of these post-hoc tests will ensure that the Type I error rate remains under control.
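
As a concrete illustration of controlling the Type I error rate, the sketch below runs all pairwise t-tests with a classical Bonferroni correction (the simplest member of this family; scipy and statsmodels also provide Tukey-style procedures). The group data are invented for illustration:

```python
# Pairwise post-hoc t-tests with a Bonferroni-corrected alpha
from itertools import combinations
from scipy import stats

samples = {
    "US": [102, 98, 110, 105, 95],
    "CA": [99, 104, 97, 108, 100],
    "IT": [88, 91, 84, 89, 90],
}

pairs = list(combinations(samples, 2))
alpha = 0.05 / len(pairs)   # Bonferroni: split alpha across the comparisons

for a, b in pairs:
    t, p = stats.ttest_ind(samples[a], samples[b])
    verdict = "significant" if p < alpha else "not significant"
    print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f} ({verdict})")
```

With three groups there are three pairwise comparisons, so each is tested at .05 / 3 rather than .05, keeping the family-wise error rate at roughly .05.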

Types of Research Questions the ANOVA Examines

One-way ANOVA: Are there differences in GPA by grade level (freshmen vs. sophomores vs. juniors)?

Two-way ANOVA: Are there differences in GPA by grade level (freshmen vs. sophomores vs. juniors) and gender (male vs. female)?

Data Level and Assumption

The variables' data levels and the test assumptions play an important role in ANOVA. In ANOVA, the dependent variable must be at a continuous (interval or ratio) level of measurement. The independent variables (sometimes called factor variables) should be categorical (nominal level) variables. Like the t-test, ANOVA is a parametric test and carries some assumptions: the data should be normally distributed; the variance among the groups should be approximately equal (the homogeneity assumption); and the observations should be independent of each other. Researchers should also keep in mind when planning any study to look out for extraneous and confounding variables; ANOVA has extensions (e.g., ANCOVA) to control for these types of undesirable variable effects.

Testing of the Assumptions

1. Normality: the populations from which the samples are drawn should be normally distributed.
2. Independence of cases: the sample cases should be independent of each other.
3. Homogeneity of variance: the variance among the groups should be approximately equal.

These assumptions can be tested using statistical software. The assumption of homogeneity of variance can be tested using Levene's test or the Brown-Forsythe test. Normality of the distribution of scores can be assessed with plots, the values of skewness and kurtosis, or tests such as Shapiro-Wilk or Kolmogorov-Smirnov. The assumption of independence can be determined from the design of the study.
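
Both of the testable assumptions are available in scipy. A minimal sketch, again on invented IQ data (with real data the Shapiro-Wilk test would typically be run per group or on residuals):

```python
# Checking ANOVA assumptions: Levene (homogeneity) and Shapiro-Wilk (normality)
from scipy import stats

us = [102, 98, 110, 105, 95, 101]
canada = [99, 104, 97, 108, 100, 103]
italy = [96, 101, 94, 99, 105, 98]

# H0: group variances are equal; a small p-value signals heterogeneity
lev_stat, lev_p = stats.levene(us, canada, italy)

# H0: the scores are normally distributed
sw_stat, sw_p = stats.shapiro(us + canada + italy)

print(f"Levene: p = {lev_p:.3f}   Shapiro-Wilk: p = {sw_p:.3f}")
```

Large p-values here are good news: they give no evidence that either assumption is violated.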

It is important to note that ANOVA is not robust to violations of the assumption of independence. That is, while you can often violate the assumptions of homogeneity or normality, conduct the test anyway, and largely trust the findings, ANOVA results cannot be trusted when the independence assumption is violated. In general, with violations of homogeneity the analysis can probably carry on if you have equal-sized groups. With violations of normality, continuing with the ANOVA should be fine if you have a large sample size and equal-sized groups.

Effect Size

When conducting an ANOVA it is always important to calculate the effect size. The effect size tells you the practical importance of the differences. Effect sizes can be categorized as small, medium, or large. Cohen cited effect sizes below .30 as small, .30 to .49 as moderate, and .50 and greater as large.
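
One common effect-size measure for ANOVA is eta squared: the proportion of total variance in the dependent variable attributable to group membership. A minimal sketch with illustrative data, reusing the between- and total-sum-of-squares quantities from the omnibus test:

```python
# Eta squared: SS_between / SS_total, the share of variance explained by groups
import numpy as np

groups = [np.array([102, 98, 110, 105]),
          np.array([99, 104, 97, 108]),
          np.array([88, 91, 84, 89])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = ((all_scores - grand_mean) ** 2).sum()

eta_squared = ss_between / ss_total
print(f"eta squared = {eta_squared:.3f}")
```

Eta squared is always between 0 and 1; a value of .40, for example, would mean group membership accounts for 40% of the variance in scores.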

Things to consider when running an ANOVA

The type of study conducted needs to take several considerations into account. The researcher should know whether the data are arranged as crossed or nested. When the data are crossed, all groups receive all aspects of the treatment or intervention. That is, if you were researching teacher effectiveness through three methods, all of the teachers in the study would receive all of the methods. By contrast, in a nested design one set of teachers would receive method one, another set would receive method two, and a third set would receive method three; not all teachers would get all of the methods.

Related Statistical Tests: MANOVA and ANCOVA

Researchers have extended ANOVA into MANOVA and ANCOVA. MANOVA stands for multivariate analysis of variance and is used when there are two or more dependent variables. ANCOVA stands for analysis of covariance and is used when the researcher includes one or more covariate variables in the analysis.





To Reference this Page:  Statistics Solutions. (2013). ANOVA [WWW Document]. Retrieved from