# ANOVA (Analysis of Variance)

ANOVA, short for analysis of variance, is a statistical method developed by Ronald Fisher in 1918 as an extension of the t-test and the z-test. Before ANOVA, the t-test and z-test were commonly used, but the t-test cannot compare more than two groups at once. Fisher's test, also called the Fisher analysis of variance, compares the variance between groups with the variance within groups and is used whenever there are more than two groups. The alternative, running many t-tests, inflates the Type I error rate. If you set the Type I error rate at .05 and have several groups, each comparison of one mean against another carries a .05 probability of a Type I error. With six t-tests, the probability of making at least one Type I error can be as high as 0.30 (.05 × 6), and is about 0.26 if the tests are independent. This is much higher than the desired .05.
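The arithmetic behind this inflation can be checked directly. The sketch below assumes four groups (which gives six pairwise t-tests) and assumes the tests are independent. Under that assumption the chance of at least one Type I error is 1 − (1 − α)^m, and α × m is only an upper bound.

```python
from math import comb

alpha = 0.05
groups = 4                      # assumption: four groups give six pairwise tests
tests = comb(groups, 2)         # number of pairwise comparisons

# Probability of at least one Type I error, assuming independent tests
familywise = 1 - (1 - alpha) ** tests

# Simple upper bound (alpha times the number of tests), capped at 1
upper_bound = min(1.0, alpha * tests)

print(f"{tests} tests: familywise error = {familywise:.3f}, upper bound = {upper_bound:.2f}")
```

Either way, the error rate ends up several times larger than the .05 the researcher intended.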

ANOVA provides a way to test several group means at the same time, through a single null hypothesis that all of the means are equal.

The logic behind this procedure has to do with how much variance there is in the population. It is likely the researcher will not know the actual variance in the population, but it can be estimated by sampling and calculating the variance in the sample. You then compare the differences among the samples to see whether they are the same or statistically different, while still accounting for sampling error.
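This comparison is usually summarized as the F-ratio: the variance between the group means divided by the variance within the groups. The following is a minimal sketch of that calculation for a one-way design. The function name `f_ratio` and the three small samples are hypothetical, chosen only to keep the numbers easy to follow.

```python
from statistics import mean

def f_ratio(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_obs = [y for g in groups for y in g]
    grand_mean = mean(all_obs)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

    df_between = len(groups) - 1
    df_within = len(all_obs) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical samples from three groups
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = f_ratio(samples)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F-ratio means the group means differ by more than the within-group sampling error would suggest.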

General Purpose of ANOVA

Researchers now use ANOVA in many ways, depending on the research design. The three most common forms are one-way ANOVA, two-way ANOVA, and N-way ANOVA.

One-Way:

When we compare more than two groups based on one factor (independent variable), this is called one-way ANOVA. For example, a manufacturing company can use it to compare the productivity of three or more groups of employees defined by working hours.

Two-Way:

When a company wants to compare employee productivity based on two factors (two independent variables), the analysis is said to be a two-way (factorial) ANOVA. For example, a company can use two-way ANOVA to compare employee productivity based on both working hours and working conditions. Two-way ANOVA can be used to see the effect of one factor after controlling for the other, or to see the INTERACTION between the two factors. This is a great way to control for extraneous variables, as you are able to add them to the design of the study.
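To make the two main effects and the interaction concrete, here is a minimal sketch of a two-way ANOVA computed by hand. It assumes equal cell sizes with at least two observations per cell. The function name `two_way_anova`, the factor levels, and the productivity scores are all hypothetical.

```python
from statistics import mean

def two_way_anova(cells):
    """F-ratios for a two-way ANOVA with equal cell sizes.

    cells maps (level_a, level_b) to a list of observations.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    sizes = {len(obs) for obs in cells.values()}
    if len(sizes) != 1 or len(cells) != len(levels_a) * len(levels_b):
        raise ValueError("this sketch needs equal cell sizes and every combination of levels")
    n = sizes.pop()
    grand = mean([y for obs in cells.values() for y in obs])

    def marginal_mean(index, level):
        # Mean of all observations at one level of one factor
        return mean([y for key, obs in cells.items() if key[index] == level for y in obs])

    ss_a = n * len(levels_b) * sum((marginal_mean(0, a) - grand) ** 2 for a in levels_a)
    ss_b = n * len(levels_a) * sum((marginal_mean(1, b) - grand) ** 2 for b in levels_b)
    ss_cells = n * sum((mean(obs) - grand) ** 2 for obs in cells.values())
    ss_ab = ss_cells - ss_a - ss_b          # what the main effects leave unexplained
    ss_within = sum((y - mean(obs)) ** 2 for obs in cells.values() for y in obs)

    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    ms_within = ss_within / (len(cells) * (n - 1))
    return {
        "F_A": (ss_a / df_a) / ms_within,
        "F_B": (ss_b / df_b) / ms_within,
        "F_AB": (ss_ab / (df_a * df_b)) / ms_within,
    }

# Hypothetical productivity scores: working hours x working conditions
scores = {
    ("short", "office"): [10, 12],
    ("short", "remote"): [14, 16],
    ("long", "office"): [20, 22],
    ("long", "remote"): [30, 32],
}
result = two_way_anova(scores)
print(result)
```

`F_A` and `F_B` correspond to the two main effects, and `F_AB` to the interaction between working hours and working conditions.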

Factorial ANOVA can be balanced or unbalanced. That is, you can have the same number of subjects in each group (balanced) or not (unbalanced). Depending on the study, an unbalanced design can simply reflect the population, or it can result from an unwanted event such as participants not returning to the study. Not having equal-sized groups can make it appear that there is an effect when this may not be the case. There are several procedures a researcher can use to solve this problem:

• Conduct a special kind of ANOVA which can deal with the unbalanced design

There are three types of ANOVA that can handle an unbalanced design. These are the Classical Experimental design (Type 2 analysis), the Hierarchical approach (Type 1 analysis), and the Full Regression approach (Type 3 analysis). Which approach to use depends on whether the data were unbalanced on purpose.

- If the data are unbalanced because this reflects the population and was intended, use the Full Regression approach (Type 3).

- If the data were not intended to be unbalanced but you can argue for some type of hierarchy between the factors, use the Hierarchical approach (Type 1).

- If the data were not intended to be unbalanced and you cannot find any hierarchy, use the Classical Experimental approach (Type 2).

N-Way:

When more than two factors are compared, the analysis is said to be an n-way ANOVA, where n is the number of factors. For example, if a company includes all of the factors it tracks when measuring productivity, the analysis is an n-way ANOVA.

ANOVA is used very commonly in business, medicine, and psychology research. In business, ANOVA can be used to compare the sales of different designs based on different factors. A psychology researcher can use ANOVA to compare attitudes or behaviors in people and to see whether they are the same depending on certain factors. In medical research, ANOVA is used to test the effectiveness of a drug.

General Purpose - Procedure

Omnibus ANOVA test:

In an ANOVA, a researcher first sets up the null and alternative hypotheses. The null hypothesis assumes that there is no significant difference between the groups. The alternative hypothesis assumes that there is a significant difference between the groups. After cleaning the data, the researcher must test the assumptions of ANOVA (normality, homogeneity of variance, and independence) and see whether the data meet them. They must then carry out the necessary calculations to obtain the F-ratio. After this, the researcher compares the calculated F-ratio with the critical (table) value, or simply compares the p value with the established alpha. If the calculated value is greater than the table value, the null hypothesis is rejected in favor of the alternative hypothesis, and we conclude that the means of the groups are not all equal. If the calculated value is less than the table value, we fail to reject the null hypothesis. Rejecting the null hypothesis tells you that there is a difference in what you were testing, but not WHERE the difference is. That is, if the researcher was testing several groups against one another, they would know that there is a difference between the means of the groups but not which individual groups are different. To know where the difference lies, further testing must be done.
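The decision step can be sketched as follows. The F-ratio and degrees of freedom are hypothetical results for three groups of three cases each, and 5.14 is the usual table value for F(2, 6) at alpha = .05. The p value uses a closed form that holds only when the numerator degrees of freedom equal 2 (that is, three groups); for other designs you would rely on statistical software or a table instead.

```python
def p_value_df1_two(f_stat, df_within):
    """Upper-tail p value of F; valid only when the numerator df is 2."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)

alpha = 0.05
f_stat, df_between, df_within = 13.0, 2, 6   # hypothetical ANOVA results
f_critical = 5.14                            # table value for F(2, 6) at alpha = .05

# Route 1: compare the calculated F-ratio with the table value
reject_by_table = f_stat > f_critical

# Route 2: compare the p value with the established alpha
p_value = p_value_df1_two(f_stat, df_within)
reject_by_p = p_value < alpha

print(f"F = {f_stat:.2f}, critical = {f_critical}, p = {p_value:.4f}")
print("Reject the null hypothesis:", reject_by_table and reject_by_p)
```

Both routes lead to the same decision; the p value simply states how far into the tail the calculated F-ratio falls.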

So what if you find statistical significance?  Multiple comparison tests

When you run an ANOVA, you are attempting to determine whether there is a statistically significant difference between the groups that is not related to sampling error. If you find that there is a difference, you will then need to see between which of the groups the difference lies. That is, all groups might be different, or perhaps only one of four groups is statistically different from the others.

At this point you could run several t-tests to compare the means between the groups, but this would not control for error, as you would again be testing several hypotheses at the same time. There are several multiple comparison tests that will control the Type I error rate.

• If you are concerned about violations of the assumptions, use Scheffé's Test.
• If you are not concerned about violations of the assumptions and are testing compound and pairwise comparisons, use Dunn's Test or the modified Bonferroni Test.
• If you are not concerned about violations of the assumptions and are just comparing the treatment to the control, use Dunnett's Test.
• If you are not concerned about violations of the assumptions and are comparing all possible pairs, use Tukey's Test or the modified Tukey's Test.
• If you are not concerned about violations of the assumptions and are testing more than half of the possible pairwise comparisons, again use Tukey's Test or the modified Tukey's Test.
• If you are not concerned about violations of the assumptions and are testing less than half of the possible pairwise comparisons, use Dunn's Test or the modified Bonferroni Test.

All of these tests will ensure that the Type I error rate remains under control as was established by the researcher and will tell you exactly which groups are different from one another.
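As a rough illustration of how such a correction works, the sketch below applies the plain Bonferroni adjustment (not the modified version named above): the alpha is divided by the number of pairwise comparisons. The group names echo the GPA example, and the unadjusted p values are hypothetical.

```python
from itertools import combinations

alpha = 0.05
groups = ["freshmen", "sophomores", "juniors"]
pairs = list(combinations(groups, 2))       # every pairwise comparison

# Hypothetical unadjusted p values from the pairwise tests
raw_p = {
    ("freshmen", "sophomores"): 0.030,
    ("freshmen", "juniors"): 0.004,
    ("sophomores", "juniors"): 0.200,
}

# Plain Bonferroni: each test is held to alpha divided by the number of tests
adjusted_alpha = alpha / len(pairs)
significant = [pair for pair in pairs if raw_p[pair] < adjusted_alpha]

print(f"adjusted alpha = {adjusted_alpha:.4f}")
print("pairs that differ:", significant)
```

Note that the first comparison would have passed an unadjusted .05 threshold but does not survive the correction.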

One-way ANOVA: Are there differences in GPA by grade level (freshmen vs. sophomores vs. juniors)?

Are there differences in the profit of the Fortune 500 companies by highest educational degree attained by the CEO (B.A./B.S vs. master’s vs. doctorate)?

Two-way ANOVA: Are there differences in GPA by grade level (freshmen vs. sophomores vs. juniors) and school district (district one vs. district two)?

Are there differences in the profit of the Fortune 500 companies by highest educational degree attained by the CEO (B.A./B.S vs. master’s vs. doctorate) and number of employees (<100 vs. 100 - 500 vs. > 500)?

Data Level and Assumption

Data level and assumptions play a very important role in ANOVA. The dependent variable should be continuous (interval or ratio scale), and the factor variables should be categorical. Like the t-test, ANOVA is a parametric test and has assumptions that should be met to get valid results. ANOVA assumes that the data are normally distributed. It also assumes homogeneity of variance, which means that the variance within each group should be equal across the groups. Finally, ANOVA assumes that the cases are independent of each other, so there should be no pattern between the cases. As usual, when planning any study, extraneous and confounding variables need to be considered. ANOVA offers a way to control for these types of undesirable variables by including them in the design.

Testing of the Assumptions

1. Normality: the population from which the samples are drawn should be normally distributed.
2. Independence of cases: the sample cases should be independent of each other.
3. Homogeneity: the variances of the groups should be approximately equal.

These assumptions can be tested using statistical software. The assumption of homogeneity of variance can be tested using tests such as Levene's test or the Brown-Forsythe test. Normality of the distribution of the population can be tested using plots, the values of skewness and kurtosis, or tests such as Shapiro-Wilk or Kolmogorov-Smirnov. The assumption of independence can be determined from the design of the study.
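For intuition about what the homogeneity tests measure, here is a minimal sketch of the Levene statistic. It is a one-way ANOVA F-ratio computed on the absolute deviations of each observation from its group center. Using the group mean gives Levene's original test, and using the median gives the Brown-Forsythe variant. The function name and the data are hypothetical, and the sketch stops at the statistic rather than the p value.

```python
from statistics import mean, median

def levene_statistic(groups, center=mean):
    """F-ratio of a one-way ANOVA on absolute deviations from each group center."""
    deviations = [[abs(y - center(g)) for y in g] for g in groups]
    all_dev = [z for d in deviations for z in d]
    grand = mean(all_dev)

    ss_between = sum(len(d) * (mean(d) - grand) ** 2 for d in deviations)
    ss_within = sum((z - mean(d)) ** 2 for d in deviations for z in d)
    df_between = len(groups) - 1
    df_within = len(all_dev) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical data: same spread in every group vs. clearly different spreads
equal_spread = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
unequal_spread = [[1, 2, 3], [0, 3, 6]]

w_equal = levene_statistic(equal_spread, center=median)      # Brown-Forsythe variant
w_unequal = levene_statistic(unequal_spread, center=median)
print(f"equal spread: W = {w_equal:.2f}, unequal spread: W = {w_unequal:.2f}")
```

The statistic stays near zero when the groups have similar spread and grows as the spreads diverge, regardless of how far apart the group means are.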

It is important to note that ANOVA is not robust to violations of the assumption of independence. That is, even if you violate the assumptions of homogeneity or normality, there are statistical procedures that will still enable you to conduct the ANOVA, but this is not the case for violations of independence. In general, with violations of homogeneity the study can probably carry on if you have equal-sized groups. With violations of normality, continuing with the ANOVA should be acceptable if you have a large sample size and equal-sized groups.

Effect Size

When conducting an ANOVA, it is always important to calculate the effect size. The effect size can tell you the degree to which the null hypothesis is false. Effect sizes can be considered small, medium, or large. A medium effect size is one that is noticeable to the layperson's eye. Please refer to Cohen's tables to determine what the value for a small, medium, or large effect size would be for an ANOVA. If from running an ANOVA you determine that you do not have statistically significantly different groups, but you have a large effect size, you might want to rerun this ANOVA with a larger sample size. A large effect size without statistical significance could be an indication that significance can be reached with a larger sample.
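Two common effect size measures for a one-way ANOVA are eta squared (the share of total variation explained by group membership) and Cohen's f, which is derived from it. The sketch below computes both for hypothetical data. The function name is an assumption, and the text above does not prescribe a particular measure. The conventions usually attributed to Cohen treat f values of roughly 0.10, 0.25, and 0.40 as small, medium, and large.

```python
from math import sqrt
from statistics import mean

def eta_squared(groups):
    """Proportion of total variation explained by group membership."""
    all_obs = [y for g in groups for y in g]
    grand = mean(all_obs)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_total = sum((y - grand) ** 2 for y in all_obs)
    return ss_between / ss_total

# Hypothetical samples from three groups
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
eta2 = eta_squared(samples)
cohens_f = sqrt(eta2 / (1 - eta2))   # Cohen's f derived from eta squared

print(f"eta squared = {eta2:.4f}, Cohen's f = {cohens_f:.2f}")
```

Tiny illustrative samples like this one tend to produce unrealistically large effect sizes; real data will usually give much smaller values.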

Things to consider when running an ANOVA

Several aspects of the analysis need to be taken into consideration when deciding on the type of study to be conducted. The researcher should know whether the data are set up as Crossed or Nested. When the data are Crossed, every group receives every treatment. That is, if you were researching teacher effectiveness through three methods, all of the teachers in the study would receive all of the methods. By contrast, in a Nested design certain teachers would receive method one, another set would receive method two, and a third, different set would receive method three. Not all teachers would get all of the methods.
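One way to check which layout a data set follows is to list the methods each teacher received. The helper below is a hypothetical sketch: it labels a design as crossed when every teacher appears with every method, nested when each teacher appears with exactly one method, and mixed otherwise.

```python
def design_type(assignments):
    """Classify (teacher, method) pairs as 'crossed', 'nested', or 'mixed'."""
    assignments = list(assignments)
    methods = {method for _, method in assignments}

    # Collect the set of methods each teacher received
    by_teacher = {}
    for teacher, method in assignments:
        by_teacher.setdefault(teacher, set()).add(method)

    if all(seen == methods for seen in by_teacher.values()):
        return "crossed"
    if all(len(seen) == 1 for seen in by_teacher.values()):
        return "nested"
    return "mixed"

# Hypothetical layouts for the teacher-effectiveness example
crossed = [(t, m) for t in ["T1", "T2"] for m in ["one", "two", "three"]]
nested = [("T1", "one"), ("T2", "one"), ("T3", "two"), ("T4", "three")]

print(design_type(crossed), design_type(nested))
```

Knowing the layout up front matters because crossed and nested factors call for different ANOVA models.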

Related Statistical Tests

Researchers have extended ANOVA into MANOVA and ANCOVA. MANOVA stands for multivariate analysis of variance and is used when there are two or more dependent variables. ANCOVA stands for analysis of covariance and is used when the researcher includes one or more covariates alongside the independent variables.

