Odd One Out: Understanding Outliers

Outliers are values that are abnormally distant from most other values in your dataset. While they can give you valuable insights, they also affect assumption testing and inferential statistics. The removal of outliers is a controversial topic, but most parametric analyses are particularly sensitive to outliers that may unduly influence results.

For example, suppose you were measuring the temperature of several rooms. One room was 75°F, and the others were 72°F, 78°F, and 74°F. The mean room temperature would be 74.75°F. However, suppose you also measured an industrial kitchen at an uncomfortable 110°F. That would move the mean up to 81.80°F. Or perhaps you accidentally entered a value of 10 instead of 110 for that industrial kitchen. That would move the mean down to 61.80°F! Even though the majority of rooms in your sample were comfortably in the 70s, the one extreme value pushed the mean to a level that does not really represent your sample.
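The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the variable names are illustrative and not part of any particular statistics package.

```python
from statistics import mean

# Room temperatures in °F, taken from the example above.
typical_rooms = [75, 72, 78, 74]

mean_typical = mean(typical_rooms)                # the four ordinary rooms
mean_with_kitchen = mean(typical_rooms + [110])   # add the 110°F industrial kitchen
mean_with_typo = mean(typical_rooms + [10])       # same room, mistyped as 10°F

print(f"Four rooms:          {mean_typical:.2f}")       # 74.75
print(f"With kitchen (110):  {mean_with_kitchen:.2f}")  # 81.80
print(f"With typo (10):      {mean_with_typo:.2f}")     # 61.80
```

In both cases a single value moves the mean by roughly 7 to 13 degrees, even though the other four rooms have not changed.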

Outliers can be assessed visually through scatterplots, boxplots, or histograms. Another preferred way to detect outliers is to create standardized (z) scores for the variables of interest and then examine those scores. Any score more than 3.29 standard deviations above or below the mean (that is, beyond ±3.29) indicates an outlying value (Tabachnick & Fidell, 2013).
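The z-score screen described above might be sketched as follows. The function name `flag_outliers` and the 30-room dataset are hypothetical, made up for illustration. A larger sample is assumed here because with only five rooms, as in the earlier example, no z-score based on the sample standard deviation can reach 3.29.

```python
from statistics import mean, stdev

Z_CUTOFF = 3.29  # cutoff cited from Tabachnick & Fidell (2013)

def flag_outliers(values, cutoff=Z_CUTOFF):
    """Return (value, z) pairs whose standardized score exceeds the cutoff.

    Hypothetical helper for illustration; uses the sample standard deviation.
    """
    m = mean(values)
    s = stdev(values)
    if s == 0:
        return []  # all values identical, so nothing can be an outlier
    z_scores = [(v, (v - m) / s) for v in values]
    return [(v, z) for v, z in z_scores if abs(z) > cutoff]

# Hypothetical sample: 29 rooms in the 70s plus one 110°F industrial kitchen.
temps = [72, 74, 75, 78] * 7 + [73, 110]

flagged = flag_outliers(temps)
for value, z in flagged:
    print(f"{value}°F has z = {z:.2f}")
```

With this made-up sample only the 110°F reading is flagged. The flagged values still need to be inspected by hand before any decision about dropping them is made.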

Once you have detected your outliers, you should examine them for data entry or measurement errors, as in the case of the 10°F industrial kitchen. All outliers due to entry or measurement errors should be dropped. If an outlier is not obviously an error, you should examine whether its presence creates a significant association, or whether it leaves significance unchanged but alters the results of assumption testing. If it does either of those things, you may consider dropping the outlier.

References:

Tabachnick, B. G., & Fidell, L. S. (2013). Using Multivariate Statistics, 6th ed. Boston: Allyn and Bacon.