# Autocorrelation

Autocorrelation refers to the degree of correlation between values of the same variable across different observations in the data.  The concept is most often discussed in the context of time series data, in which observations occur at different points in time (e.g., air temperature measured on different days of the month).  For example, one might expect the air temperature on the 1st day of the month to be more similar to the temperature on the 2nd day than to the temperature on the 31st day.  If temperature values that occur closer together in time are, in fact, more similar than temperature values that occur farther apart in time, the data are autocorrelated.
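As a rough illustration of the temperature example, the sketch below computes the usual sample autocorrelation at a given lag using only the Python standard library.  The function name `lag_autocorrelation` and the `temps` values are hypothetical, made up for illustration rather than taken from real measurements.

```python
def lag_autocorrelation(values, lag=1):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = sum(values) / n
    # Total variation around the mean (denominator of the estimator).
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Co-variation between each value and the value `lag` steps later.
    numer = sum(
        (values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag)
    )
    return numer / denom


# Hypothetical daily temperatures that drift upward over ten days.
temps = [10.0, 11.0, 12.5, 13.0, 14.5, 15.0, 16.5, 17.0, 18.5, 19.0]

r1 = lag_autocorrelation(temps, lag=1)  # neighbouring days
r5 = lag_autocorrelation(temps, lag=5)  # days five apart
print(f"lag 1: {r1:.3f}, lag 5: {r5:.3f}")
```

For this made-up series the lag-1 value is clearly positive and larger than the lag-5 value, which matches the intuition that days closer together are more alike.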

However, autocorrelation can also occur in cross-sectional data when the observations are related in some other way.  In a survey, for instance, one might expect people from nearby geographic locations to give answers that are more similar to each other than to the answers of people who live farther away.  Similarly, students from the same class might perform more similarly to each other than students from different classes.  Thus, autocorrelation can occur whenever observations are dependent in ways other than time.  Autocorrelation can cause problems in conventional analyses (such as ordinary least squares regression) that assume independence of observations.

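In a regression analysis, autocorrelation of the regression residuals can also occur if the model is incorrectly specified.  For example, if you are attempting to model a simple linear relationship but the observed relationship is non-linear (e.g., it follows a curved or U-shaped function), then the straight line will systematically over-predict in some regions of the predictor and under-predict in others.  As a result, residuals for neighboring observations will tend to share the same sign, and the residuals will be autocorrelated.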
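The misspecification case can be sketched in a few lines of plain Python: fit a straight line by least squares to data that are actually U-shaped, then look at the signs of the residuals in predictor order.  The helper `fit_line` and the toy data are assumptions made for illustration only.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data: the true relationship is U-shaped (y = x^2), not linear.
x = [float(i) for i in range(-5, 6)]
y = [xi ** 2 for xi in x]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# One character per observation, ordered by x: '+' above the line, '-' below.
signs = "".join("+" if r > 0 else "-" for r in residuals)
print(signs)
```

The residuals come out positive at both ends of the range and negative through the middle, so their signs appear in long runs rather than alternating at random.  That pattern is what autocorrelated residuals from a misspecified model tend to look like.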

## How to Detect Autocorrelation

A common method of testing for autocorrelation is the Durbin-Watson test, which checks for first-order autocorrelation in the residuals of a regression.  Statistical software such as SPSS may include the option of running the Durbin-Watson test when conducting a regression analysis.  The Durbin-Watson test produces a test statistic that ranges from 0 to 4.  Values close to 2 (the middle of the range) suggest little or no autocorrelation, values closer to 0 indicate stronger positive autocorrelation, and values closer to 4 indicate stronger negative autocorrelation.
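The statistic itself is simple to compute: it is the sum of squared differences between successive residuals divided by the sum of squared residuals, which is roughly 2(1 − r), where r is the lag-1 autocorrelation of the residuals.  The following is a minimal sketch in plain Python; the function name `durbin_watson` and the two residual series are hypothetical examples, and the sketch returns only the statistic, not the critical values needed for a formal test.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals given in observation order."""
    denom = sum(e * e for e in residuals)
    if denom == 0:
        raise ValueError("statistic is undefined when all residuals are zero")
    # Squared changes between each residual and the one before it.
    numer = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    return numer / denom


# Hypothetical residuals that drift steadily: positive autocorrelation.
trending = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
# Hypothetical residuals that flip sign every step: negative autocorrelation.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]

dw_trending = durbin_watson(trending)
dw_alternating = durbin_watson(alternating)
print(f"trending: {dw_trending:.2f}, alternating: {dw_alternating:.2f}")
```

The drifting series gives a value well below 2 and the alternating series a value well above 2, in line with the interpretation described above.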