# Autocorrelation

Autocorrelation is a characteristic of data in which values of the same variable are correlated across related observations.  It violates the assumption of instance independence, which underlies most conventional models.  It generally arises in datasets where the observations, rather than being randomly sampled, come from the same source.

## Presence

Researchers often do not expect autocorrelation in their data.  It arises mostly from dependencies within the data.  Its presence is a strong motivation for researchers interested in relational learning and inference.

## Examples

Autocorrelation can be illustrated with both cross-sectional and time series data.  In cross-sectional data, if a change in the income of person A affects the savings of a different person B, then autocorrelation is present.  In time series data, if the observations are correlated with one another, especially when the time intervals between them are small, that correlation is called autocorrelation.

In time series data, autocorrelation is defined as the lagged correlation of a series with itself: the series is correlated with a copy of itself delayed by some specific number of time units.  Serial correlation is commonly used as another name for the same thing.  A lagged correlation between two different series, by contrast, is usually called cross-correlation.
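The lagged correlation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the helper name `autocorr` and the sine-wave series are assumptions made for the example, and the estimator shown (lagged covariance divided by the overall variance) is one common convention among several.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of a 1-D series at a positive integer lag."""
    x = np.asarray(x, dtype=float)
    if not 0 < lag < len(x):
        raise ValueError("lag must satisfy 0 < lag < len(x)")
    d = x - x.mean()  # deviations from the series mean
    # Lagged covariance of the series with itself, scaled by its variance.
    return float(np.sum(d[lag:] * d[:-lag]) / np.sum(d * d))

# Hypothetical series: a sine wave with a period of 50 time units.
t = np.arange(200)
series = np.sin(2 * np.pi * t / 50)

r_lag1 = autocorr(series, 1)    # adjacent values are similar: strongly positive
r_lag25 = autocorr(series, 25)  # half a period apart: strongly negative
print(r_lag1, r_lag25)
```

Varying `lag` and plotting the result against the lag gives the usual autocorrelation plot (correlogram) for the series.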

## Patterns

When the data or the residual errors are plotted, autocorrelation shows up as recognizable patterns in the curve: for example, a discernible pattern among the residual errors, or a cyclical pattern of upward and downward movement.

In time series, it generally occurs due to sluggishness or inertia within the data.  Choosing an incorrect functional form for the model, a mistake that is more likely for researchers new to time series data, can also cause autocorrelation.

Data handling that involves extrapolation or interpolation can also give rise to autocorrelation.  When working with time series data, one should therefore make the data stationary (for example, by differencing) in order to remove this autocorrelation.
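As a rough illustration of how making a series stationary can reduce autocorrelation, the sketch below simulates a random walk and takes its first difference. First differencing is only one common way to obtain a stationary series and is an assumption here, since the text does not name a specific method. The helper `lag1_corr`, the seed, and the series length are likewise illustrative choices.

```python
import numpy as np

def lag1_corr(x):
    """Pearson correlation between a series and itself shifted by one step."""
    x = np.asarray(x, dtype=float)
    return float(np.corrcoef(x[1:], x[:-1])[0, 1])

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
steps = rng.normal(size=1000)    # independent shocks
walk = np.cumsum(steps)          # random walk: a non-stationary series

diffed = np.diff(walk)           # first difference recovers the shocks

r_walk = lag1_corr(walk)         # typically very close to 1
r_diff = lag1_corr(diffed)       # typically close to 0
print(r_walk, r_diff)
```

The random walk inherits almost all of each value from the previous one, so its lag-1 correlation is high. After differencing, only the independent shocks remain.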

Autocorrelation is a matter of degree and of sign: it can be positive or negative.  If a series (such as an economic series) shows sustained upward or downward runs, so that successive values tend to move in the same direction, then the series exhibits positive autocorrelation.  If, on the other hand, the series constantly alternates between upward and downward movements, then the series exhibits negative autocorrelation.
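The difference between the two signs can be seen by simulating a simple first-order autoregressive process, in which each value is a multiple of the previous value plus fresh noise. The AR(1) model, the coefficients 0.8 and -0.8, and the function names below are assumptions chosen for illustration and do not come from the text above.

```python
import numpy as np

def simulate_ar1(phi, n, rng):
    """Simulate x[t] = phi * x[t-1] + noise[t], starting from x[0] = noise[0]."""
    noise = rng.normal(size=n)
    x = np.empty(n)
    x[0] = noise[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

def lag1_corr(x):
    """Pearson correlation between a series and itself shifted by one step."""
    return float(np.corrcoef(x[1:], x[:-1])[0, 1])

rng = np.random.default_rng(42)
smooth = simulate_ar1(0.8, 2000, rng)    # values tend to follow their predecessor
zigzag = simulate_ar1(-0.8, 2000, rng)   # values tend to flip sign each step

r_smooth = lag1_corr(smooth)   # positive autocorrelation
r_zigzag = lag1_corr(zigzag)   # negative autocorrelation
print(r_smooth, r_zigzag)
```

Plotting `smooth` shows long runs above or below zero, while `zigzag` jumps back and forth across zero from one observation to the next.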

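When ordinary least squares (OLS) is applied in the presence of autocorrelation, the resulting estimator is inefficient: it no longer has the smallest variance among linear unbiased estimators, and the usual standard errors become unreliable.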

## Detecting the Presence

A popular test for detecting autocorrelation is the Durbin-Watson test.  If the researcher detects autocorrelation in the data, the first step is to determine whether or not it is pure, that is, whether it is inherent in the error term rather than a symptom of a misspecified model.  If it is pure, the original model can be transformed into one that is free from pure autocorrelation.
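The Durbin-Watson statistic itself is simple to compute from regression residuals: it is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below computes it by hand with NumPy on simulated data. The straight-line model, the AR(1) error coefficient of 0.7, and all variable names are assumptions made for the example. The code computes only the statistic, not the critical values needed for a formal test.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e[t] - e[t-1])^2) / sum(e[t]^2)."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

rng = np.random.default_rng(1)
n = 1000
x = np.linspace(0.0, 10.0, n)

# Illustrative errors with positive first-order autocorrelation.
shocks = rng.normal(size=n)
errors = np.empty(n)
errors[0] = shocks[0]
for t in range(1, n):
    errors[t] = 0.7 * errors[t - 1] + shocks[t]

y = 1.0 + 2.0 * x + errors

# OLS fit of a straight line; np.polyfit returns the highest degree first.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

dw_autocorrelated = durbin_watson(residuals)  # well below 2
dw_independent = durbin_watson(shocks)        # close to 2
print(dw_autocorrelated, dw_independent)
```

The statistic ranges from 0 to 4. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. In practice, a library routine such as `durbin_watson` in `statsmodels.stats.stattools` could be used instead of the hand-written helper.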