May 22, 2012

Logistic Regression Assumptions

Logistic regression, by design, avoids many of the restrictive assumptions of linear regression.  For example, it does not assume a linear relationship between the dependent and independent variables, normally distributed errors, or equal variances.

The major assumption is that the outcome is discrete; that is, the dependent variable should be dichotomous.  In the current study the outcome variable is risk of dementia (yes vs. no), which is dichotomous with two levels.  There should also be no outliers in the data.  This can be assessed by converting the independent variables to standardized z scores; cases with an absolute z score of 3.29 or greater can be deleted (Tabachnick and Fidell, 2001).
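A minimal sketch of the z score screen described above, using only the Python standard library. The function name flag_outliers, the sample ages, and the use of the sample standard deviation are assumptions made for illustration; they are not taken from the study.

```python
from statistics import mean, stdev

def flag_outliers(values, cutoff=3.29):
    """Return indices of values whose absolute z score is at or above the cutoff."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator), an assumed choice
    return [i for i, v in enumerate(values) if abs((v - m) / s) >= cutoff]

# Hypothetical data: 29 plausible ages plus one implausible value at index 29.
ages = [70 + (i % 10) for i in range(29)] + [140]
outlier_idx = flag_outliers(ages)
print(outlier_idx)
```

Whether a flagged case is deleted, corrected, or retained remains a judgment for the analyst; the sketch only identifies candidates.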

The absence of multicollinearity will be evaluated by computing correlations among the independent (predictor) variables.  Tabachnick and Fidell (2001) suggest that as long as the correlation coefficients among independent variables are less than 0.90, the assumption is met.
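The correlation screen can be sketched as follows. This is an illustrative example under stated assumptions: the predictor names and values are invented, and the helper names pearson_r and collinear_pairs are hypothetical. Any pair whose absolute Pearson correlation reaches 0.90 is reported.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def collinear_pairs(predictors, threshold=0.90):
    """Return (name_a, name_b, r) for every predictor pair with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Hypothetical predictors; the first two are nearly redundant by construction.
predictors = {
    "age": [65, 70, 72, 80, 85, 90],
    "years_since_retirement": [0, 5, 8, 15, 21, 24],
    "education_years": [12, 16, 10, 14, 12, 18],
}
print(collinear_pairs(predictors))
```

Pairwise correlations only detect redundancy between two variables at a time, so this is a screen rather than a complete multicollinearity diagnostic.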

Also, there should be a linear relationship between the log odds (logit) of the outcome and each independent variable.  Linearity between an ordinal or interval independent variable and the log odds can be checked by creating a new variable that divides the existing independent variable into categories of equal intervals and running the same regression with the newly categorized version entered as a categorical variable.  Linearity is demonstrated if the b coefficients increase or decrease in roughly linear steps (Garson, 2009).
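A rough sketch of the binning check, using only the standard library. It relies on the fact that, in a model with a single categorical predictor, each dummy coefficient equals the difference in log odds between that category and the reference category, so the empirical log odds per bin can be inspected directly instead of fitting the regression. The function name binned_log_odds, the data, the choice of four bins, and the 0.5 smoothing constant are all assumptions for illustration.

```python
from math import log

def binned_log_odds(x, y, n_bins=4):
    """Split x into equal-width bins and return the empirical log odds of y == 1 per bin."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins
    counts = [[0, 0] for _ in range(n_bins)]  # [non-events, events] for each bin
    for xi, yi in zip(x, y):
        b = min(int((xi - lo) / width), n_bins - 1)  # the maximum value goes in the last bin
        counts[b][yi] += 1
    # Adding 0.5 to each cell avoids log(0) for empty cells (an assumed smoothing choice).
    return [log((events + 0.5) / (non_events + 0.5)) for non_events, events in counts]

# Hypothetical data: the event rate rises from 20% to 80% across the range of x.
x = list(range(40))
y = [1 if (xi % 10) < 2 * (xi // 10 + 1) else 0 for xi in x]

log_odds = binned_log_odds(x, y)
steps = [b - a for a, b in zip(log_odds, log_odds[1:])]
print([round(v, 3) for v in log_odds])
print([round(s, 3) for s in steps])
```

Steps of similar size and the same sign are consistent with linearity in the log odds; steps that change sign or vary widely suggest the assumption may not hold. With several predictors in the model, the categorized regression itself would need to be fitted in a statistics package.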

Finally, because logistic regression is fitted by maximum likelihood, a larger sample is recommended; with discrete variables, there must be enough responses in each category.  The sample in the current study is large enough to accommodate this.

References:

Garson, G. D. (2009). Logistic Regression. Retrieved on August 12, 2009 from http://faculty.chass.ncsu.edu/garson/PA765/logistic.htm

Tabachnick, B. G. & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston, MA: Allyn and Bacon.
