Regression

A regression assesses whether predictor variables account for variability in a dependent variable.  This page describes example research questions for regression analysis, the assumptions of the test, the evaluation of the R-square (coefficient of determination), the F-test, the interpretation of the beta coefficient(s), and the regression equation.

Example questions answered by a regression analysis:

Do age and gender predict gun regulation attitudes?

Do the five facets of mindfulness influence peace of mind scores?

Assumptions:

First, regression analysis is sensitive to outliers.  Outliers can be identified by standardizing the scores and checking for standardized values with an absolute value greater than 3.29.  Researchers typically treat such values as outliers and remove them from the data before analysis.
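As an illustration, the following sketch (the data and variable names are invented for the example) standardizes a set of scores with NumPy and flags values beyond the 3.29 cutoff:

    import numpy as np

    # Illustrative data: 200 roughly normal scores with one extreme value injected
    rng = np.random.default_rng(0)
    scores = rng.normal(loc=50, scale=10, size=200)
    scores[0] = 120.0

    # Standardize to z-scores using the sample standard deviation
    z = (scores - scores.mean()) / scores.std(ddof=1)

    # Flag and remove absolute z-scores greater than 3.29
    is_outlier = np.abs(z) > 3.29
    print("Outliers found:", scores[is_outlier])
    cleaned = scores[~is_outlier]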

Second, the main assumptions of regression are normality, homoscedasticity, and the absence of multicollinearity.  Normality can be assessed by examining a normal P-P plot; if the points fall along the diagonal line, normality can be assumed.  To assess homoscedasticity, the researcher can create a scatterplot of standardized residuals versus standardized predicted values.  If the plot shows random scatter, the assumption is met; if the scatter forms a cone shape, the assumption is violated.  Multicollinearity can be assessed by calculating the variance inflation factors (VIFs).  VIF values higher than 10 indicate that multicollinearity may be a problem.
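A sketch of these checks using statsmodels and matplotlib (the data, variable names, and model are invented for illustration):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import matplotlib.pyplot as plt
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    # Simulated predictors and outcome; names are illustrative only
    rng = np.random.default_rng(1)
    X = pd.DataFrame({"age": rng.normal(40, 12, 300),
                      "mindfulness": rng.normal(3.5, 0.6, 300)})
    y = 2.0 + 0.05 * X["age"] + 0.8 * X["mindfulness"] + rng.normal(0, 1, 300)

    Xc = sm.add_constant(X)              # add the intercept column
    model = sm.OLS(y, Xc).fit()

    # Normality: points near the 45-degree line in the P-P plot support normality
    sm.ProbPlot(model.resid).ppplot(line="45")
    plt.show()

    # Homoscedasticity: random scatter supports the assumption; a cone shape does not
    z_pred = (model.fittedvalues - model.fittedvalues.mean()) / model.fittedvalues.std()
    z_resid = model.resid / model.resid.std()
    plt.scatter(z_pred, z_resid)
    plt.xlabel("Standardized predicted values")
    plt.ylabel("Standardized residuals")
    plt.show()

    # Multicollinearity: VIF above 10 flags a potential problem
    for i, name in enumerate(Xc.columns):
        if name != "const":
            print(name, variance_inflation_factor(Xc.values, i))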

F-test

When the regression is conducted, an F-value and its significance level are computed.  If the F-value is statistically significant (typically p < .05), the model explains a significant amount of variance in the outcome variable.
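As a minimal sketch (invented data; statsmodels reports the model F-value and its p-value on the fitted results):

    import numpy as np
    import statsmodels.api as sm

    # Simulated predictor and outcome for illustration
    rng = np.random.default_rng(2)
    x = rng.normal(0, 1, 100)
    y = 0.8 * x + rng.normal(0, 1, 100)

    model = sm.OLS(y, sm.add_constant(x)).fit()

    # The model is significant when the p-value of the F-test is below .05
    print("F =", model.fvalue, "p =", model.f_pvalue)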

Evaluation of the R-Square

When the regression is conducted, an R² statistic (coefficient of determination) is computed.  R² can be interpreted as the proportion of variance in the outcome variable that is explained by the set of predictor variables, often reported as a percentage.

Evaluation of the Adjusted R-Square

The adjusted R² value is a version of R² that is adjusted for the number of predictors in the model; it corrects for the tendency of R² to increase whenever a predictor is added, even one with no real explanatory value.
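A worked sketch of both statistics (the observed and predicted values are invented): R² = 1 − SS_residual / SS_total, and the adjustment penalizes the number of predictors k relative to the sample size n.

    import numpy as np

    # Illustrative observed values and model predictions
    y = np.array([3.0, 4.5, 5.0, 6.5, 8.0])
    y_hat = np.array([3.2, 4.1, 5.3, 6.6, 7.8])
    n, k = len(y), 1                      # n observations, k predictors

    ss_res = np.sum((y - y_hat) ** 2)     # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares

    r2 = 1 - ss_res / ss_tot                       # proportion of variance explained
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # adjusted for k predictors
    print("R-square:", r2, "Adjusted R-square:", adj_r2)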

Beta Coefficients

After the evaluation of the F-value and R², it is important to evaluate the beta coefficients.  Beta coefficients can be negative or positive, and each has an associated t-value and significance level.  The beta coefficient is the degree of change in the outcome variable for every 1-unit change in the predictor variable.  The t-test assesses whether the beta coefficient is significantly different from zero.  When the beta coefficient is significant, examine its sign.

A positive beta coefficient indicates that for every 1-unit increase in the predictor variable, the outcome variable increases by the value of the beta coefficient.  Conversely, a negative beta coefficient indicates that for every 1-unit increase in the predictor variable, the outcome variable decreases by the value of the beta coefficient.  When the beta coefficient is not statistically significant (i.e., the t-value is not significant), the variable does not significantly predict the outcome.  For example, if the beta coefficient is .80 and is statistically significant, then for each 1-unit increase in the predictor variable, the outcome variable increases by .80 units.
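As a sketch of reading these values from a fitted model (invented data; statsmodels reports a coefficient, t-value, and p-value for each predictor):

    import numpy as np
    import statsmodels.api as sm

    # Simulated data with a true slope of roughly .80
    rng = np.random.default_rng(3)
    x = rng.normal(0, 1, 100)
    y = 0.8 * x + rng.normal(0, 1, 100)

    model = sm.OLS(y, sm.add_constant(x)).fit()
    print(model.params)   # intercept and beta coefficient
    print(model.tvalues)  # t-value for each coefficient
    print(model.pvalues)  # significance of each t-test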

Equation

Once the beta coefficient is determined, a regression equation can be written.  Using the example and beta coefficient above, the equation can be written as follows:

y = 0.80x + c, where y is the outcome variable, x is the predictor variable, 0.80 is the beta coefficient, and c is a constant (the intercept).
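To make the interpretation concrete, a small sketch applying the equation (the constant c = 1.5 is an invented value for illustration):

    def predict(x, beta=0.80, c=1.5):
        """Apply y = beta * x + c; beta matches the example, c is illustrative."""
        return beta * x + c

    # A 1-unit increase in x raises the prediction by the beta value, 0.80
    print(predict(10))  # 9.5
    print(predict(11))  # 10.3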