# Multiple Regression

Multiple regression models the relationship between two or more independent (predictor) variables and one dependent (criterion) variable.  The dependent variable is modeled as a function of the independent variables, each with its own coefficient, plus a constant term.  The use of two or more predictor variables is what distinguishes multiple regression from simple regression.

The multiple regression equation takes the following form:

y = b₁x₁ + b₂x₂ + … + bₙxₙ + c

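Here, the bᵢ (i = 1, 2, …, n) are the regression coefficients and c is the constant term.  Each coefficient represents the amount by which the criterion variable changes when the corresponding predictor variable changes by one unit, with the other predictors held constant.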

As an example, suppose that a student's score on an exam depends on several factors, such as focus while attending class, food intake before the exam, and the amount of sleep before the exam.  Multiple regression can be used to estimate the relationship between these factors and the test score.
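The exam-score example can be sketched in code. The following is a minimal illustration, assuming ordinary least squares via NumPy; the function name `fit_multiple_regression`, the synthetic data, and the "true" coefficient values are all made up for demonstration and do not come from any real study.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Return (coefficients b1..bn, constant c) estimated by least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Append a column of ones so that the constant term c is estimated too.
    design = np.column_stack([X, np.ones(len(X))])
    solution, *_ = np.linalg.lstsq(design, y, rcond=None)
    return solution[:-1], solution[-1]

# Synthetic (invented) data: columns are focus, food intake, hours of sleep.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 3))
true_b = np.array([2.0, 0.5, 3.0])   # assumed coefficients for the demo
y = X @ true_b + 40.0 + rng.normal(0, 1.0, size=50)

b, c = fit_multiple_regression(X, y)
print("coefficients:", b)
print("constant:", c)
```

Because the data are generated from known coefficients plus noise, the estimates should land close to the assumed values, which makes the sketch easy to sanity-check.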

Multiple regression in SPSS is run from the menu: select “Analyze,” then “Regression,” and then “Linear.”

Multiple regression rests on the following assumptions:

• The model must be properly specified.  This means that only relevant variables should be included in the model and that the model should be reliable.
• Linearity must be assumed; the relationship between the criterion variable and the predictor variables should be linear.
• Normality must be assumed in multiple regression.  This means that the residuals (the differences between the observed and predicted values) should be normally distributed.
• Homoscedasticity must be assumed; the variance of the residuals is constant across all levels of the predicted values.
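The normality and homoscedasticity assumptions can be given a rough numerical check. The sketch below is only an illustration, not a formal test: the function name `residual_diagnostics`, the choice of skewness and excess kurtosis as normality summaries, and the split-half variance ratio are all assumptions made here for demonstration.

```python
import numpy as np

def residual_diagnostics(X, y):
    """Fit least squares with a constant and summarize the residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    resid = y - fitted

    # Rough normality check: both values are near 0 for normal residuals.
    z = (resid - resid.mean()) / resid.std()
    skewness = np.mean(z ** 3)
    excess_kurtosis = np.mean(z ** 4) - 3.0

    # Rough homoscedasticity check: residual variance for the upper half
    # of fitted values divided by that for the lower half (near 1 if constant).
    order = np.argsort(fitted)
    half = len(order) // 2
    variance_ratio = resid[order[half:]].var() / resid[order[:half]].var()

    return {"skewness": skewness,
            "excess_kurtosis": excess_kurtosis,
            "variance_ratio": variance_ratio}

# Synthetic (invented) data with constant-variance, normal noise.
rng = np.random.default_rng(0)
X = rng.uniform(1, 10, size=(2000, 2))
y = X @ np.array([1.5, -0.7]) + 3.0 + rng.normal(0, 1.0, size=2000)
print(residual_diagnostics(X, y))
```

In practice, residual plots and formal tests are the usual tools; these summaries only indicate whether something looks clearly off.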

Several terms help in understanding multiple regression:

• The beta value measures how strongly a predictor variable influences the criterion variable.  It is expressed in standard deviation units: the change in the criterion variable, in standard deviations, for a change of one standard deviation in the predictor.
• R is the measure of association between the observed values and the predicted values of the criterion variable.  R Square, or R², is the square of this measure of association and indicates the proportion of variance in the criterion variable that is explained by the predictor variables.  Adjusted R² corrects R² for the number of predictors and the sample size, and so gives a better estimate of the R² to expect if the model were used with a new data set.
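These quantities can be computed directly from a fitted model. The following is a minimal sketch, assuming a least-squares fit with a constant term; the function name `regression_summary` and the synthetic data are hypothetical. It uses the common formulas adjusted R² = 1 − (1 − R²)(N − 1)/(N − k − 1), where N is the number of observations and k the number of predictors, and beta = b × sd(x) / sd(y).

```python
import numpy as np

def regression_summary(X, y):
    """Return R, R^2, adjusted R^2 and standardized betas for a least-squares fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_obs, k = X.shape
    design = np.column_stack([X, np.ones(n_obs)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef

    # R: correlation between observed and predicted criterion values.
    r = np.corrcoef(y, fitted)[0, 1]
    r2 = r ** 2
    # Adjusted R^2: penalizes R^2 for the number of predictors k.
    adj_r2 = 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - k - 1)
    # Beta: each coefficient rescaled to standard deviation units.
    beta = coef[:-1] * X.std(axis=0, ddof=1) / y.std(ddof=1)

    return {"R": r, "R2": r2, "adj_R2": adj_r2, "beta": beta}

# Synthetic (invented) data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=100)
print(regression_summary(X, y))
```

With a single predictor, the beta value equals the ordinary correlation between the predictor and the criterion, which is a convenient check on the formula.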


