Posted August 14, 2017

Regression analysis is one of the most common and easiest analyses to grasp, but it can also be one of the most concise and elegant analyses if used correctly. In our previous blog on regressions, we discussed some of the reasons that one might use a regression and how to choose the best model. Today, we are going to walk through some of the important points in terms of interpreting the output.

As you know from the last entry on regressions, the whole purpose of this analysis is to use one or more variables to predict an outcome. Today, we will focus on linear regression, starting with checking the overall outcome. This step is the most important part of the analysis, and it is actually calculated using the *F* test, just like ANOVA. Because of this, you will see the output in terms of an *F* value. To put the *F* value in perspective, we have to give some detail about the analysis, and we do this using degrees of freedom. All you really need to know about degrees of freedom (*df*) is that the first value of *df* reflects the number of predictors, and the second value of *df* reflects the sample size (more precisely, the sample size minus the number of predictors, minus one). In most research, these are presented in parentheses after the *F* value. Luckily, just about any analytic software out there will interpret your *F* in the context of your *df* for you, meaning it gives you a *p* value right away. If this *p* value is lower than your significance cutoff (usually .05), you know you have a significant regression, meaning it is able to use one or more of your predictors to calculate an estimate for your outcome!
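To make this concrete, here is a minimal sketch (in Python with NumPy) of where the *F* value and its two *df* values come from. All of the data and coefficients below are made up purely for illustration:

```python
import numpy as np

# Hypothetical data: 20 students, 2 predictors (hours studied, classes skipped).
rng = np.random.default_rng(0)
n, k = 20, 2
hours = rng.uniform(0, 10, n)
skipped = rng.integers(0, 5, n).astype(float)
gpa = 2.0 + 0.15 * hours - 0.10 * skipped + rng.normal(0, 0.2, n)

# Design matrix with an intercept column, fit by least squares.
X = np.column_stack([np.ones(n), hours, skipped])
beta, _, _, _ = np.linalg.lstsq(X, gpa, rcond=None)
fitted = X @ beta

# Partition the variability into model and residual sums of squares.
ss_total = np.sum((gpa - gpa.mean()) ** 2)
ss_resid = np.sum((gpa - fitted) ** 2)
ss_model = ss_total - ss_resid

# F(df1, df2): df1 = number of predictors, df2 = n - predictors - 1.
df1, df2 = k, n - k - 1
F = (ss_model / df1) / (ss_resid / df2)
print(f"F({df1}, {df2}) = {F:.2f}")
```

Your statistics package does exactly this partitioning behind the scenes and then converts the *F* value into the *p* value it reports.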

Once you’ve established a nice significant model, the next step is to look at your details. The most overarching detail is the *R*². This will always range from 0 to 1, and can only be positive. You can multiply this number by 100 to get a percentage explaining how much of the variability in your participants’ outcome scores is explained by your predictors. But keep in mind that this number does not have any meaning unless the regression is significant! This outcome comes in two flavors, natural and adjusted; the natural *R*² will never decrease as you add predictors, even useless ones, while the adjusted *R*² applies a small penalty for each predictor, making it the safer value to report when comparing models with different numbers of predictors.
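As a quick sketch of how the two flavors relate, both can be computed by hand from the residuals; the data below are again invented for illustration:

```python
import numpy as np

# Hypothetical dataset with two predictors.
rng = np.random.default_rng(1)
n, k = 20, 2
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 5, n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(0, 1.0, n)

# Fit by least squares and take the residuals.
X = np.column_stack([np.ones(n), x1, x2])
beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# R-squared: proportion of outcome variability explained by the model.
r2 = 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

# Adjusted R-squared: same idea, but penalized for the k predictors used.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```

Notice that the adjusted value can never exceed the natural one; the gap between them grows as you pile on predictors relative to your sample size.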

Once you know everything you could possibly want to know about the overall regression, it is time to dig into your predictors. Each predictor has a corresponding *p* value, which is different from the overall regression’s *p* value. If a predictor is significant, you can start making some claims about it. A simple outcome to look at here is the standardized beta (β). This tells you how strong the relationship between the predictor and outcome is after controlling for everything else in the model. It can range from -1 to 1, where ±1 is the strongest, and the sign simply indicates whether there is a positive or negative association. However, another important output can be found in the unstandardized beta (*B*). This value gives you the slope between the predictor and outcome. We previously talked a little about how these values work for binary predictors here, and continuous predictors are pretty similar. For these, a one-unit increase in the predictor corresponds with an increase (for positive *B*) or decrease (for negative *B*) in the outcome equal to the *B* value.

For example, if the *number of hours studying* predictor were significant in a regression predicting GPA, a *B* of 1.5 for this variable would mean that each additional hour spent studying corresponds with a 1.5-point higher GPA. Conversely, if the *number of classes skipped* were significant and had a *B* of -0.9, for every class a student skips, we would expect their GPA to decrease by 0.9 points. Using information like this ultimately fulfills the goal of the regression analysis – if we know how long a student spends studying and how many classes they have skipped, we can start from the model’s baseline (the intercept) and add or subtract to take a stab at predicting that particular student’s GPA.
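The GPA example above boils down to a one-line prediction function. The slopes (1.5 per study hour, -0.9 per skipped class) come from the example; the intercept of 2.5 is an assumed baseline invented here for illustration:

```python
# Assumed baseline GPA (intercept); not given in the example above.
intercept = 2.5

# Slopes from the example: B for hours studied and classes skipped.
b_hours = 1.5
b_skipped = -0.9

def predict_gpa(hours_studied: float, classes_skipped: float) -> float:
    """Plug the predictors into the fitted line: intercept + B1*x1 + B2*x2."""
    return intercept + b_hours * hours_studied + b_skipped * classes_skipped

# A student who studies 1 hour and skips 1 class: 2.5 + 1.5 - 0.9 = 3.1
print(predict_gpa(1, 1))
```

This is all a fitted regression model really is: a baseline plus a weighted sum of the predictors, with the *B* values as the weights.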