Correlation in SPSS

Posted March 16, 2009

Correlation is a statistical technique that shows how strongly two variables are related to each other, that is, the degree of association between them. For example, if we have weight and height data for a group of people, the correlation between the two tells us how they are related, and we may find that weight is positively related to height. Correlation is measured by the correlation coefficient, which is very easy to calculate in SPSS. Before calculating a correlation in SPSS, we should have some basic knowledge about correlation. The correlation coefficient always lies in the range of -1 to 1. Correlation can be classified in three ways:

1. Positive and negative correlation: When both variables move in the same direction, the correlation is positive. When one variable increases while the other decreases, the correlation is negative.

2. Linear and non-linear (curvilinear) correlation: When a change in one variable is accompanied by a change in the other at a constant ratio, the correlation is linear. When the ratio is not constant, the correlation is curvilinear. For example, if sales and expenditure move at a constant ratio, they are linearly correlated; if they do not, they are curvilinearly correlated.

3. Simple, partial and multiple correlation: When only two variables are studied, it is called simple correlation. When the correlation between two variables is examined while the effect of one or more other variables is held constant, it is called partial correlation. When the relationship between one variable and several other variables taken together is considered, it is called multiple correlation.
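
The difference between positive and negative correlation can be seen in a small calculation. The sketch below computes a simple (two-variable) Pearson coefficient by hand in Python rather than in SPSS; the function name `pearson_r` and all data values are invented for illustration only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of the deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, not taken from any real study
height_cm = [150, 160, 170, 180, 190]
weight_kg = [52, 58, 69, 77, 88]   # rises with height
price = [1, 2, 3, 4, 5]
demand = [10, 8, 6, 4, 2]          # falls as price rises

r_positive = pearson_r(height_cm, weight_kg)
r_negative = pearson_r(price, demand)
print(r_positive)  # greater than 0: positive correlation
print(r_negative)  # less than 0: negative correlation
```

Because the second pair of variables lies exactly on a falling straight line, its coefficient sits at the lower end of the -1 to 1 range.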

Degree of correlation

1. Perfect correlation: When the two variables change at exactly the same ratio, so that the correlation coefficient is +1 or -1, it is called perfect correlation.

2. High degree of correlation: When the absolute value of the correlation coefficient is above .75, it is called a high degree of correlation.

3. Moderate correlation: When the absolute value of the correlation coefficient is between .50 and .75, it is called a moderate degree of correlation.

4. Low degree of correlation: When the absolute value of the correlation coefficient is between .25 and .50, it is called a low degree of correlation.

5. Absence of correlation: When the absolute value of the correlation coefficient is between 0 and .25, it shows that there is little or no correlation.
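
The bands above can be written as a small lookup. This is a minimal sketch in Python, not an SPSS feature; the function name `degree_of_correlation` is hypothetical, and how the exact boundary values (.25, .50, .75) are assigned is an assumption, since the list does not say which band they belong to.

```python
def degree_of_correlation(r):
    """Label a correlation coefficient using the bands listed above."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    size = abs(r)  # the sign gives the direction, not the strength
    if size == 1:
        return "perfect"
    if size > 0.75:
        return "high"
    # Assumption: a value exactly on a boundary goes to the band that starts there,
    # except .75, which stays "moderate" because "high" is defined as above .75
    if size >= 0.50:
        return "moderate"
    if size >= 0.25:
        return "low"
    return "absent"

print(degree_of_correlation(-0.82))  # high
print(degree_of_correlation(0.10))   # absent
```

Taking the absolute value first means that -.82 and .82 are treated as equally strong; only the direction differs.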

There are many techniques to calculate the correlation coefficient, and SPSS offers several of them. For continuous variables, the Pearson correlation is available in the Analyze menu under Correlate > Bivariate. If the data are in rank order, we can use the Spearman rank correlation, which is available in the same dialog. If the data are nominal, then Phi, the contingency coefficient and Cramer’s V are the suitable measures of association. We can obtain these values by requesting them in SPSS cross tabulation (Crosstabs). The Phi coefficient is suitable for a 2×2 table. The contingency coefficient C is suitable for a table of any size.
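
For nominal data, all three measures are derived from the chi-square statistic of the cross tabulation. The sketch below shows the textbook formulas in Python as an illustration of what is being computed, not of how SPSS implements it; the function names and the 2×2 table of counts are hypothetical, and no continuity correction is applied.

```python
from math import sqrt

def chi_square(table):
    """Pearson chi-square statistic and total count for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def nominal_association(table):
    """Phi (magnitude only), contingency coefficient C and Cramer's V."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))  # smaller of the row and column counts
    return {
        "phi": sqrt(chi2 / n),
        "contingency_c": sqrt(chi2 / (chi2 + n)),
        "cramers_v": sqrt(chi2 / (n * (k - 1))),
    }

# Hypothetical 2x2 table of counts (rows: group A/B, columns: yes/no)
counts = [[30, 10],
          [10, 30]]
result = nominal_association(counts)
print(result)
```

For a 2×2 table, Phi and Cramer’s V come out the same, which is why Phi is usually reserved for that case and V is used for larger tables.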

Testing the Significance of a Correlation:

Once we have computed the correlation coefficient, we need to determine how likely it is that a correlation this strong would be observed by chance alone. For that, we conduct a significance test. In significance testing we want to know whether the observed correlation reflects a real relationship rather than a chance occurrence. For this we set up hypotheses. There are two types of hypothesis:

Null hypothesis: Under the null hypothesis we assume that there is no correlation between the two variables.

Alternative hypothesis: Under the alternative hypothesis we assume that there is a correlation between the two variables.

Before testing the hypothesis, we have to choose the significance level. In most cases it is set at .05 or .01. Testing at the 5% level of significance means that we accept no more than a 5 in 100 chance of concluding that a correlation exists when it is in fact a chance occurrence. After choosing the significance level, we calculate the correlation coefficient, which is denoted by ‘r’.
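
One common way to carry out this test for a Pearson correlation is to convert r into a t statistic with n - 2 degrees of freedom and compare it with a critical value. The text above does not name a specific test, so treat the following as a sketch under that assumption; the values of r and n are hypothetical, and the critical value is copied from a standard t table rather than computed.

```python
from math import sqrt

def correlation_t_statistic(r, n):
    """t statistic for the null hypothesis of no correlation (n - 2 df)."""
    return r * sqrt((n - 2) / (1 - r ** 2))

# Hypothetical result: r = .70 observed in n = 10 pairs of values
r, n = 0.70, 10
t = correlation_t_statistic(r, n)

# Two-tailed critical value of t at the .05 level for 8 degrees of freedom,
# looked up in a standard t table (an assumption of this sketch)
T_CRITICAL_05_DF8 = 2.306

reject_null = abs(t) > T_CRITICAL_05_DF8
print(round(t, 3), reject_null)
```

When the absolute value of t exceeds the critical value, the null hypothesis of no correlation is rejected at the chosen level. SPSS reports the corresponding significance value directly, so this calculation is only meant to show what lies behind it.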

Coefficient of determination:

With the help of the correlation coefficient, we can determine the coefficient of determination. The coefficient of determination is simply the proportion of the variance in the Y variable that can be explained by the X variable. It is obtained by squaring the correlation coefficient. For example, r = .80 gives a coefficient of determination of .64, meaning that 64% of the variance in Y is explained by X.
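
To see why the squared coefficient can be read as explained variance, the sketch below fits a least-squares line to a small data set and compares r squared with the share of the variation in Y that the line accounts for. It is written in Python for illustration; the function name and the data are hypothetical.

```python
from math import sqrt

def r_squared_two_ways(x, y):
    """Return (r squared, share of variation in y explained by a fitted line)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)  # total variation in y

    r = sxy / sqrt(sxx * syy)

    # Least-squares line y = intercept + slope * x
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation in y left unexplained by the line
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

    return r ** 2, 1 - ss_residual / syy

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
squared_r, explained_share = r_squared_two_ways(x, y)
print(squared_r, explained_share)  # the two values agree up to rounding
```

The two numbers match for any simple linear fit, which is what justifies reading the squared correlation coefficient as the proportion of variance explained.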