Phi, Contingency Coefficient, Tschuprow’s T, Cramer’s V, Lambda, Uncertainty Coefficient
Association refers to coefficients that measure the strength of the relationship between variables. This section discusses coefficients of nominal association: Phi, the Contingency Coefficient, Tschuprow’s T, Cramer’s V, Lambda, and the Uncertainty Coefficient all measure association between nominal variables. Of these, Phi, the Contingency Coefficient, Tschuprow’s T, and Cramer’s V are based on the chi-square statistic, adjusted for sample size or table dimensions.
Phi: Phi is a measure of nominal association based on an adjusted chi-square. Phi is a symmetrical measure. The chi-square statistic depends on both the strength of the relationship and the sample size, so to calculate phi we divide chi-square by the sample size and take the square root, which removes the effect of sample size. The phi coefficient is also called Pearson’s coefficient of mean-square contingency. It is used for a 2×2 table of nominal variables. In SPSS, phi can be calculated with the “Crosstabs” option in the Analyze menu by selecting “phi” among the Crosstabs statistics. The significance test for phi is the same as for chi-square. The following formula is used to calculate the value of the phi coefficient:
$\phi = \sqrt{\chi^2 / n}$
Where $\chi^2$ is the chi-square statistic and $n$ is the sample size.
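As a minimal sketch of the calculation described above, the following Python computes chi-square for a 2×2 table and then phi as the square root of chi-square divided by the sample size. The function names and the example counts are illustrative assumptions, not part of SPSS or any library, and the table is assumed to have no empty rows or columns.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and sample size for a table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def phi_coefficient(table):
    """Phi = sqrt(chi-square / n); intended for 2x2 tables."""
    chi2, n = chi_square(table)
    return math.sqrt(chi2 / n)

# Hypothetical 2x2 table of counts (illustrative data only)
example = [[30, 10],
           [10, 30]]
print(phi_coefficient(example))  # 0.5 (chi-square = 20, n = 80)
```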

Contingency Coefficient: The contingency coefficient is also a measure of nominal association based on an adjusted chi-square. It is used with nominal data when the table is larger than 2×2. The value of the contingency coefficient never reaches 1: its upper limit depends on the table size, and for a 2×2 table it is about .71. Because the upper limit approaches 1 only as the table grows, many researchers recommend using the contingency coefficient only for tables of 5×5 or larger. The contingency coefficient is a symmetrical measure, so it does not matter which variable is dependent and which is independent. In SPSS, the contingency coefficient is also available in the “Crosstabs” option: select “Statistics” and then select “Contingency coefficient.” The following formula is used to calculate the value of the contingency coefficient:
$C = \sqrt{\chi^2 / (\chi^2 + n)}$
Where,
$C$ = contingency coefficient
$\chi^2$ = chi-square
$n$ = sample size
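The ceiling of about .71 can be checked by hand. As a sketch, assume a square table with k rows and k columns and perfect association (every case falls on the diagonal), for which chi-square reaches its maximum of n(k - 1). The symbol k is introduced here for illustration only.

```latex
C_{\max} = \sqrt{\frac{n(k-1)}{n(k-1) + n}} = \sqrt{\frac{k-1}{k}},
\qquad k = 2:\ \sqrt{1/2} \approx 0.707,
\qquad k = 5:\ \sqrt{4/5} \approx 0.894
```

Under this assumption the limit is .71 only for a 2×2 table and rises toward 1 as the table grows, which is why larger tables are recommended for this coefficient.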
Tschuprow’s T: Tschuprow’s T is also a measure of nominal association based on an adjusted chi-square. It is used with nominal data. Tschuprow’s T can reach its maximum value of 1 only when the table is square, that is, when the number of rows equals the number of columns. In a 2×2 table, Tschuprow’s T is equal to phi. The significance test for Tschuprow’s T is the same as for chi-square. Tschuprow’s T is a symmetrical measure, so it does not matter which variable is dependent and which is independent. SPSS does not support Tschuprow’s T. The following formula is used to calculate the value of Tschuprow’s T:
$T = \sqrt{\chi^2 / \left( n \sqrt{(r-1)(c-1)} \right)}$
Where,
$\chi^2$ = chi-square
$n$ = sample size
$r$ = number of rows
$c$ = number of columns
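The equality with phi in a 2×2 table follows directly from the definition. As a short worked step, substituting r = c = 2 into the standard form of Tschuprow’s T (with n the sample size) gives:

```latex
T = \sqrt{\frac{\chi^2}{n\sqrt{(r-1)(c-1)}}}
\;\xrightarrow{\; r = c = 2 \;}\;
\sqrt{\frac{\chi^2}{n\sqrt{1 \cdot 1}}} = \sqrt{\frac{\chi^2}{n}} = \phi
```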
Cramer’s V: Cramer’s V is the most popular measure of nominal association, and it can be used for tables larger than 2×2. Cramer’s V is also based on an adjusted chi-square. Its value lies between 0 and 1. Cramer’s V is also a symmetrical measure. The significance test for Cramer’s V is the same as the chi-square test. Cramer’s V can reach 1 only when the two variables have equal marginals. In SPSS, Cramer’s V is available in the “Crosstabs” option: click the “Statistics” button and select “Cramer’s V.” The value of Cramer’s V is never greater than the phi coefficient, and the two are equal when the smaller dimension of the table is 2. The following formula is used to calculate the value of Cramer’s V:
$V = \sqrt{\chi^2 / (n \cdot m)}$
Where $V$ is Cramer’s V, $n$ is the sample size, and $m$ is the smaller of (number of rows - 1) and (number of columns - 1).
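To see how the chi-square-based coefficients relate to one another, here is a minimal Python sketch that computes phi, Cramer’s V, and Tschuprow’s T for the same table. It assumes the standard definitions (V divides chi-square by n times the smaller of rows - 1 and columns - 1; T divides by n times the square root of (rows - 1)(columns - 1)). The function names and both example tables are hypothetical, and tables are assumed to have no empty rows or columns.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and sample size for a table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def association_measures(table):
    """Return phi, Cramer's V and Tschuprow's T for one contingency table."""
    chi2, n = chi_square(table)
    r, c = len(table), len(table[0])
    phi = math.sqrt(chi2 / n)
    cramers_v = math.sqrt(chi2 / (n * min(r - 1, c - 1)))
    tschuprow_t = math.sqrt(chi2 / (n * math.sqrt((r - 1) * (c - 1))))
    return {"phi": phi, "V": cramers_v, "T": tschuprow_t}

# Illustrative counts only
square = [[20, 5, 5],
          [5, 20, 5],
          [5, 5, 20]]
wide = [[20, 10, 0],
        [0, 10, 20]]

for name, table in (("3x3", square), ("2x3", wide)):
    rounded = {k: round(v, 3) for k, v in association_measures(table).items()}
    print(name, rounded)
# 3x3 {'phi': 0.707, 'V': 0.5, 'T': 0.5}
# 2x3 {'phi': 0.816, 'V': 0.816, 'T': 0.687}
```

In the 3×3 table V is smaller than phi and equals T because the table is square; in the 2×3 table V equals phi because the smaller dimension is 2, while T is smaller than both.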
Lambda: Lambda is also called Goodman-Kruskal lambda. Lambda is also a nominal measure. It is a proportional-reduction-in-error measure: its value reflects the proportion by which errors in predicting the dependent variable are reduced when the independent variable is known. The following formula is used to calculate the value of Lambda:
$\lambda = \left( \sum_i f_i - F_d \right) / (n - F_d)$
Where $f_i$ is the largest frequency within the i-th class of the independent variable, $F_d$ is the largest marginal total of the dependent variable, and $n$ is the sample size.
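The lambda calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions: rows of the table are taken as the categories of the dependent variable and columns as the classes of the independent variable, and the function name and example counts are hypothetical.

```python
def goodman_kruskal_lambda(table):
    """Lambda with rows as the dependent variable and columns as the independent variable."""
    n = sum(sum(row) for row in table)
    # Largest frequency within each class (column) of the independent variable
    sum_f = sum(max(col) for col in zip(*table))
    # Largest marginal total of the dependent (row) variable
    f_d = max(sum(row) for row in table)
    if n == f_d:
        raise ValueError("lambda is undefined when all cases fall in one dependent category")
    return (sum_f - f_d) / (n - f_d)

# Illustrative counts only
print(goodman_kruskal_lambda([[30, 10], [10, 30]]))  # 0.5
print(goodman_kruskal_lambda([[40, 30], [10, 20]]))  # 0.0
```

In the second table the same row is the modal category in every column, so knowing the column does not change the prediction and lambda is 0 even though the cell counts are not proportional.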
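Uncertainty Coefficient: The uncertainty coefficient is also called Theil’s U or the entropy coefficient, and it varies from 0 to 1. Like lambda, it is a proportional-reduction-in-error measure: it reflects the proportion by which uncertainty (entropy) in the dependent variable is reduced when the independent variable is known. It is used with nominal data. The uncertainty coefficient is an asymmetric measure, so its value depends on which variable is treated as dependent. The formulas for the uncertainty coefficient, based on the rows and columns, are:
$UC(C|R) = [H(X) + H(Y) - H(XY)] / H(X)$ (column variable dependent)
$UC(R|C) = [H(X) + H(Y) - H(XY)] / H(Y)$ (row variable dependent)
Where,
$X$ = Column variable
$Y$ = Row variable
$n$ = Sample size
$r_j$ = Row totals (marginals) for rows 1…j
$c_k$ = Column totals (marginals) for columns 1…k
$n_{jk}$ = Cell count for row j, column k
$\ln$ = Natural log function
$H(X) = -\sum_k (c_k/n) \ln(c_k/n)$ = entropy of the column variable, the denominator for UC(C|R)
$H(Y) = -\sum_j (r_j/n) \ln(r_j/n)$ = entropy of the row variable, the denominator for UC(R|C)
$H(XY) = -\sum_j \sum_k (n_{jk}/n) \ln(n_{jk}/n)$, where empty cells contribute 0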
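The entropy terms above can be sketched in Python as follows. This is a minimal illustration that assumes the standard definition, in which the shared information H(rows) + H(columns) - H(joint) is divided by the entropy of whichever variable is treated as dependent. The function names and the example table are hypothetical, and empty cells are skipped because they contribute 0 to the sums.

```python
import math

def entropy(counts, n):
    """Entropy -sum(p * ln p) over nonzero counts, with p = count / n."""
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def uncertainty_coefficients(table):
    """Return (U with rows dependent, U with columns dependent)."""
    n = sum(sum(row) for row in table)
    h_rows = entropy([sum(row) for row in table], n)
    h_cols = entropy([sum(col) for col in zip(*table)], n)
    h_joint = entropy([cell for row in table for cell in row], n)
    shared = h_rows + h_cols - h_joint  # information shared by rows and columns
    return shared / h_rows, shared / h_cols

# Illustrative 2x3 table of counts
u_rows, u_cols = uncertainty_coefficients([[20, 10, 0],
                                           [0, 10, 20]])
print(round(u_rows, 3), round(u_cols, 3))  # 0.667 0.421
```

The two values differ, which illustrates the asymmetry: here knowing the column removes two thirds of the uncertainty about the row, while knowing the row removes less of the uncertainty about the column.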


