Nominal variable association refers to the statistical relationship(s) between nominal variables. Nominal variables consist of categories that have no inherent ranking. Social science studies commonly assess nominal variables such as gender, race, religious affiliation, and college major. Researchers commonly use crosstabulation (contingency tables) to examine relationships between nominal variables, and the Chi-Square test of independence is widely used to assess whether two nominal variables are related. Typical research questions include:
Does a relationship exist between graduation intent and gender?
Is there an association between music genre selection and venue type?
Crosstabulation shows whether being in one category of the independent variable makes a case more likely to be in a particular category of the dependent variable. This approach allows researchers to examine the association between two categorical variables. For example, using the variables above, crosstabulation can explore the relationship between race and college major. Specifically, it can assess whether African American students are more likely to major in business or if Hispanic students are more likely to major in the natural sciences.
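As a rough illustration of the idea, the sketch below builds a small crosstabulation and converts it to percentages within each category of the independent variable. The data, the category labels, and the helper names `crosstab` and `column_percentages` are invented for this example and do not come from any real study.

```python
from collections import Counter

# Hypothetical sample of (gender, intends_to_graduate) pairs, invented for illustration.
cases = (
    [("female", "yes")] * 40 + [("female", "no")] * 10
    + [("male", "yes")] * 30 + [("male", "no")] * 20
)

def crosstab(pairs):
    """Count cases for each (independent, dependent) category combination."""
    return Counter(pairs)

def column_percentages(table):
    """Percentage of each independent-variable category that falls in each dependent category."""
    totals = Counter()
    for (indep, _dep), count in table.items():
        totals[indep] += count
    return {
        (indep, dep): 100.0 * count / totals[indep]
        for (indep, dep), count in table.items()
    }

table = crosstab(cases)
percentages = column_percentages(table)
for (indep, dep), pct in sorted(percentages.items()):
    print(f"{indep:>6} -> {dep:<3}: {pct:.1f}%")
```

Comparing the percentages across categories of the independent variable (here 80% versus 60% answering "yes") is what suggests whether an association may be present.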
Researchers examine patterns of association by comparing the observed frequencies in each cell of the table to the expected frequencies, that is, the counts that would occur if the two variables were unrelated. By convention, the independent variable is placed in the columns and the dependent variable in the rows. A Chi-Square Test of Independence helps researchers determine whether the relationship observed in a sample likely exists in the population. However, this test may not be appropriate if the sample is too small to produce adequate expected frequencies in each cell (a common guideline is an expected count of at least 5 per cell).
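The comparison of observed and expected frequencies can be sketched in a few lines. This is a minimal version under stated assumptions: the table is a made-up 2 x 2 example, the helper name `chi_square` is hypothetical, no continuity correction is applied, and every row and column total is assumed to be greater than zero.

```python
def chi_square(observed):
    """Return (chi2, degrees_of_freedom, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, dof, expected

# Rows: dependent variable (yes, no); columns: independent variable (female, male).
observed = [[40, 30], [10, 20]]
chi2, dof, expected = chi_square(observed)
print(f"chi2 = {chi2:.3f}, df = {dof}")
print("expected:", expected)
```

For this table the statistic is about 4.76 with 1 degree of freedom, which exceeds the .05 critical value of 3.841, so the hypothesis of independence would be rejected at that level.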
Several measures assess the strength of association between nominal variables. Like Pearson’s correlation, they have defined bounds, which provides a standard way to describe associations.
Each of the measures below uses the Chi-Square value from the crosstabulation table. The contingency coefficient is calculated as follows:
$$C = \sqrt{\frac{\chi^2}{\chi^2 + N}}$$
where \chi^2 is the Chi-Square statistic and N is the sample size.
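This measure ranges from 0 to just under 1, with values closer to 1 indicating a stronger association between the variables. The contingency coefficient is highly sensitive to table size: its maximum attainable value depends on the number of rows and columns, so researchers should interpret it with caution and avoid comparing values across tables of different sizes.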
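A short sketch of the contingency coefficient, using its standard definition, the square root of Chi-Square divided by (Chi-Square plus N). The function names and the example values are assumptions for illustration. The second helper shows why table size matters: for a square k x k table the largest attainable value is the square root of (k - 1) / k, which is below 1.

```python
import math

def contingency_coefficient(chi2, n):
    """Contingency coefficient from a Chi-Square statistic and sample size."""
    return math.sqrt(chi2 / (chi2 + n))

def max_contingency_coefficient(k):
    """Upper bound of the coefficient for a square k x k table."""
    return math.sqrt((k - 1) / k)

# Hypothetical values: chi2 of about 4.76 from a 2 x 2 table with 100 cases.
chi2_value = 100 / 21
sample_size = 100
cc = contingency_coefficient(chi2_value, sample_size)
print(f"C = {cc:.3f}")
print(f"maximum for 2 x 2 = {max_contingency_coefficient(2):.3f}")
print(f"maximum for 4 x 4 = {max_contingency_coefficient(4):.3f}")
```

Because the ceiling rises from about .71 for a 2 x 2 table to about .87 for a 4 x 4 table, the same coefficient can reflect different relative strengths in tables of different sizes.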
A general rule of thumb for interpreting the strength of associations is:
< .10 = weak
.10 – .30 = moderate
> .30 = strong
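The rule of thumb can be expressed as a small helper. This is only a sketch: the function name `strength_label` is hypothetical, and it assumes that values of exactly .10 and .30 count as moderate.

```python
def strength_label(value):
    """Label an association measure between 0 and 1 using the rule of thumb above."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("association measure must lie between 0 and 1")
    if value < 0.10:
        return "weak"
    if value <= 0.30:  # assumption: boundary values count as moderate
        return "moderate"
    return "strong"

for v in (0.05, 0.21, 0.45):
    print(v, strength_label(v))
```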
A measure of association used for 2 x 2 tables is the Phi coefficient:
$$\phi = \sqrt{\frac{\chi^2}{N}}$$
Again, the measure ranges between 0 and 1 with higher values meaning a stronger association.
When the crosstabulation table is larger than 2 x 2, Cramer’s V is the usual choice:
$$V = \sqrt{\frac{\chi^2}{N(k - 1)}}$$
Here, N is the sample size and k is the smaller of the number of rows or columns (so it would be 3 for a 3 x 4 table).
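Both Chi-Square-based measures can be sketched from their standard definitions: Phi is the square root of Chi-Square divided by N, and Cramer's V divides additionally by (k - 1). The function names and the numeric inputs below are assumptions made up for illustration.

```python
import math

def phi_coefficient(chi2, n):
    """Phi for a 2 x 2 table."""
    return math.sqrt(chi2 / n)

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V, where k is the smaller of the number of rows or columns."""
    k = min(n_rows, n_cols)
    if k < 2:
        raise ValueError("table needs at least 2 rows and 2 columns")
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical 2 x 2 result: chi2 of about 4.76 with 100 cases.
phi = phi_coefficient(100 / 21, 100)
v_2x2 = cramers_v(100 / 21, 100, 2, 2)

# Hypothetical 3 x 4 result: chi2 of 24 with 200 cases, so k = 3.
v_3x4 = cramers_v(24.0, 200, 3, 4)

print(f"phi = {phi:.3f}, V (2 x 2) = {v_2x2:.3f}, V (3 x 4) = {v_3x4:.3f}")
```

For a 2 x 2 table k - 1 equals 1, so Cramer's V reduces to Phi, which is why the first two values printed are identical.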
Unlike the Chi-Square-based measures, Lambda is a proportional reduction in error (PRE) measure: it shows how much knowing the independent variable reduces errors in predicting the dependent variable. Another way to say this is: how much better is our guess about which category of the dependent variable each case will fall into if we know the case’s value on the independent variable? Lambda is often expressed as a percentage, much like R-squared in regression, which is also a PRE measure. Lambda is a directional measure because the calculation changes based on which variable researchers treat as the independent variable.
All of the above measures of association are available by clicking on the Statistics button when requesting crosstabulations in SPSS. Lambda is calculated in both directions, treating each variable in turn as the independent variable. The other three measures are symmetric, meaning that it does not matter which variable is treated as independent.
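The "better guess" logic can be made concrete with a sketch of the Goodman and Kruskal lambda calculation. Without the independent variable, the best guess is the modal category of the dependent variable; with it, the best guess is the modal category within each column. The table and the function name below are hypothetical.

```python
def goodman_kruskal_lambda(observed):
    """Lambda for a table with dependent categories in rows, independent in columns."""
    n = sum(sum(row) for row in observed)
    # Errors when always guessing the overall modal category of the dependent variable.
    errors_without = n - max(sum(row) for row in observed)
    # Errors when guessing the modal dependent category within each independent category.
    errors_with = n - sum(max(col) for col in zip(*observed))
    if errors_without == 0:
        raise ValueError("lambda is undefined when all cases share one dependent category")
    return (errors_without - errors_with) / errors_without

# Hypothetical 2 x 3 table: 2 dependent categories (rows), 3 independent categories (columns).
observed = [[30, 10, 5], [10, 30, 15]]
lambda_rows_dependent = goodman_kruskal_lambda(observed)

# Swapping the roles of the variables means transposing the table.
transposed = [list(col) for col in zip(*observed)]
lambda_cols_dependent = goodman_kruskal_lambda(transposed)

print(f"{lambda_rows_dependent:.3f} vs {lambda_cols_dependent:.3f}")
```

The two directions give different values here (about .44 and .33), which illustrates why Lambda has to be reported with a stated dependent variable.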
Adequate sample size for each of the categories being analyzed.
Variables must be categorical.
If a zero is present in the crosstabulation, no association can be assessed.
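A quick screening of a table against these points might look like the sketch below. The helper name `check_crosstab` is hypothetical, and the default threshold of 5 for expected counts is an assumption based on a widely used convention rather than a rule stated here.

```python
def check_crosstab(observed, min_expected=5.0):
    """Return a list of warnings about zero cells and small expected counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    if n == 0:
        raise ValueError("table contains no cases")
    problems = []
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            if count == 0:
                problems.append(f"cell ({i}, {j}) has an observed count of zero")
            expected = row_totals[i] * col_totals[j] / n
            if expected < min_expected:
                problems.append(
                    f"cell ({i}, {j}) has expected count {expected:.2f}, below {min_expected}"
                )
    return problems

# Two hypothetical tables: one adequate, one sparse.
ok_table = [[40, 30], [10, 20]]
sparse_table = [[12, 0], [3, 5]]
print(check_crosstab(ok_table))
for warning in check_crosstab(sparse_table):
    print(warning)
```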
Related analyses: