Posted February 5, 2020

Anyone familiar with structural equation modeling (SEM) will know that there are myriad measures and indices that researchers may use to evaluate the fit of a model. Here, we will briefly highlight one particular assessment of model fit: the chi-square (χ²) test.

The chi-square test is unique among the possible measures of fit in SEM because it is a test of statistical significance. The chi-square value and model degrees of freedom can be used to calculate a *p*-value (done automatically by most SEM software). This tests the null hypothesis that the model's predictions match the observed data exactly (more precisely, that the model-implied covariance matrix equals the population covariance matrix). Because you want your predictions to match the actual data as closely as possible, you do not want to reject this null hypothesis. In other words, a nonsignificant result for this test indicates good model fit.
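To make the *p*-value calculation concrete, here is a minimal sketch of what SEM software does with the chi-square value and degrees of freedom. It uses only the Python standard library; the function name `chi2_sf` and the example numbers are assumptions for illustration, not output from a real model. In everyday work a statistics library (for example `scipy.stats.chi2.sf`) would be the usual choice.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df
    degrees of freedom, via the series for the regularized lower incomplete
    gamma function. Accurate to roughly 1e-14 in absolute terms, so extremely
    small p-values simply show up as (about) zero."""
    if df <= 0:
        raise ValueError("df must be positive")
    if x <= 0:
        return 1.0
    a = df / 2.0
    half_x = x / 2.0
    # Series: sum over n of half_x^n / (a * (a+1) * ... * (a+n))
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total) and n < 10000:
        n += 1
        term *= half_x / (a + n)
        total += term
    # Prefactor half_x^a * exp(-half_x) / Gamma(a), computed on the log scale
    log_prefactor = a * math.log(half_x) - half_x - math.lgamma(a)
    lower = total * math.exp(log_prefactor)
    return min(1.0, max(0.0, 1.0 - lower))

# Hypothetical model results (not taken from any real analysis)
chi_square = 85.3
df = 60
p_value = chi2_sf(chi_square, df)
print(f"chi-square({df}) = {chi_square}, p = {p_value:.4f}")
print("reject exact fit" if p_value < 0.05 else "do not reject exact fit")
```

A *p*-value below the chosen cutoff means the null hypothesis of exact fit is rejected, which is the undesirable outcome in this context.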

This differs from virtually all other measures of model fit, which consist of a single value that must be compared against a cutoff or benchmark that has been established by statistics scholars. For example, a typical benchmark for the comparative fit index (CFI) is .90, meaning that values of .90 or greater indicate good fit, and values less than .90 indicate poor fit. The problem here is that these benchmarks are not all universally agreed upon. For the CFI, some scholars suggest a benchmark of .90 (e.g., Schumacker & Lomax, 2010), but others may suggest a stricter benchmark of .95 (e.g., Hu & Bentler, 1999).
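The disagreement over benchmarks is easy to see in a small sketch. The cutoffs below are the two cited above; the CFI value of .93 and the function name `judge_cfi` are hypothetical, chosen only to show how the same result can be judged differently depending on whose benchmark is used.

```python
# Benchmarks for the CFI as cited in the text
CFI_BENCHMARKS = {
    "Schumacker & Lomax (2010)": 0.90,
    "Hu & Bentler (1999)": 0.95,
}

def judge_cfi(cfi, benchmarks=CFI_BENCHMARKS):
    """Return, for each source, whether the CFI meets that source's cutoff."""
    return {source: cfi >= cutoff for source, cutoff in benchmarks.items()}

# Hypothetical CFI value that falls between the two cutoffs
verdicts = judge_cfi(0.93)
for source, is_good in verdicts.items():
    print(f"{source}: {'good fit' if is_good else 'poor fit'}")
```

A model with a CFI between the two cutoffs would be called well-fitting under one benchmark and poorly fitting under the other.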

Given the subjectivity of evaluating fit based on benchmarks, it may seem like the chi-square test should be the most objective and useful metric. However, this is not the case. In fact, the chi-square test may actually be the *least* useful metric for model fit, because it is highly sensitive to sample size. The test statistic grows with sample size, so the larger the sample, the more likely it is that even a trivial discrepancy between model and data will produce a statistically significant chi-square. And given that most scholars agree that SEM should only be conducted with large sample sizes (usually meaning hundreds of participants), the chi-square test is all but guaranteed to be significant, even at stricter significance cutoffs (e.g., .01 or .001). Because the test will almost always be significant in practice, it provides little useful information, and other measures of fit need to be considered.
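The sample-size sensitivity can be sketched numerically. Under maximum likelihood estimation, the chi-square statistic is commonly computed as (*N* − 1) times the minimized value of the fit function (some software uses *N* instead). The sketch below holds that minimized value fixed, so the amount of misfit is identical, and changes only *N*. The misfit value, the degrees of freedom, and all function names are assumptions for illustration. To stay within the standard library, it uses a closed-form tail probability that is exact only when the degrees of freedom are even.

```python
import math

def chi2_sf_even_df(x, df):
    """Exact upper-tail chi-square probability for EVEN df, using the identity
    P(X >= x) = exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut requires a positive, even df")
    half_x = x / 2.0
    term = 1.0   # j = 0 term
    total = 1.0
    for j in range(1, df // 2):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total

def p_values_by_sample_size(f_min, df, sample_sizes):
    """Map each N to (chi-square statistic, p-value) for a fixed misfit f_min."""
    results = {}
    for n in sample_sizes:
        statistic = (n - 1) * f_min  # assumes the (N - 1) multiplier
        results[n] = (statistic, chi2_sf_even_df(statistic, df))
    return results

F_MIN = 0.10  # hypothetical minimized fit-function value (same misfit throughout)
DF = 20       # hypothetical model degrees of freedom

results = p_values_by_sample_size(F_MIN, DF, [100, 200, 500, 1000])
for n, (statistic, p) in results.items():
    print(f"N = {n:4d}  chi-square = {statistic:6.2f}  p = {p:.4f}")
```

With these assumed values, the same degree of misfit is nonsignificant at *N* = 200 but significant beyond the .001 level at *N* = 500, even though nothing about the model has changed.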

As a final note, it is worth mentioning that the chi-square statistic itself (along with its degrees of freedom) can be a useful measure of model fit; it is just the significance test that ends up being useless. Some scholars recommend using the chi-square divided by the degrees of freedom (χ²/*df*) as a measure of model fit, with values of 5 or less being a common benchmark.
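As a quick illustration, the ratio can be computed and checked against the benchmark of 5 mentioned above. The function name and the example values are hypothetical.

```python
def normed_chi_square(chi_square, df):
    """Chi-square divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    return chi_square / df

# Hypothetical model results (not taken from any real analysis)
chi_square = 49.9
df = 20
BENCHMARK = 5.0  # the common benchmark cited in the text

ratio = normed_chi_square(chi_square, df)
verdict = "acceptable" if ratio <= BENCHMARK else "poor"
print(f"chi-square/df = {ratio:.2f} ({verdict} fit by the 5-or-less benchmark)")
```

Here a chi-square that would be significant at the .001 level still yields a ratio well under 5, which is why the ratio can be informative even when the significance test is not.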

**References**

Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. *Structural Equation Modeling, 6*(1), 1–55.

Schumacker, R. E., & Lomax, R. G. (2010). *A beginner’s guide to structural equation modeling* (3rd ed.). New York, NY: Routledge Academic.