Anyone familiar with structural equation modeling (SEM) will know that there are a myriad of measures and indices that researchers may use to evaluate the fit of a model. Here, we will briefly highlight one particular assessment of model fit: the chi-square (χ2) test.
The chi-square test is unique among the possible measures of fit in SEM because it is a test of statistical significance. You can use the chi-square value and the model degrees of freedom to calculate a p-value, which most SEM software does automatically. This tests the null hypothesis that the model's predictions and the observed data are equal (more precisely, that the covariance matrix implied by the model matches the one underlying the data). Because you want your predictions to match the actual data as closely as possible, you do not want to reject this null hypothesis. In other words, a nonsignificant result for this test indicates good model fit.
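To make the p-value calculation concrete, here is a minimal sketch in Python that uses only the standard library. The function name chi_square_p_value and the example values (a chi-square of 85.4 on 40 degrees of freedom) are hypothetical and chosen purely for illustration. The function computes the upper-tail probability of the chi-square distribution through a standard recurrence for the incomplete gamma function; SEM software, or a routine such as scipy.stats.chi2.sf, would normally do this for you.

```python
import math


def chi_square_p_value(chi_square: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with df degrees of freedom."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if chi_square <= 0:
        return 1.0
    h = chi_square / 2.0
    # Closed-form starting points: df = 2 gives exp(-h), df = 1 gives erfc(sqrt(h)).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # Add two degrees of freedom per step using the recurrence
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1), evaluated in log space.
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)


# Hypothetical SEM output: chi-square = 85.4 with 40 degrees of freedom.
p = chi_square_p_value(85.4, 40)
print(f"p = {p:.6f}")
print("reject exact fit" if p < 0.05 else "fail to reject exact fit")
```

With these illustrative numbers the test is significant, so the null hypothesis of exact fit would be rejected, which by the logic above counts against the model.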
This differs from most other model fit measures, which consist of a single value compared against a cutoff or benchmark established by statistics scholars. For example, a typical benchmark for the comparative fit index (CFI) is .90, meaning that values of .90 or greater indicate good fit, and values less than .90 indicate poor fit. The problem is that universal agreement on these benchmarks has not been reached. For the CFI, some scholars suggest a benchmark of .90 (e.g., Schumacker & Lomax, 2010), but others suggest a stricter benchmark of .95 (e.g., Hu & Bentler, 1999). A model with a CFI of .93, for instance, would count as well fitting under the first benchmark but not under the second.
Given the subjectivity of evaluating fit based on benchmarks, it may seem like the chi-square test should be the most objective and useful metric. However, this is not the case. In fact, many consider the chi-square test the least useful metric for model fit. The chi-square test is not very useful because it is sensitive to sample size: for the same degree of misfit, the larger the sample, the greater the chance of obtaining a statistically significant chi-square. Because SEM generally requires large samples, the chi-square test is almost always significant, even at stricter cutoffs (e.g., .01 or .001). A test that is significant for nearly every model provides little useful information, so other measures of fit need to be considered.
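The sample size problem can be illustrated with a short sketch. It assumes the common maximum likelihood form of the test statistic, chi-square = (N - 1) x F, where F is the minimized discrepancy between the observed and model-implied covariance matrices (some programs multiply by N instead). The discrepancy of 0.05, the 10 degrees of freedom, and the function names are hypothetical. To keep the example free of external libraries, the p-value helper uses a closed form that is valid only for even degrees of freedom.

```python
import math


def p_value_even_df(chi_square: float, df: int) -> float:
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    if df < 2 or df % 2 != 0:
        raise ValueError("this shortcut only handles even df >= 2")
    h = chi_square / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= h / k          # builds h**k / k! incrementally
        total += term
    return math.exp(-h) * total


F_MIN = 0.05  # hypothetical minimized discrepancy, held constant across samples
DF = 10       # hypothetical model degrees of freedom


def fit_table(sample_sizes):
    """Return (N, chi-square, p) for each sample size, holding the misfit fixed."""
    rows = []
    for n in sample_sizes:
        chi_square = (n - 1) * F_MIN
        rows.append((n, chi_square, p_value_even_df(chi_square, DF)))
    return rows


rows = fit_table([100, 200, 500, 1000, 2000])
for n, chi_square, p in rows:
    print(f"N = {n:5d}  chi-square = {chi_square:7.2f}  p = {p:.6f}")
```

The amount of misfit is identical in every row; only N changes. Under these assumptions the test is comfortably nonsignificant at N = 100 and significant well beyond the .001 level at N = 2000.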
As a final note, it is worth mentioning that the chi-square statistic itself (along with its degrees of freedom) can be a useful measure of model fit; it is just the significance test that ends up being useless. Some scholars recommend using the chi-square divided by the degrees of freedom (χ2/df) as a measure of model fit, with values of 5 or less being a common benchmark.
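As a small worked example of the ratio, the sketch below divides a hypothetical chi-square of 85.4 by a hypothetical 40 degrees of freedom and compares the result with the benchmark of 5 mentioned above. The function name and the numbers are illustrative only.

```python
def normed_chi_square(chi_square: float, df: int) -> float:
    """Chi-square divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    return chi_square / df


CUTOFF = 5.0  # benchmark cited in the text; other sources may use different values

# Hypothetical SEM output: chi-square = 85.4 with 40 degrees of freedom.
ratio = normed_chi_square(85.4, 40)
verdict = "acceptable" if ratio <= CUTOFF else "poor"
print(f"chi-square/df = {ratio:.3f} ({verdict} by the cutoff of {CUTOFF:g})")
```

Here the ratio falls under the benchmark even though a chi-square of this size on 40 degrees of freedom would be statistically significant, which is why the two can lead to different conclusions about the same model.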
References
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Schumacker, R. E., & Lomax, R. G. (2010). A beginner’s guide to structural equation modeling (3rd ed.). New York, NY: Routledge Academic.