Path analysis is an extension of the regression model. In a path analysis, two or more causal models are compared against the correlation matrix. Each model is drawn as a path diagram, in which squares represent the variables and arrows show the hypothesized direction of causation. The regression weights predicted by the model are then compared with the observed correlation matrix, and a goodness of fit statistic is calculated to see how well the model fits.
Key concepts and terms:
Estimation method: Simple OLS and maximum likelihood methods are used to estimate the path coefficients.
Path model: A diagram which shows the independent, intermediate, and dependent variables. A single-headed arrow points from a cause to its effect. A double-headed arrow shows the covariance between two variables.
Exogenous and endogenous variables: Exogenous variables are those with no arrows pointing towards them, except the measurement error term. If exogenous variables are correlated with each other, a double-headed arrow connects them. Endogenous variables have at least one incoming arrow and may have both incoming and outgoing arrows.
Path coefficient: A standardized regression coefficient (beta), showing the direct effect of an independent variable on a dependent variable in the path model.
Disturbance terms: The residual error terms are also called disturbance terms. Disturbance terms reflect the unexplained variance and measurement error.
Direct and indirect effect: The path model has two types of effects, direct and indirect. When an exogenous variable has an arrow pointing straight at the dependent variable, the effect is direct. When an exogenous variable affects the dependent variable through another, intervening variable, the effect is indirect, and its size is the product of the path coefficients along that route. The total effect of the exogenous variable is the sum of its direct and indirect effects. A variable may have no direct effect and still have an indirect effect.
Significance and goodness of fit: OLS and maximum likelihood methods are used to estimate the path coefficients. Statistical software such as AMOS, Mplus, SAS, and LISREL calculates the path coefficients and goodness of fit statistics automatically.
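The definitions of path coefficient, direct effect, and indirect effect above can be illustrated with a minimal sketch. It assumes a hypothetical three-variable model (X causes M, and both X and M cause Y), uses simulated data, and estimates each path by OLS on standardized variables with NumPy. The variable names, the simulated effect sizes, and the helper functions `standardize` and `ols_betas` are illustrative assumptions, not part of any particular package.

```python
import numpy as np

def standardize(a):
    """Rescale each column to mean 0 and standard deviation 1."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

def ols_betas(y, predictors):
    """OLS slopes of y on predictors (no intercept needed for standardized data)."""
    coef, *_ = np.linalg.lstsq(predictors, y, rcond=None)
    return coef

# Simulated, hypothetical data for the model X -> M -> Y plus X -> Y.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.3 * x + 0.4 * m + rng.normal(size=n)

z = standardize(np.column_stack([x, m, y]))
zx, zm, zy = z[:, 0], z[:, 1], z[:, 2]

# One regression per endogenous variable gives the path coefficients (betas).
a = ols_betas(zm, zx[:, None])[0]                # path X -> M
c, b = ols_betas(zy, np.column_stack([zx, zm]))  # paths X -> Y and M -> Y

direct = c            # effect of X on Y along the single arrow X -> Y
indirect = a * b      # effect of X on Y routed through M
total = direct + indirect

print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

In this particular model X is the only exogenous variable, so the total effect reproduces the observed correlation between X and Y. That is a convenient sanity check, but it does not hold in general once a model leaves out paths or includes correlated causes.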
The following statistics are used to test the significance and goodness of fit:
Chi-square statistic: A non-significant chi-square value in path analysis indicates a well-fitting model. The chi-square statistic is sometimes significant, particularly in large samples. Whether or not it is significant, we should still examine at least one absolute fit index and one incremental fit index.
Absolute fit index: RMSEA is an absolute fit index. The RMSEA, reported with its 90% confidence interval, should be less than 0.08 for a well-fitting model.
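Incremental fit index: CFI, NNFI (also called TLI), and RFI are incremental fit indexes, which should be greater than 0.90 for a well-fitting model. GFI and AGFI are often listed alongside them and judged against the same 0.90 cutoff, although strictly speaking they are absolute fit indexes.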
Modification indexes: Modification indexes (MI) may be used to decide which arrows to add to the model. The larger the MI for a path, the more the model fit will improve if that arrow is added.
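As a rough illustration of the fit indexes listed above, the following sketch computes the RMSEA point estimate and the CFI from chi-square values using their commonly cited formulas. The function names and the example numbers are hypothetical. Some programs divide by N rather than N - 1 in the RMSEA, and the 90% confidence interval (which needs the noncentral chi-square distribution) is not computed here.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA point estimate from the model chi-square, its df, and sample size n."""
    if df <= 0:
        raise ValueError("RMSEA is undefined for a model with no degrees of freedom")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_base, df_base):
    """CFI comparing the tested model with the baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    if d_base == 0.0:
        return 1.0
    return 1.0 - d_model / d_base

# Hypothetical results: model chi-square 12.0 on 5 df, baseline 250.0 on 10 df, N = 300.
model_rmsea = rmsea(12.0, 5, 300)
model_cfi = cfi(12.0, 5, 250.0, 10)

print(f"RMSEA = {model_rmsea:.3f} (cutoff used here: below 0.08)")
print(f"CFI   = {model_cfi:.3f} (cutoff used here: above 0.90)")
```

With these made-up values both indexes fall on the acceptable side of the cutoffs given above, even though a chi-square of 12.0 on 5 degrees of freedom would be significant at the 0.05 level.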
Assumptions:
Linearity: Relationships should be linear.
Interval level data: Data should be at the interval or ratio level of measurement; dichotomous nominal variables are also acceptable.
Uncorrelated residual term: Error terms should not be correlated with any variable.
Disturbance terms: The disturbance term of an endogenous variable should not be correlated with the variables that predict it, in particular the exogenous variables.
Multicollinearity: Low multicollinearity is assumed. Perfect multicollinearity may cause problems in the path analysis.
Identification: The path model should not be under-identified. Exactly identified and over-identified models are acceptable.
Adequate sample size: Kline (1998) recommends that the sample size should be 10 times (or ideally 20 times) as many cases as parameters, and at least 200.
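The identification and sample size assumptions above lend themselves to a quick arithmetic check. The sketch below applies the counting rule, which compares the number of free parameters with the number of distinct variances and covariances among the observed variables, together with the sample size rule of thumb quoted above. The function names are hypothetical, and the counting rule is a necessary condition only, so passing it does not guarantee that a model is identified.

```python
def identification_status(n_observed, n_free_params):
    """Classify a model by the counting rule; returns (label, degrees of freedom)."""
    # Distinct variances and covariances among the observed variables.
    moments = n_observed * (n_observed + 1) // 2
    df = moments - n_free_params
    if df < 0:
        return "under-identified", df
    if df == 0:
        return "exactly identified", df
    return "over-identified", df

def recommended_sample_size(n_free_params, cases_per_param=10, floor=200):
    """Rule-of-thumb minimum: cases_per_param cases per parameter, at least floor."""
    return max(cases_per_param * n_free_params, floor)

# Hypothetical model: 4 observed variables and 8 free parameters.
status, df = identification_status(4, 8)
print(status, "with", df, "degrees of freedom")
print("minimum cases (10 per parameter):", recommended_sample_size(8))
print("minimum cases (20 per parameter):", recommended_sample_size(8, cases_per_param=20))
```

For a small model like this one the floor of 200 cases dominates both versions of the rule. The per-parameter multiplier only starts to matter once the model has more than 20 free parameters at 10 cases each, or more than 10 at 20 cases each.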
Alwin, D. F., & Hauser, R. M. (1975). The decomposition of effects in path analysis. American Sociological Review, 40(1), 37-47.
Coffman, D. L., & MacCallum, R. C. (2005). Using parcels to convert path analysis models into latent variable models. Multivariate Behavioral Research, 40(2), 235-259.
Edwards, J. R., & Lambert, L. S. (2007). Methods for integrating moderation and mediation: A general analytical framework using moderated path analysis. Psychological Methods, 12(1), 1-22.
Everitt, B. S., & Dunn, G. (1991). Applied multivariate data analysis. London: Edward Arnold.
Heise, D. R. (1975). Causal analysis. New York: John Wiley & Sons.
Kano, Y. (2002). Does structural equation modeling outperform traditional factor analysis, analysis of variance and path analysis? Japanese Journal of Behaviormetrics, 29(2), 138-159.
Kline, R. B. (1991). Latent variable path analysis in clinical research: A beginner’s tour guide. Journal of Clinical Psychology, 47(4), 471-484.
Loehlin, J. C. (1986). Latent variable models: An introduction to factor, path and structural analysis. Hillsdale, NJ: Lawrence Erlbaum Associates.
McDonald, R. P. (1996). Path analysis with composite variables. Multivariate Behavioral Research, 31(2), 239-270.
Pedhazur, E. J. (1982). Multiple regression in behavioral research: Explanation and prediction (2nd ed.). New York: Holt, Rinehart, & Winston.
Roehrig, S. F. (1996). Probabilistic inference and path analysis. Decision Support Systems, 16(1), 55-66.
Schumacker, R. E., & Lomax, R. G. (2004). A beginner’s guide to structural equation modeling (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Stage, F. K., Carter, H. C., & Nora, A. (2004). Path analysis: An introduction and analysis of a decade of research. Journal of Educational Research, 98(1), 5-12.