
Path Analysis

Overview. Path analysis was developed by Sewall Wright as a method for studying the direct and indirect effects of variables hypothesized as causes on variables treated as effects (Wright, 1921, 1934). “It is not a method for discovering causes, but a method applied to a causal model already formulated on the basis of knowledge and theoretical considerations.” (Pedhazur, 1982, p. 580). Wright stated:

    ...the method of path coefficients is not intended to accomplish the impossible task of deducing causal relations from the values of the correlation coefficients. It is intended to combine the quantitative information given by the correlations with such qualitative information as may be at hand on causal relations to give a quantitative interpretation (Wright, 1934, p. 193).

More than 40 years passed before path analysis was discovered as a tool for social science research (Klem, 2003). Blalock and Duncan, two sociologists, used this technique in their 1971 publication, Causal Models in the Social Sciences. Use of the technique increased during the 1970s following the development of computer programs to perform covariance analysis (Klem, 2003).

A path diagram and a corresponding path model describe a set of equations summarizing complex scientific ideas in terms of statistical relationships. The following sections discuss the assumptions that underlie the application of path analysis, introduce path diagrams, and show through examples how path coefficients are calculated and correlations are decomposed.

Assumptions. The method of path analysis allows for the simultaneous solution of several multiple linear regression equations. The assumptions of multiple linear regression therefore also apply to path analysis: the residuals (observed minus predicted values) are assumed to be normally distributed, and the predictors are assumed not to be redundant (no multicollinearity). Path analysis further assumes the following:

  • The relationships among the variables in the model are linear, additive, and causal. Curvilinear, multiplicative and interaction relations are excluded.
  • The disturbance term, or residual, associated with a specific variable is not correlated with the variables that precede it in the model.
  • There is a one-way causal flow in the model.
  • The variables are measured on an interval scale.
  • The variables are measured without error.
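The requirement that predictors not be redundant can be checked before fitting a path model. The following is a minimal sketch, assuming NumPy is available and using simulated data; the variance inflation factor (VIF) is one common diagnostic and is not prescribed by the sources cited here. The function name and the variables a, b, and c are hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are observations)."""
    R = np.corrcoef(X, rowvar=False)   # correlation matrix of the predictors
    return np.diag(np.linalg.inv(R))   # VIF_j is the j-th diagonal entry of R^-1

# Simulated predictors (illustrative only).
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)     # nearly a copy of a, hence redundant

vif_ok = variance_inflation_factors(np.column_stack([a, b]))
vif_bad = variance_inflation_factors(np.column_stack([a, b, c]))
print("independent predictors:", vif_ok)
print("with a redundant predictor:", vif_bad)
```

A VIF near 1 indicates that a predictor shares little variance with the others; very large values, as for a and c in the second call, signal redundancy.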

Path Diagrams. Path analysis is best explained through the use of path diagrams. Consider the following diagram:



In this diagram,
  • Variables 1 and 2 are exogenous variables. Exogenous variables are those variables whose causes are not explicitly represented in the model. Exogenous variables are causally prior to all dependent variables in the model. The correlation between these variables is r12.
  • Variables 3 and 4 are endogenous variables. The causes of endogenous variables are specified in the model.
  • The terms u3 and u4 are disturbances.
  • The one-way arrows represent the direct causal effects in the model, also known as the structural effects. In this model, variable 2 has a direct effect on variable 4 as well as an indirect effect through variable 3.
  • Path coefficients are represented by p12, p24, p23, and p34 in the model.
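The structure described in this list can also be written down as data. The sketch below is a hypothetical encoding, not part of the original presentation: each one-way arrow is stored as a (cause, effect) pair, and the arrow from variable 1 to variable 4 is inferred from the structural equation for variable 4 in the next section.

```python
# One-way arrows as (cause, effect) pairs; the curved two-headed arrow
# between the exogenous variables is kept separately.
paths = [(2, 3), (3, 4), (2, 4), (1, 4)]   # (1, 4) is inferred, see lead-in
correlated = [(1, 2)]
variables = {1, 2, 3, 4}

# A variable is endogenous if at least one arrow points at it.
endogenous = {effect for _, effect in paths}
exogenous = variables - endogenous

def routes(start, end, paths):
    """All directed routes from start to end (assumes one-way causal flow)."""
    if start == end:
        return [[end]]
    found = []
    for cause, effect in paths:
        if cause == start:
            found += [[start] + rest for rest in routes(effect, end, paths)]
    return found

routes_2_to_4 = routes(2, 4, paths)
print("exogenous:", sorted(exogenous))
print("endogenous:", sorted(endogenous))
print("routes from 2 to 4:", routes_2_to_4)
```

The route of length two is the direct effect of variable 2 on variable 4; the longer route through variable 3 is the indirect effect.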

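The Calculation of Path Coefficients. In this model, the equations are as follows, where the numerals 1 through 4 stand for the variables and e1 through e4 are the error terms (e3 and e4 correspond to the disturbances u3 and u4):
    1 = e1
    2 = r12*1 + e2
    3 = p23*2 + e3
    4 = p34*3 + p24*2 + p12*1 + e4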
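Because each endogenous variable has its own equation, the path coefficients can be estimated as standardized regression coefficients, one regression per equation. The following is a minimal sketch under stated assumptions: NumPy is available, the data are simulated, and the population values of r12, p23, p34, p24, and p12 are hypothetical numbers chosen for illustration. The name p12 follows the equations above, where it is the coefficient on variable 1 in the equation for variable 4.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000

# Hypothetical population values (illustrative only, not from the text).
r12, p23, p34, p24, p12 = 0.3, 0.5, 0.4, 0.3, 0.2

# Exogenous variables 1 and 2: unit variance, correlation r12.
x1 = rng.normal(size=n)
x2 = r12 * x1 + np.sqrt(1 - r12**2) * rng.normal(size=n)

# Endogenous variable 3 = p23*2 + e3, scaled to unit variance.
x3 = p23 * x2 + np.sqrt(1 - p23**2) * rng.normal(size=n)

# Endogenous variable 4 = p34*3 + p24*2 + p12*1 + e4, also near unit variance.
signal = p34 * x3 + p24 * x2 + p12 * x1
x4 = signal + np.sqrt(1 - signal.var()) * rng.normal(size=n)

def standardize(v):
    """Center to mean 0 and scale to standard deviation 1."""
    return (v - v.mean()) / v.std()

def path_coefficients(y, *predictors):
    """Least-squares slopes of standardized y on standardized predictors."""
    Z = np.column_stack([standardize(p) for p in predictors])
    coef, *_ = np.linalg.lstsq(Z, standardize(y), rcond=None)
    return coef

# One regression per structural equation.
(est_p23,) = path_coefficients(x3, x2)
est_p12, est_p24, est_p34 = path_coefficients(x4, x1, x2, x3)
print("p23:", est_p23)
print("p12, p24, p34:", est_p12, est_p24, est_p34)
```

With a sample this large, the estimates should fall close to the hypothetical population values. No intercept is needed because every variable is standardized.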

The Decomposition of Correlations. Within a causal model it is possible to decompose the correlation between an exogenous and an endogenous variable, or between two endogenous variables, into different components. A correlation coefficient may be decomposed into the following: (1) Direct Effect (DE); (2) Indirect Effects (IE); (3) Unanalyzed (U) due to correlated causes; and (4) Spurious (S) due to common causes. For instance, in the correlation r24, p24 is the direct effect of 2 on 4, p23*p34 is the indirect effect of 2 on 4 through 3, and r12*p12 = r24 - p24 - p23*p34 is the unanalyzed component because it is due to correlated causes. The sum of DE and IE is the total effect, or the effect coefficient (Pedhazur, 1982).
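A numerical illustration may help. Under the structural equations given earlier, multiplying the equation for variable 4 by variable 2 and taking expectations gives r24 = p24 + p23*p34 + r12*p12. The sketch below, assuming NumPy and hypothetical coefficient values, computes each component and compares their sum with the correlation observed in simulated data.

```python
import numpy as np

# Hypothetical coefficients (illustrative only); p12 is the coefficient
# on variable 1 in the equation for variable 4, as in the text.
r12, p23, p34, p24, p12 = 0.3, 0.5, 0.4, 0.3, 0.2

direct = p24                 # DE: 2 -> 4
indirect = p23 * p34         # IE: 2 -> 3 -> 4
unanalyzed = r12 * p12       # U: via the correlation of 2 with 1
total_effect = direct + indirect
implied_r24 = total_effect + unanalyzed

# Simulate standardized data from the same model to check the sum.
rng = np.random.default_rng(7)
n = 50_000
x1 = rng.normal(size=n)
x2 = r12 * x1 + np.sqrt(1 - r12**2) * rng.normal(size=n)
x3 = p23 * x2 + np.sqrt(1 - p23**2) * rng.normal(size=n)
signal = p34 * x3 + p24 * x2 + p12 * x1
x4 = signal + np.sqrt(1 - signal.var()) * rng.normal(size=n)

sample_r24 = np.corrcoef(x2, x4)[0, 1]
print("DE, IE, U:", direct, indirect, unanalyzed)
print("implied r24:", implied_r24, "sample r24:", sample_r24)
```

The sample correlation should agree with the sum of the three components up to sampling error. The correlation r24 has no spurious component in this model because variables 2 and 4 share no common cause.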

References

Klem, L. (2003). Path analysis. In L. G. Grimm and P. R. Yarnold (Eds.), Reading and understanding multivariate statistics (pp. 65-96). Washington, DC: The American Psychological Association.

Pedhazur, E.J. (1982). Multiple regression in behavioral research: Explanation and prediction. (2nd edition). Fort Worth, TX: Holt, Rinehart and Winston, Inc.

Wright, S. (1921). Correlation and causation. Journal of Agricultural Research, 20, 557-585.

Wright, S. (1934). The method of path coefficients. Annals of Mathematical Statistics, 5, 161-215.

Wright, S. (1960a). Path coefficients and path regressions: Alternative or complementary concepts? Biometrics, 16, 189-202.

Wright, S. (1960b). The treatment of reciprocal interaction, with and without lag, in path analysis. Biometrics, 16, 423-445.