A key goal of research is identifying causal relationships: showing how an independent variable (the cause) affects a dependent variable (the effect). The three criteria for establishing cause and effect (association, time ordering, and non-spuriousness) are familiar to most researchers from courses in research methods or statistics. While classic examples suggest that establishing cause and effect is simple, it is often one of the most challenging aspects of real-world research design.
The first step in establishing causality is demonstrating association: is there a relationship between the independent and dependent variables? If both variables are numeric, you can examine their correlation to determine if a relationship exists. A common example is the relationship between education and income: individuals with more education typically earn higher incomes.
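As a rough illustration of checking association between two numeric variables, the sketch below computes a Pearson correlation coefficient by hand. The education and income figures are invented purely for demonstration, and the function name `pearson_r` is a hypothetical helper, not part of any particular statistics package.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: years of education and annual income (in thousands)
education_years = [10, 12, 12, 14, 16, 16, 18, 20]
income_thousands = [28, 35, 31, 42, 55, 49, 68, 75]

r = pearson_r(education_years, income_thousands)
print(f"r = {r:.2f}")
```

A coefficient near +1 or -1 indicates a strong linear association and a value near 0 indicates little or none. On its own, the coefficient says nothing about time ordering or spuriousness.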
For categorical variables, one can use cross tabulation, which cross-classifies the distributions of two variables, to examine their association. For example, suppose that 60% of Protestants in a survey support the death penalty, while only 35% of Catholics do. This would establish an association between denomination and attitudes toward capital punishment. Researchers debate how strong an association must be to support a causal claim. Generally, researchers focus more on statistical significance than on the strength of the association.
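A cross tabulation of this kind might be sketched as follows. The counts are hypothetical, chosen only to reproduce the 60% and 35% figures above with 200 respondents per group. The chi-square statistic is one common way to assess statistical significance for such a table; 3.841 is the standard critical value for one degree of freedom at the 0.05 level.

```python
from collections import Counter

# Hypothetical survey responses: (denomination, attitude toward death penalty)
responses = (
    [("Protestant", "support")] * 120 + [("Protestant", "oppose")] * 80
    + [("Catholic", "support")] * 70 + [("Catholic", "oppose")] * 130
)

counts = Counter(responses)
rows = ["Protestant", "Catholic"]
cols = ["support", "oppose"]

# Marginal totals for each row, each column, and the whole table
row_totals = {r: sum(counts[(r, c)] for c in cols) for r in rows}
col_totals = {c: sum(counts[(r, c)] for r in rows) for c in cols}
n = sum(row_totals.values())

# Row percentages: share of each denomination holding each attitude
row_pct = {
    r: {c: 100 * counts[(r, c)] / row_totals[r] for c in cols} for r in rows
}

# Chi-square statistic: squared gap between observed and expected counts,
# scaled by the expected count, summed over all cells
chi_square = 0.0
for r in rows:
    for c in cols:
        expected = row_totals[r] * col_totals[c] / n
        chi_square += (counts[(r, c)] - expected) ** 2 / expected

for r in rows:
    print(r, {c: f"{row_pct[r][c]:.0f}%" for c in cols})
print(f"chi-square = {chi_square:.2f} (critical value 3.841, df = 1)")
```

With these invented counts the statistic is well above the critical value, so the association would be judged statistically significant. That still says nothing about how strong the association is or whether it is causal.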
Once you establish an association, you can focus on determining the time order of the variables of interest. For the independent variable to cause the dependent variable, it must occur first in time. In short, the cause must precede the effect. Time ordering is easy to ensure in an experimental design, where the researcher controls exposure to the treatment (the independent variable) and then measures the outcome of interest (the dependent variable).
In cross-sectional designs, time ordering is harder to establish, especially if the relationship between variables could go both ways. For example, while education usually precedes income, individuals with higher incomes may have the money to return to school. When a controlled experimental design is not possible, determining time ordering may require logic, existing research, and common sense. Researchers must carefully specify the hypothesized direction of the relationship between variables and provide evidence, whether theoretical or empirical, to support their claim.
The third criterion for causality is also the most troublesome, as it requires that alternative explanations for the observed relationship between two variables be ruled out. This is termed non-spuriousness, which simply means “not false.” A spurious or false relationship exists when what appears to be an association between the two variables is actually caused by a third extraneous variable. Classic examples of spuriousness include the relationship between children’s shoe sizes and their academic knowledge: as shoe size increases so does knowledge, but of course both are also strongly related to age.
Another well-known example is the relationship between the number of firefighters who respond to a fire and the amount of damage that results. Clearly, the size of the fire determines both, so it is inaccurate to say that more firefighters cause greater damage. Though these examples seem straightforward, researchers in psychology, education, and the social sciences often face much greater challenges in ruling out spurious relationships, simply because so many other factors might influence the relationship between two variables. Appropriate study design (using experimental procedures whenever possible), careful data collection, use of statistical controls, and triangulation of many data sources are all essential when seeking to establish non-spurious relationships between variables.
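One simple form of statistical control is the first-order partial correlation, which measures the association between two variables after removing the linear influence of a third. The sketch below applies it to the shoe size example. The data are invented and deliberately constructed so that the association disappears completely once age is controlled; real data would rarely be this clean. The helper names `pearson_r` and `partial_r` are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def partial_r(xs, ys, zs):
    """Correlation between xs and ys, controlling for zs."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data for eight children
ages = [6, 7, 8, 9, 10, 11, 12, 13]
shoe_sizes = [30.5, 30.5, 31.5, 33.5, 34.5, 34.5, 35.5, 37.5]
knowledge_scores = [64, 66, 84, 86, 96, 114, 116, 134]

r_raw = pearson_r(shoe_sizes, knowledge_scores)
r_partial = partial_r(shoe_sizes, knowledge_scores, ages)

print(f"shoe size vs. knowledge:          r = {r_raw:.2f}")
print(f"same, controlling for age: partial r = {r_partial:.2f}")
```

In this constructed case the raw correlation is strong, while the partial correlation is essentially zero, which is the pattern expected of a spurious relationship. A partial correlation only adjusts for the variables the researcher thought to measure, so it complements careful study design rather than replacing it.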