21 August 2007

Designing social inquiry 3 – Causality and Causal Inference

Now that we have discussed summarizing historical detail and making descriptive inferences by dividing our data into systematic and nonsystematic components, we move on to causal inference, because the facts we structured in the last chapter do not speak for themselves – that is, if our goal is explanation rather than description. This does not mean that we always have to explain: ‘good description of important events is better than bad explanation of anything’ (p. 75).

The most important lesson we will draw from statistics in this chapter is that we can be bold in drawing causal inferences, as long as we provide an honest estimate of the uncertainty of that inference.

First of all, we will define causality as a theoretical concept. Remember that we talked about data in terms of units, the elements to be observed in a study; dependent or outcome variables, whose values change in an inference as a result of manipulating the explanatory or independent variables; and the explanatory variables themselves, which can be divided into key causal variables and control variables. ‘The key causal variable always takes on two or more values, … often denoted by “treatment group” and “control group”’ (p. 77). When determining the causal effect, we manipulate the key causal variable counterfactually, asking: what if things had not been like this?

An example. Imagine we want to know the influence of a certain tax policy (key causal variable) on the buying behavior of consumers in the Netherlands (dependent variable). How can we measure this influence? Since we cannot observe the same space-time niche both with and without the tax policy, we would have to keep all (control) variables the same while manipulating, in a hypothetical replication, the key causal (treatment) variable, that is, the tax policy. Of course, we do not control for aspects of consumer behavior, since that is what we want to explain. The difference between the buying behavior of consumers with and without the tax policy is called the realized causal effect, even though it is of course a hypothetical construction.
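The counterfactual logic of this example can be sketched in a few lines of code. All numbers are invented for illustration; the point is only that the causal effect is defined as the difference between two counterfactual outcomes for the same unit.

```python
# Hypothetical potential-outcomes sketch of the tax-policy example.
# The two values below are the counterfactual outcomes (a consumer-spending
# index) for the same unit, the Netherlands -- all numbers are invented.
y_with_policy = 92.0      # spending if the tax policy is introduced
y_without_policy = 100.0  # spending if it is not

# The realized causal effect is simply the difference between the two
# counterfactual outcomes -- computable here only because this is a
# hypothetical construction, never in actual observation.
realized_causal_effect = y_with_policy - y_without_policy
print(realized_causal_effect)  # -8.0
```

In reality we observe at most one of the two values, which is exactly the fundamental problem discussed next.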

Here we stumble upon what is known as the fundamental problem of causal inference: in the current example, we can observe either the situation in which the tax policy is introduced or the situation in which it is not. We can never observe both, so we can never state a causal effect with certainty.

But remember that not everything in social reality is determined by systematic components. At the moment the tax policy is introduced, a war might break out, the euro might fall against the dollar, or something else might occur. In fact, such things do occur, and at the moment of introduction, other, nonsystematic events influence the policy’s effect. If we could rerun the introduction of the policy, we can be certain that the effect would differ slightly due to these nonsystematic components. To capture this variability, we hypothetically replicate the random variables (the behavior of consumers after introducing the policy and their behavior had the policy not been introduced) and take the difference between the two: this is the random causal effect.

The difference between the realized causal effect and the random causal effect lies in the number of hypothetical replications: for the realized causal effect we conduct just one unobserved replication, and for the random causal effect we conduct many. Thus, ‘the causal effect is the difference between the systematic component of observations made when the explanatory variable takes one value and the systematic component of comparable observations when the explanatory variable takes on another value’ (pp. 81-82).

In order to distinguish between systematic and nonsystematic effects of the policy, we run the ‘experiment’ of introducing the policy many times. ‘The mean causal effect is then the average of the realized causal effects across replications of these experiments. Taking the average in this way causes the nonsystematic features of this problem to cancel out and leaves the mean causal effect to include only systematic features’ (p. 84).
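The averaging argument above can be made concrete with a small simulation. The systematic effect and the noise scale below are invented numbers; the sketch only shows that averaging realized effects across many hypothetical replications cancels the nonsystematic component.

```python
import random

random.seed(42)

# Invented systematic effect of the policy: -8 spending points.
SYSTEMATIC_EFFECT = -8.0

def one_replication():
    """One hypothetical rerun of the 'experiment': the realized effect is the
    systematic effect plus a nonsystematic shock (a war, a currency swing, ...),
    modelled here as zero-mean Gaussian noise."""
    return SYSTEMATIC_EFFECT + random.gauss(0, 5)

# Averaging the realized causal effects across many replications cancels the
# nonsystematic features, leaving (approximately) the mean causal effect.
effects = [one_replication() for _ in range(100_000)]
mean_causal_effect = sum(effects) / len(effects)
print(round(mean_causal_effect, 1))  # close to -8.0
```

Any single replication can be far from -8; only the average across replications isolates the systematic component.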

Two assumptions might enable us to get around the fundamental problem. Both are untestable and must therefore be stated explicitly in any research.

  1. Unit Homogeneity. We know it to be impossible to observe Dutch consumer behavior twice, once with the tax policy introduced and once without. A commonly used solution (think of comparative case studies) is to rerun the experiment in two different units that are homogeneous, which they are when ‘the expected values of the dependent variables from each unit are the same when our explanatory variable takes on a particular value. […] For a data set with n observations, unit homogeneity is the assumption that all units with the same value of the explanatory variables have the same expected value of the dependent variable’ (p. 91). The observations analyzed become identical in relevant respects for the purpose of analysis. A weaker version of this principle is the constant effect assumption, which assumes that the causal effect is constant across units even when the units do not take on the same values: consumer behavior could differ between comparable countries, but introducing the tax policy would change it in each by a similar amount.
  2. Conditional Independence is the assumption that ‘values are assigned to explanatory variables independently of the values taken by the dependent variables’ (p. 94). Assigning values is possible in large-n experimental work, where some subjects are randomly assigned to the treatment group (the explanatory variable is manipulated) while others are assigned to the control group (the explanatory variable is left unmanipulated). The essential function of this is to guarantee that the values of the explanatory variables are not caused by the dependent variables; in this case, the unit homogeneity assumption is not required. ‘When random selection and assignment are infeasible […] we have to resort to some version of the unit homogeneity assumption in order to make valid causal inferences’ (p. 95).
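The role of random assignment in the second assumption can also be sketched in a simulation. The true effect and baseline distribution are invented; the point is that randomization makes treatment status independent of each unit’s unobserved baseline, so the difference in group means recovers the causal effect without requiring unit homogeneity.

```python
import random

random.seed(0)

# Invented true treatment effect and heterogeneous unobserved baselines.
TRUE_EFFECT = -8.0
baselines = [random.gauss(100, 10) for _ in range(10_000)]

treated, control = [], []
for baseline in baselines:
    if random.random() < 0.5:   # random assignment to the treatment group
        treated.append(baseline + TRUE_EFFECT)
    else:                       # control group: variable left unmanipulated
        control.append(baseline)

# Because assignment is independent of the baselines, the two groups are
# comparable on average and the difference in means estimates the effect.
estimate = sum(treated) / len(treated) - sum(control) / len(control)
print(round(estimate, 1))  # close to -8.0
```

Had units selected their own treatment status based on their outcomes, the same difference in means would be biased; that is exactly what conditional independence rules out.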

We will now discuss some rules for constructing causal theories, without recourse to statistical terminology. Any theory includes a set of interrelated causal hypotheses, each of which specifies a relationship between variables that creates observable implications: if the explanatory variable takes on a certain value, the dependent variable takes on a predicted value. If the overall theory is not internally consistent, the hypotheses it generates can contradict one another.

  1. Construct falsifiable theories. Theories should be designed so that it is easy to show that they are wrong. Theories are tentative, and we should ask ourselves what evidence would falsify them. The issue is not whether a theory is false, but rather how much it can help us explain about the world. Any test of a theory is really a test of one of its hypotheses; even a positive result would (according to Popper) merely corroborate the theory, while a single falsification would show that the theory is wrong and should be dropped. The authors argue instead that if a theory is shown to be correct by a thousand tests but false by one other, we should not drop it; we should reformulate it more restrictively and evaluate it with a new data set. The falsification thus accounted for helps us define the borders of applicability of our theory – in terms of particular conditions or settings.[1] But adapting a theory in this way often reduces its leverage; with too many exceptions, a theory should be rejected. Parsimony matters, but leverage comes first: if a slightly more complex theory explains more, we should of course accept it.
  2. Build theories that are internally consistent. One way of devising internally consistent theories is formal modeling, which consists of representing verbally posed theories mathematically. However, formal models only help us see whether a theory is consistent; they abstract so far from actual conditions that they do not generate specific predictions about the real world. They may help generate hypotheses, which would then have to be tested. Formal models work to clarify our thinking and to develop internally consistent theories – that’s all.
  3. Select dependent variables carefully. First of all, dependent variables should be dependent: do not take a variable to be dependent when it actually causes changes in our explanatory variables. Second, do not select observations based on the dependent variable so that it stays constant. That does not mean that if the dependent variable turns out to be constant and the causal effect is zero, we did something wrong: we simply found a causal effect of zero. Finally, choose a dependent variable that represents the variation we wish to explain – ‘we need the entire range of variation in the dependent variable to be a possible outcome of the experiment in order to obtain an unbiased estimate of the impact of the explanatory variables’ (p. 109). If not, we have produced selection bias (chapter 4).
  4. Maximize concreteness. Abstract concepts such as utility, culture, and intentions are often used in social science and can play a useful role in theory formation, but they hinder theory evaluation. When they are used in explanations, the risk is high that the explanation is tautological or has no concretely observable implications. The discrepancy between the formulated theory and its observable implications can become so large that the theory is too weak to evaluate, so the choice of a high level of abstraction must have a real justification in terms of the theoretical problem at hand. Even when we are conducting purely theoretical research, we should always ask what observable implications or research projects could be derived from our theory.
  5. State theories in as encompassing ways as feasible. This might sound contradictory, but if we formulate a theory to correspond too precisely to reality, we are merely describing history. We need not provide evidence for all implications of our theory in order to state it, as long as we provide a reasonable estimate of the uncertainty that comes with it – especially the systematic features of the theory should be demarcated explicitly. The broader the theory, the greater the leverage.
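Rule 3’s warning about selecting on the dependent variable can be illustrated with a simulation. The data-generating process below is invented (a true slope of 2 with Gaussian noise); the sketch shows how keeping only observations with high outcomes attenuates the estimated effect.

```python
import random

random.seed(1)

# Invented true model: y = 2*x + noise.
data = []
for _ in range(5_000):
    x = random.uniform(0, 10)
    data.append((x, 2 * x + random.gauss(0, 2)))

def slope(pairs):
    """Ordinary-least-squares slope of y on x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var = sum((x - mx) ** 2 for x, _ in pairs)
    return cov / var

full_slope = slope(data)                         # close to the true value, 2
truncated = [(x, y) for x, y in data if y > 10]  # keep only high outcomes
truncated_slope = slope(truncated)               # biased toward zero
print(round(full_slope, 2), round(truncated_slope, 2))
```

Restricting the sample to part of the dependent variable’s range is exactly the selection bias the quoted passage warns against: the truncated estimate understates the impact of the explanatory variable.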

[1] A refutation should always encourage us to ask questions like: what variable is missing from our analysis that could produce a more generally applicable theory? But be careful: we should not throw a theory overboard at once, but neither should we extend or adapt it in such a way that it becomes implausible because of all the added exceptions and special cases.
