22 August 2007

Designing social inquiry 2 - Descriptive Inference

It is possible both to say something about particular events, using general principles as arguments, and to say something about a general principle, using particular events as arguments. Interpretation is a style of research that does not differ essentially from inference, since the method should be sound in both cases. Because social reality is inherently complex, we have to abstract its key features from a mass of facts. This leads to one of the most difficult tasks: simplification. We always simplify: even the most extensive description is infinitely less complex than the reality it describes. In sum, we have to say something about specific events as well as about general classes; the best way to understand a particular event is to use inference to say something about the systematic patterns in similar, parallel events as well.

When developing a theory, we should list as many observable implications as possible. We might not be able to collect data for all of these implications. In selecting data, therefore, our possible observations ‘are either implications of our theory or irrelevant’ (p. 48). Data need not be symmetric: we can have a single case study alongside a large data collection on voting behavior. It even adds to the reliability of our theory if it holds in completely different situations.

If we make a model in qualitative research, we need to abstract the correct, essential features of reality. A restrictive model is abstract, clear and parsimonious, but less realistic; an unrestrictive model is detailed and contextual, but also less clear. We will now formulate a formal model of data collection to introduce some terms used throughout this text. A data collection can consist of anything from an experiment to an interview, but one always has to report how the data were created. The model uses the following terms:

  • Realized variables are observed values, labeled y, which vary over the units; random variables, labeled Y, vary across hypothetical replications of the same observation in which the unsystematic components are allowed to vary.
  • Units (n = number of units) are the individual entities over which we analyze changing variables.
  • Observations are the values of the variables for each unit, and can be numerical, verbal, visual, etc.
  • A case is a ‘phenomenon for which we report and interpret only a single measure on any pertinent variable’ (p. 52).

A statistic is nothing more than an expression of data in abbreviated form. One example of a statistic is the sample mean, or average. The logic of statistical methodology can help us summarize historical detail. We should always focus our summary on the outcome that we wish to explain or describe, and a summary must always simplify the information we have.
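
As a minimal sketch of what 'abbreviated form' means, the snippet below reduces a handful of observations to one number, the sample mean. The turnout figures and the names `turnout` and `sample_mean` are invented for illustration and do not come from the book.

```python
# Hypothetical data: one realized value y per unit,
# here turnout (in percent) for five imaginary districts.
turnout = [61.0, 58.5, 64.0, 59.5, 62.0]


def sample_mean(values):
    """Summarize a list of observations by their arithmetic average."""
    return sum(values) / len(values)


summary = sample_mean(turnout)
print(f"n = {len(turnout)} units, sample mean = {summary:.1f}")
```

The five observations are replaced by a single statistic; the detail of each district is lost, which is exactly the simplification a summary is meant to provide.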

Descriptive inference is, as indicated above, the process of understanding an unobserved phenomenon on the basis of a set of observations. A fundamental goal of inference is to distinguish between the nonsystematic (idiosyncratic) and the systematic (recurring in parallel cases) components of the phenomenon we study: to what degree do our observations reflect typical, recurring phenomena rather than outliers?

The goal of inference is to learn about the systematic features of the random variables in any case. There are two views of random variation. The probabilistic view assumes that random, nonsystematic variation can never be eliminated because the world is simply too complex. The deterministic view assumes that random variation can be eliminated by designing a better analysis. In either case, only repeated tests in different contexts enable us to define a factor as systematic or nonsystematic. We can predict the effect of systematic factors, but not that of nonsystematic factors. This does not mean, however, that systematic factors are constants. The same real-world event can represent either a systematic or a nonsystematic factor: if bad weather always leads to fewer votes, its effect is systematic.
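
The idea of hypothetical replications can be sketched with a small simulation. This is only an illustration under invented assumptions: the baseline turnout, the size of the rain effect, the noise level and the function name `replicate_turnout` are all hypothetical. Each replication adds a fresh nonsystematic draw to the same systematic components.

```python
import random

BASELINE = 60.0      # assumed systematic turnout in dry weather (percent)
RAIN_EFFECT = 4.0    # assumed systematic drop in turnout when it rains
NOISE_SD = 3.0       # assumed spread of the nonsystematic factors


def replicate_turnout(rain, rng):
    """One hypothetical replication: systematic part plus random noise."""
    systematic = BASELINE - (RAIN_EFFECT if rain else 0.0)
    nonsystematic = rng.gauss(0.0, NOISE_SD)
    return systematic + nonsystematic


rng = random.Random(0)
dry = [replicate_turnout(False, rng) for _ in range(10_000)]
wet = [replicate_turnout(True, rng) for _ in range(10_000)]

dry_avg = sum(dry) / len(dry)
wet_avg = sum(wet) / len(wet)

# Single replications differ from one another, but across many
# replications the noise averages out and the systematic part remains.
print(f"average dry-day turnout: {dry_avg:.2f}")
print(f"average wet-day turnout: {wet_avg:.2f}")
print(f"estimated rain effect:   {dry_avg - wet_avg:.2f}")
```

No single replication reveals the rain effect, because the noise can be as large as the effect itself; only the pattern across replications does.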

The following three criteria are commonly used in statistics to judge estimates.

  1. Unbiasedness occurs ‘when the variation from one replication of a measure to the next is nonsystematic and moves the estimate sometimes one way, sometimes the other’ (p. 63). A biased estimate is due to a systematic error that consistently shifts the estimate more in one direction than the other. A theory can be biased too: bias does not exist in the data set alone. A major source of bias in social science arises from the interests of the sources of raw information. Be careful with data from ‘biased’ functionaries such as government officials.
  2. Efficiency provides a way to distinguish among unbiased estimators. Imagine you have two case studies, one unbiased but consisting of very few observations, and another which is slightly biased but consists of a large number (high n) of observations. If you calculate the variance of each estimator across hypothetical replications, the principle applies that the smaller the variance, the closer the estimate will tend to be to the true parameter value, and the more efficient the estimator. The large case study may well have the smaller variance, which makes it more representative and reliable simply because it consists of more observations. ‘All conditions being equal, our analysis shows that the more observations the better, because variability (and thus inefficiency) drops’ (p. 67).
  3. Consistency follows from the same reasoning: ‘In fact, the property of consistency is such that as the number of observations gets very large, the variability decreases to zero, and the estimate equals the parameter we are trying to estimate’ (p. 67).
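
The three criteria can be made concrete with a small simulation. This is a hedged sketch, not the book's own analysis: the true value, the noise level, the size of the bias, the sample sizes and the function name `simulate_estimator` are all assumptions chosen for illustration. It compares a small unbiased study with a large, slightly biased one, and then shows how the variance of the sample mean shrinks as n grows.

```python
import random
import statistics

TRUE_MEAN = 50.0  # hypothetical parameter we are trying to estimate
SD = 10.0         # hypothetical spread of the nonsystematic component


def simulate_estimator(n, bias, rng, replications=500):
    """Return (mean, variance) of the sample-mean estimate across replications.

    Each replication draws n observations; `bias` is a systematic error
    added to every observation.
    """
    estimates = []
    for _ in range(replications):
        sample = [rng.gauss(TRUE_MEAN, SD) + bias for _ in range(n)]
        estimates.append(statistics.fmean(sample))
    return statistics.fmean(estimates), statistics.pvariance(estimates)


rng = random.Random(42)

# Small, unbiased study versus large, slightly biased study.
small_mean, small_var = simulate_estimator(n=5, bias=0.0, rng=rng)
large_mean, large_var = simulate_estimator(n=500, bias=0.5, rng=rng)
print(f"small unbiased study: mean {small_mean:.2f}, variance {small_var:.2f}")
print(f"large biased study:   mean {large_mean:.2f}, variance {large_var:.2f}")

# Consistency: the variance of an unbiased estimate shrinks as n grows.
variances_by_n = {n: simulate_estimator(n, 0.0, rng)[1] for n in (10, 100, 1000)}
for n, var in variances_by_n.items():
    print(f"n = {n:4d}: variance of the estimate {var:.3f}")
```

Under these assumed numbers the small study is right on average but any single estimate from it can land far from the true value, while the large study is off by a small, constant amount but hardly varies between replications.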

It is also possible to combine insights from a small, unbiased case study with those from a comparable large-n study, by replacing an estimator from the large study with a checked estimator from the small one. This basically means that a researcher has to make corrections in accordance with their knowledge of the origins of the observations and of the context (the latter being represented in this text by the small case study).
