Marginal evidence for psychological processes
Some comments on problems with investigating psychological processes using estimates of average (i.e. marginal) effects. Hence the play on words in the title.
Social psychology makes a lot of being theoretical. This generally means not just demonstrating an effect, but providing evidence about the psychological processes that produce it. Psychological processes are, it is agreed, intra-individual processes. To tell a story about a psychological process is to posit something going on “inside” people. It is quite reasonable that this is how social psychology should work — and it makes it consistent with much of cognitive psychology as well.
But the evidence that social psychology uses to support these theories about these intra-individual processes is largely evidence about effects of experimental conditions (or, worse, non-manipulated measures) averaged across many participants. That is, it is using estimates of marginal effects as evidence of conditional effects. This is intuitively problematic. Now, there is no problem when using experiments to study effects and processes that are homogenous in the population. But, of course, they aren’t: heterogeneity abounds. There is variation in how factors affect different people. This is why the causal inference literature has emphasized the differences among the average treatment effect, (average) treatment effect on the treated, local average treatment effect, etc.
Not only is this disconnect between marginal evidence and conditional theory trouble in the abstract, we know it has already produced many problems in the social psychology literature.1 Baron and Kenny (1986) is the most cited paper published in the Journal of Personality and Social Psychology, the leading journal in the field. It paints an rosy picture of what it is like to investigate psychological processes. The methods of analysis it proposes for investigating processes are almost ubiquitous in social psych.2 The trouble is that this approach is severely biased in the face of heterogeneity in the processes under study. This is usually described as problem of correlated error terms, omitted-variables bias, or adjusting for post-treatment variables. This is all true. But, in the most common uses, it is perhaps more natural to think of it as a problem of mixing up marginal (i.e. average) and conditional effects.3
What’s the solution? First, it is worth saying that average effects are worth investigating! Especially if you are evaluating a intervention or drug that might really be used — or if you are working at another level of analysis than psychology. But if psychological processes are your thing, you must do better.
Social psychologists sometimes do condition on individual characteristics, but often this is a measure of a single trait (e.g., need for cognition) that cannot plausibly exhaust all (or even much) of the heterogeneity in the effects under study. Without much larger studies, they cannot condition on more characteristics because of estimation problems (too many parameters for their N). So there is bound to be substantial heterogeneity.
Beyond this, I think social psychology could benefit from a lot more within-subjects experiments. Modern statistical computing (e.g., tools for fitting mixed-effects or multilevel models) makes it possible — even easy — to use such data to estimate effects of the manipulated factors for each participant. If they want to make credible claims about processes, then within-subjects designs — likely with many measurements of each person — are a good direction to more thoroughly explore.
- The situation is bad enough that I (and some colleagues) certainly don’t even take many results in social psych as more than providing a possibly interesting vocabulary. [↩]
- Luckily, my sense is that they are waning a bit, partially because of illustrations of the method’s bias. [↩]
- To translate to the terms used before, note that we want to condition on unobserved (latent) heterogeneity. If one doesn’t, then there is omitted variable bias. This can be done with models designed for this purpose, such as random effects models. [↩]