21 Apr 2011
Psychologists have posited numerous psychological traits and described causal roles they ought to play in determining human behavior. Most often, the canonical measure of a trait is a questionnaire. Investigators obtain this measure for some people and analyze how their scores predict some outcomes of interest. For example, many people have been interested in how psychological traits affect persuasion processes. Traits like need for cognition (NFC) have been posited and questionnaire items developed to measure them. Among other things, NFC affects how people respond to messages with arguments of varying quality.
How useful are these traits for explanation, prediction, and adaptive interaction? I can’t address all of this here, but I want to sketch an argument for their irrelevance to adaptive interaction — and then offer a tentative rejoinder.
Interactive technologies can tailor their messages to the tastes and susceptibilities of the people interacting with and through them. It might seem that these traits should figure in the statistical models used to make these adaptive selections. After all, some of the possible messages fit for, e.g., coaching a person to meet their exercise goals are more likely to be effective for low NFC people than high NFC people, and vice versa. However, the standard questionnaire measures of NFC often cannot be obtained for most users: certainly not in commerce settings, and even people signing up for a mobile coaching service likely don’t want to answer pages of questions. On the other hand, some Internet and mobile services have other abundant data available about their users, which could perhaps be used to construct an alternative measure of these traits. The trait-based-adaptation recipe is:
- obtain the questionnaire measure of the trait for a sample,
- predict this measure with data available for many individuals (e.g., log data),
- use this model to construct a measure for out-of-sample individuals.
This new measure could then be used to personalize the interactive experience based on this trait: if a version performs well (or poorly) for people with a particular score on the trait, then one uses (or doesn’t use) that version for people with similar scores.
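The three steps of the recipe can be sketched in a few lines. This is a minimal simulation under assumptions that are not in the argument itself: the log features, the sample sizes, the linear relationship between features and trait, and the helper names `fit_ridge` and `predict` are all hypothetical, and ridge regression is just one convenient choice of predictive model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical log data: 1000 users, 5 features derived from logs.
n_users, n_features = 1000, 5
log_features = rng.normal(size=(n_users, n_features))

# Simulation-only assumption: the questionnaire score is a noisy
# linear function of the log features.
true_weights = np.array([0.8, -0.5, 0.0, 0.3, 0.0])
trait_true = log_features @ true_weights + rng.normal(scale=0.5, size=n_users)

# Step 1: obtain the questionnaire measure for a sample of 200 users.
in_sample = np.zeros(n_users, dtype=bool)
in_sample[rng.choice(n_users, size=200, replace=False)] = True


def fit_ridge(X, y, penalty=1.0):
    """Ridge regression with an unpenalized intercept."""
    X1 = np.column_stack([np.ones(len(X)), X])
    P = penalty * np.eye(X1.shape[1])
    P[0, 0] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(X1.T @ X1 + P, X1.T @ y)


def predict(coef, X):
    """Apply fitted coefficients (intercept first) to new rows."""
    return coef[0] + X @ coef[1:]


# Step 2: predict the questionnaire measure from the log data.
coef = fit_ridge(log_features[in_sample], trait_true[in_sample])

# Step 3: construct the measure for users who never saw the questionnaire.
trait_proxy = predict(coef, log_features[~in_sample])
```

The resulting `trait_proxy` is what would then stand in for the questionnaire score when choosing which version to show.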
But why involve the trait at all? Why not just personalize the interactive experience based on the responses of similar others? Since the new measure of the trait is based only on the available behavioral, demographic, and other logged data, one could simply predict responses directly from those data. Put in geometric terms, if the goal is to project the effects of different messages onto available log data, why should one project the questionnaire measure of the trait onto the available log data and then project the effects onto this projection? This seems especially unappealing if one doesn’t fully trust the questionnaire measure to be accurate, or if one can’t be sure about the full set of traits that make a (substantial) difference.
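The geometric point can be made concrete for the least-squares case. This is only a sketch under assumptions of my own: $X$ is an $n \times p$ matrix of log-data predictors with full column rank, $t$ is the questionnaire measure and $y$ the observed effects of a message for the same $n$ people, and every projection is linear.

$$
P_X = X (X^\top X)^{-1} X^\top, \qquad
\hat{t} = P_X t, \qquad
\hat{y}_{\text{direct}} = P_X y, \qquad
\hat{y}_{\text{trait}} = \hat{t}\,\frac{\hat{t}^\top y}{\hat{t}^\top \hat{t}}
$$

Because $\hat{t}$ lies in the column space of $X$, so does $\hat{y}_{\text{trait}}$, and $P_X y$ is by definition the closest point in that space to $y$. Hence

$$
\lVert y - \hat{y}_{\text{direct}} \rVert \le \lVert y - \hat{y}_{\text{trait}} \rVert .
$$

This inequality concerns in-sample fit only; it says nothing about which approach predicts better for new people when $p$ is large relative to $n$.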
I find this argument quite intuitively appealing, and it seems to resonate with others.[1] But I think there are some reasons the recipe above could still be appealing.
One way to think about this recipe is as dimensionality reduction guided by theory about psychological traits. Available log data can often be used to construct countless predictors (or “features”, as the machine learning people call them). So one can very quickly get into a situation where the effective number of parameters for a full model predicting the effects of different messages is very large and will make for poor predictions. Nothing — no, not penalized regression, not even a support vector machine — makes this problem go away. Instead, one has to rely on the domain knowledge of the person constructing the predictors (i.e., doing the “feature engineering”) to pick some good ones.
So the tentative rejoinder is this: established psychological traits might often make good dimensions with which to predict the effects of different versions of a message, intervention, or experience. They may also “come with” suggestions about what kinds of log data might serve as measures of them, and they would be expected to be reusable across settings. Thus, I think this recipe nonetheless deserves serious attention.
[1] I owe some clarity on this to conversations with Mike Nowak, Maurits Kaptein, and others.