By Craig Kolb, Acentric Marketing Research (Pty) LTD, 9 November 2016
I tweeted on the 18th of October – 18 days prior to the election – that I expected Donald Trump to win the US presidential election, in contrast to what most polls and commentators were predicting at the time. This was based on my analysis of the methodologies being used by the various polls. One poll in particular stood out. This poll had a mechanism for accommodating two sources of bias, which were present in this election, but that didn’t exist to the same extent in prior elections. These two sources of bias were social desirability bias and propensity bias.
Social desirability bias
In the 2012 election (Barack Obama vs Mitt Romney) both candidates were fairly palatable as far as the media was concerned; the sheer intensity of media dislike was not there to tarnish one candidate or the other. In fact, going back through a few decades’ worth of elections, none seemed as controversial as the 2016 presidential election.
The majority of the media has been vehemently opposed to Donald Trump over the last year and a half. The practical consequence is that a significant chunk of the voting population would feel it was not socially desirable to reveal their voting intentions for a candidate the whole world seemed to oppose. Whether or not the whole world really was opposed is irrelevant; the media is an extremely powerful vehicle for generating false consensus. Only very independently minded individuals would be able to see past this and, more importantly when it comes to polling, be prepared to state their views publicly.
During a political poll (survey), there are a number of ways this need for social acceptance can affect the results. The most important aspect is the survey ‘mode’ – how the interviews are conducted. The most commonly used modes in the United States are telephonic polls (using a human interviewer), IVR polls (automated telephonic interviews) and online-panel surveys. I won’t bother discussing the unscientific polls on news websites.
Any polling mode that involves human interviewers is obviously going to pose a problem when dealing with a candidate who is not considered socially acceptable. It introduces what is called a ‘social desirability bias’: a tendency, no matter the size or quality of the sample, for the result to skew away from its true value due to social pressures. This means many of the polling firms were wasting millions of dollars on a polling method with an inherent bias.
This leaves those polls that used either IVR (also called robo polls) or online surveys (or, in rare instances, mail). These include Pulse Opinion Research (Rasmussen), SurveyUSA and Gravis Marketing amongst the commercial poll providers, all of whom must strive to provide the most cost-effective method for achieving accurate results. Various academic outfits also sometimes use this approach; most notably the USC Daybreak poll – a vastly more expensive methodology that has a number of other unique aspects that I will discuss below. Those polling firms that rely solely on IVR face a major ‘coverage bias’ issue, since according to US law, they may only poll landline phones. Cell phones are off limits, with the exception of voice messages dropped in the back-end.
As a result many take a ‘mixed mode’ approach of using IVR and online-panels to conduct interviews.
Propensity bias
Examining the various polls, I realised that another source of bias – especially critical in this election given the enthusiasm gap – was being ignored. This related to voting propensity. It was clear that a massive enthusiasm gap between Trump and Clinton supporters would lead to different turnout rates.
Most polls ask one key question about who the voter is likely to vote for and use adjustments for turnout based, at least in part, on past behaviours. There is no explicit attempt to quantify likely voter turnout based on the respondent’s current state of mind. This means that a candidate who is ahead in the poll may still lose, because fewer of their supporters turn out on voting day.
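To make the arithmetic concrete, here is a minimal sketch of how a turnout gap can reverse a polling lead. The support and turnout figures are invented purely for illustration, and the function name is hypothetical.

```python
def turnout_adjusted_share(support, turnout):
    """Return each candidate's share of the votes actually cast.

    support: share of the electorate preferring each candidate
    turnout: fraction of each candidate's supporters who actually vote
    """
    votes = {name: support[name] * turnout[name] for name in support}
    total = sum(votes.values())
    return {name: v / total for name, v in votes.items()}


# Hypothetical figures: A leads among all respondents,
# but B's supporters are more likely to turn out.
support = {"A": 0.52, "B": 0.48}
turnout = {"A": 0.55, "B": 0.65}

shares = turnout_adjusted_share(support, turnout)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of votes cast")
```

With these made-up numbers, candidate A leads the raw poll by four points yet ends up with slightly under 48% of the votes cast.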
The only poll that seems to have taken propensity into account – directly from the current state of mind of the poll respondent – is the USC Daybreak poll.
While voting-likelihood-prediction methods have existed since at least the 1950s – a notable example being the Perry-Gallup index – they are likely to be thrown off when a significant shift occurs in the relationships between variables. Deterministic or algorithmic models, such as Perry-Gallup, rely on accurate estimates of voter turnout for a cut-off value at the aggregate level, while models fit to individual past behaviours can fail to predict for a variety of reasons, such as significant breaks with the past in the relationships between variables.
While sophisticated decision-tree models can accommodate large networks of predictors and their interactions, making them more accurate in stable conditions, they are likely to fall apart when society is undergoing radical shifts; in which case simpler models would probably be advisable.
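The dependence of cut-off models on the assumed turnout figure can be sketched roughly as follows. This is a simplified illustration in the spirit of a Perry-Gallup-style index, not the actual index: the scores, the respondents and the function names are all hypothetical.

```python
def likely_voters_by_cutoff(respondents, expected_turnout):
    """Keep the highest-scoring respondents until the assumed
    turnout share of the sample has been reached."""
    ranked = sorted(respondents, key=lambda r: r["score"], reverse=True)
    keep = round(len(ranked) * expected_turnout)
    return ranked[:keep]


def vote_shares(respondents):
    """Share of the retained respondents choosing each candidate."""
    counts = {}
    for r in respondents:
        counts[r["choice"]] = counts.get(r["choice"], 0) + 1
    return {c: k / len(respondents) for c, k in counts.items()}


# Hypothetical sample: 'score' is a 0-7 likelihood-of-voting index.
sample = [
    {"score": 7, "choice": "B"}, {"score": 7, "choice": "B"},
    {"score": 6, "choice": "A"}, {"score": 6, "choice": "B"},
    {"score": 5, "choice": "A"}, {"score": 5, "choice": "B"},
    {"score": 4, "choice": "A"}, {"score": 3, "choice": "A"},
    {"score": 2, "choice": "A"}, {"score": 1, "choice": "B"},
]

low_turnout = vote_shares(likely_voters_by_cutoff(sample, 0.6))
high_turnout = vote_shares(likely_voters_by_cutoff(sample, 0.9))
print("Assumed 60% turnout:", low_turnout)
print("Assumed 90% turnout:", high_turnout)
```

In this toy sample the leader changes depending on which turnout figure is assumed, which is why an inaccurate aggregate turnout estimate can throw such a model off.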
USC Daybreak Poll
The USC Daybreak poll is run by the University of Southern California and the LA Times. In the weeks leading up to election day on the 8th of November the USC Daybreak poll, more often than not, showed Donald Trump with a strong lead.
While past voting behaviour was included in the respondent-weighting procedure along with demographics, the methodology seems to use a fairly simple approach to account for current state of mind by asking a question about the likelihood of voting, which is then used to weight the choice selection. I believe this simple approach to taking into account the current voting likelihood of each respondent is likely to prove more accurate, under unstable social conditions, when more sophisticated models may ‘break’. Furthermore, instead of asking for one choice, the respondent is asked to indicate a probability of voting for each candidate. This allows for uncertainty in choice probabilities to be accounted for at the individual level.
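As a rough sketch of the idea described above (not the poll’s actual implementation), each respondent’s candidate probabilities can be weighted by their own stated probability of voting, on top of any demographic weight. The field names, the function name and the three respondents below are assumptions made up for illustration.

```python
def likelihood_weighted_estimate(respondents):
    """Estimate vote shares by weighting each respondent's candidate
    probabilities by weight * stated probability of voting."""
    totals = {}
    weight_sum = 0.0
    for r in respondents:
        w = r["weight"] * r["p_vote"]  # effective weight of this respondent
        weight_sum += w
        for candidate, p in r["p_candidate"].items():
            totals[candidate] = totals.get(candidate, 0.0) + w * p
    return {candidate: t / weight_sum for candidate, t in totals.items()}


# Hypothetical respondents: 'weight' is a demographic weight, 'p_vote' the
# self-reported chance of voting, 'p_candidate' the chance of choosing each
# candidate if they do vote.
panel = [
    {"weight": 1.0, "p_vote": 0.9,
     "p_candidate": {"A": 0.8, "B": 0.1, "Other": 0.1}},
    {"weight": 1.2, "p_vote": 0.5,
     "p_candidate": {"A": 0.2, "B": 0.7, "Other": 0.1}},
    {"weight": 0.8, "p_vote": 1.0,
     "p_candidate": {"A": 0.5, "B": 0.5, "Other": 0.0}},
]

estimate = likelihood_weighted_estimate(panel)
for candidate, share in estimate.items():
    print(f"{candidate}: {share:.1%}")
```

A respondent who is only 50% likely to vote contributes half as much as an otherwise identical respondent who is certain to vote, and an undecided respondent splits their contribution between candidates rather than being forced into one column.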
The use of an online survey (after an initial recruitment process by mail) allowed for self-completion, reducing social desirability bias.
Additional aspects of the USC Daybreak poll that differentiated this poll include:
- The use of Address Based Sampling (ABS), a probability-sampling method that randomly selects households, which are then contacted by conventional mail.
- Fairly generous incentives to increase response rates. With telephone response rates now as low as 5% to 10%, I’d hazard a guess that the USC Daybreak poll had better response rates, and as a result less non-response bias. Since the non-response weights used in telephonic polls can’t correct for every unknown, the USC Daybreak poll likely had a significant advantage here.
- A panel, rather than a cross-sectional survey, that was tracked over time, reducing the shifts in results over time that are due to sampling error (although these are not entirely avoidable, as drop-outs must be replaced over time).
- A large starting sample of over 3,000 that was added to as the election progressed.
All in all, the USC Daybreak survey addressed the various sources of error and bias more effectively than the other polls and so was able to correctly predict the outcome of this election (assuming the objective was the prediction of the Electoral College outcome – see more below). The only other national poll that led to a prediction in favour of Trump was the IBD/TIPP poll. On the other hand, if you assume the objective is to predict the popular vote, then the Rasmussen poll was the most accurate.
Just to give you an idea of the vast number of issues researchers have to contend with – Figure 1 shows a more complete view of the various sources of error and bias.
Figure 1: Sources of survey error and bias
The Electoral College – too costly to predict directly
If you’ve read this far, you deserve an additional layer of insight (or confusion depending on how you look at it). The USA has a geographical-weighting system – labelled the ‘Electoral College’ – that prevents larger states from pushing smaller ones around. Ultimately it is the Electoral College and not the popular vote that determines who wins the presidency.
It works like this. The president is elected based on the votes of electors in each state. The winner of the popular vote in a state gets the votes of all the electors of that state. The number of electors depends, in large part, on the population size, but to ensure smaller states are not marginalised, a small constant of two electors (to mirror the number of Senators) is added to each state. The effect is a mild dampening of the larger states’ population advantage. So for instance, a large state like California has 55 electors, calculated by adding 2 electors for the Senate and 53 electors for the House of Representatives (both houses are part of Congress). A small state like North Dakota has 3 electors (2 for the Senate and only 1 for the House of Representatives).
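The elector arithmetic above can be written out in a few lines. The seat counts are the ones quoted in the text; the function name is just an illustrative choice.

```python
SENATE_SEATS_PER_STATE = 2  # the constant added to every state


def electors(house_seats):
    """Number of electors = House seats + the two Senate seats."""
    return house_seats + SENATE_SEATS_PER_STATE


# House seat counts as quoted in the text above.
house_seats = {"California": 53, "North Dakota": 1}
elector_counts = {state: electors(seats) for state, seats in house_seats.items()}
print(elector_counts)

# The dampening effect: compare the ratio of House seats with the ratio of electors.
house_ratio = house_seats["California"] / house_seats["North Dakota"]
elector_ratio = elector_counts["California"] / elector_counts["North Dakota"]
print(f"House seats ratio: {house_ratio:.1f} to 1")
print(f"Electors ratio:    {elector_ratio:.1f} to 1")
```

On House seats alone California outweighs North Dakota 53 to 1; once the two Senate-based electors are added, the ratio falls to about 18 to 1.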
Since the weighting effect is mild, the end result is that most presidential election results tend to reflect the popular vote.
What does this have to do with polls though? National polls in the USA only attempt to predict the national share of votes (popular vote), rather than the final Electoral College outcome, since this is far less costly. This twist seems lost on the media, as it is not soundbite friendly; and the polls get away with this shortcut most of the time, so there is little incentive to attempt to predict the Electoral College outcome directly.
What would it take to predict the Electoral College outcome directly?
If polls attempted to predict the Electoral College outcome directly, it would mean having vastly larger samples, so that the outcome of each state could be predicted individually. To clarify this, the outcome of each state’s election is a winner-takes-all situation. Donald Trump won North Dakota and he received all 3 electoral votes for that state; they are not split in proportion to the number of votes for each candidate (this is how it works in most states, with rare exceptions).
To illustrate further: instead of having, say, a sample of 1,000 interviews nationally, you would need something closer to 50,000 nationally (1,000 in each of the 50 states). This is because small samples don’t get more accurate just because the population you are measuring has shrunk (unless the sampling fraction exceeds 5% of the population). Since even the smallest states have hundreds of thousands of residents, sampling fractions will be well below 5%.
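A toy example of winner-takes-all allocation shows why state-level results, and not the national vote share, drive the outcome. The three states, their elector counts and the vote totals are entirely hypothetical, and the rare states that split their electors are ignored.

```python
def allocate_electors(state_votes, electors_by_state):
    """Give all of a state's electors to the candidate with the most votes there."""
    tally = {}
    for state, votes in state_votes.items():
        winner = max(votes, key=votes.get)
        tally[winner] = tally.get(winner, 0) + electors_by_state[state]
    return tally


# Hypothetical states and results (votes in thousands).
electors_by_state = {"Large": 10, "Medium": 6, "Small": 5}
state_votes = {
    "Large":  {"A": 700, "B": 300},
    "Medium": {"A": 290, "B": 310},
    "Small":  {"A": 240, "B": 260},
}

electoral_tally = allocate_electors(state_votes, electors_by_state)
popular_vote = {
    c: sum(votes[c] for votes in state_votes.values()) for c in ("A", "B")
}
print("Popular vote:   ", popular_vote)
print("Electoral votes:", electoral_tally)
```

In this made-up case candidate A wins the popular vote comfortably but loses on electors, because A’s surplus votes in the large state count for nothing extra. A poll that only measures the national share cannot see this.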
Put another way, you need just as much sample to predict the outcome of a state election result as you would a national result. The implication being that the cost of a poll capable of predicting the Electoral College outcome directly would be astronomical.
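The point about sample size can be checked with the standard margin-of-error formula for a proportion, including the finite population correction. This is a minimal sketch that assumes simple random sampling, a 50/50 split and a 95% confidence level; the population figures are round hypothetical numbers, not actual electorate sizes.

```python
import math


def margin_of_error(n, population, p=0.5, z=1.96):
    """Margin of error for a proportion under simple random sampling,
    with the finite population correction applied."""
    standard_error = math.sqrt(p * (1 - p) / n)
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc


n = 1000
# Hypothetical population sizes, chosen only to show the pattern.
populations = {
    "national electorate": 130_000_000,
    "small state": 300_000,
    "small town (sampling fraction 10%)": 10_000,
}

for label, size in populations.items():
    moe = margin_of_error(n, size)
    print(f"{label}: n={n}, N={size:,}, margin of error = {moe:.2%}")
```

For the national and small-state populations the margin of error from 1,000 interviews is practically identical, at roughly three percentage points. It only starts to shrink noticeably in the third case, where the sample is a tenth of the population.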