Tuesday, September 21, 2010

Followers of politics these days are inundated with poll numbers, mostly of the horse-race variety, routinely reported at a couple of my favorite sites, politicalwire.com and pollster.com, as well as a number of other places.  Although the national generic ballot gets some attention for its predictive capacity, we know relatively little about how trial-heat polls in individual races can be used to predict district-level outcomes.  If we assume the vast share of congressional districts are easily predicted based on incumbency and party strength, then trial-heat polls could be a valuable tool for predicting outcomes in the "toss-up" and "leaning" districts.

Dan Hopkins' post last week on (the lack of) partisan bias in pre-election polling gives us some sense of the overall accuracy of trial-heat polls, albeit in Senate and gubernatorial races.  What I want to look at today is how well district-level trial-heat polls perform in U.S. House elections, with an eye toward eventually using them to predict outcomes in individual races.

Using data provided by the folks at pollster.com, I've taken average poll readings over the last forty-five days of the campaign from 91 contests in 2006 and 89 contests in 2008.  The figure below shows the relationship between the averages and the eventual outcome.  In both years, there is a strong relationship, suggesting that poll leaders generally go on to win House elections.

In fact, in terms of the bottom line, across both years 85% of candidates whose average poll share exceeded 50% went on to win their election.
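The bottom-line calculation described above can be sketched in a few lines of code. The records below are hypothetical, not the actual pollster.com data; the point is just to show the win-rate computation among candidates averaging above 50%.

```python
# Hypothetical illustration of the 45-day poll-average check described above.
# Each record is one candidate in one contest: their average poll share over
# the final forty-five days and whether they ultimately won the seat.
polls = [
    {"avg_share": 54.0, "won": True},
    {"avg_share": 51.0, "won": True},
    {"avg_share": 52.5, "won": False},  # a poll leader who lost
    {"avg_share": 47.0, "won": False},
]

# Among candidates averaging above 50%, what fraction went on to win?
leaders = [p for p in polls if p["avg_share"] > 50]
win_rate = sum(p["won"] for p in leaders) / len(leaders)
print(round(win_rate, 2))  # two of the three leaders in this toy data won
```

With the real 2006 and 2008 contests, this fraction comes out to the 85% reported above.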

Do pre-election House polls track outcomes better as election day approaches?  Yes, but not always by much.

Certainly it is the case that in 2006 the polls became better predictors of final outcomes as election day approached.  In 2008, however, there wasn't a very steep increase in the predictive accuracy of polls as the election drew nearer.  More to the point, the overall correlation across the forty-five days prior to the election (first figure) is virtually identical to the correlation found for the last two weeks of polling.  Also (recognizing that the cases differ, since this restricts the analysis to districts polled in the final two weeks), only 77% of poll leaders (combining both years) in the last two weeks of the campaign went on to win their elections, compared to 85% using the forty-five-day average.

Although these data illustrate a fairly obvious point--that candidates who lead in the polls tend to win--it is comforting to see the strong connection between polls and outcomes, especially given that the data come from many different types of organizations, polling in a variety of different local contexts.  No doubt, taking some of those factors into account might shed light on conditions that make polls better or worse predictors of outcomes.  But that will have to wait for another day.


  1. So, you are back in business?

  2. I think a good early discussion for this new(ish) blog would address the relative merits of the data and methods used from Pollster that you report, relative to those used by, say, 538 or Polyvote. Basically, I'm interested in what added value this blog will have for a reader of 538, Gelman, and Monkey Cage.

    I look forward to future posts!

  3. Anon1: Yes, back in business(ish)

    Anon2: I like to muck around in data, usually just to satisfy my own curiosity, and decided to share a few things once in a while in a public place. I hope that some of what I post will have added value for readers of other blogs, but I suspect this will be the case some times more than others. When it is, that's great; when it isn't, thanks for stopping by anyway, and make sure to check back in the future!