Monday, October 11, 2010

Polls and Gubernatorial Elections

Although the American public (or at least the politically interested among us) is a voracious consumer of publicly available polls, it is not uncommon to hear some version of the refrain "you can't trust polls." Yet polls abound and play a prominent role in all manner of political prognostications. What I've tried to do in my earlier posts on House and Senate elections is provide a concrete illustration of how polls are related to outcomes. Hopefully, this helps make it clear why polls are so important to political forecasters.

Today, I turn to gubernatorial elections, which, perhaps not too surprisingly, follow a pattern very similar to that of Senate elections. In keeping with the earlier posts, I focus here on how well "out-of-sample" forecasts account for actual outcomes. The logic is that you can't predict the 2010 outcomes based on the observed relationship between polls and outcomes in 2010 because you won't know what that relationship is until after the election. The solution is to use the observed relationship between polls and outcomes in other years to make predictions prior to the election. These are out-of-sample forecasts.

The figure below shows how well the poll-based out-of-sample predictions track actual outcomes in forty-five gubernatorial elections from 2006 (34) and 2008 (11). To generate the 2008 estimates, I regressed 2006 outcomes on 2006 polling averages over the last forty-five days of the campaign and used the estimates of that relationship (constant and slope) to predict 2008 outcomes using 2008 polling data. I then used the relationship for the 2008 polls and outcomes, along with the polling data from 2006, to forecast the 2006 outcomes. (Note: Data were generously provided by Pollster.com.)

This is a very strong relationship and speaks to the potency of polls (or at least polling averages) as predictors of elections. The correlation between the out-of-sample forecasts and the actual outcomes in 2006 and 2008 is .99, and the model predicted the wrong winner in just two of the forty-five cases. Certainly, many of the forecasts are very close to 50% and could easily end up calling the wrong winner, even if the point estimate is off by just a couple of percentage points. But the pattern found here, in conjunction with that found earlier for Senate elections, suggests that candidates trailing significantly in the polls are not likely to win.
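
For readers who want to see the mechanics, here is a minimal sketch of the cross-year, out-of-sample procedure described above. It is not the code behind the figure. The numbers are made up for illustration, the function names are my own, and I am assuming the variable in both polls and outcomes is a candidate's percentage share, so that 50 marks the line between winning and losing.

```python
from math import sqrt
from statistics import mean


def fit_line(polls, votes):
    """Ordinary least squares for votes = constant + slope * polls."""
    mx, my = mean(polls), mean(votes)
    sxx = sum((x - mx) ** 2 for x in polls)
    sxy = sum((x - mx) * (y - my) for x, y in zip(polls, votes))
    slope = sxy / sxx
    return my - slope * mx, slope


def predict(params, polls):
    """Apply a fitted (constant, slope) pair to a new set of polling averages."""
    constant, slope = params
    return [constant + slope * p for p in polls]


def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, NOT the Pollster.com data used in the post.
polls_2006 = [38.0, 45.0, 49.0, 53.0, 60.0]
votes_2006 = [36.5, 44.0, 49.5, 54.0, 62.0]
polls_2008 = [41.0, 48.0, 51.0, 57.0]
votes_2008 = [39.0, 47.5, 52.0, 58.5]

# Fit on one year, predict the other, so no race is forecast
# with a relationship estimated from its own outcome.
forecast_2008 = predict(fit_line(polls_2006, votes_2006), polls_2008)
forecast_2006 = predict(fit_line(polls_2008, votes_2008), polls_2006)

forecasts = forecast_2006 + forecast_2008
actuals = votes_2006 + votes_2008

# A forecast "calls the wrong winner" when it and the outcome
# fall on opposite sides of 50.
wrong_calls = sum((f > 50) != (a > 50) for f, a in zip(forecasts, actuals))

print(f"correlation: {correlation(forecasts, actuals):.3f}")
print(f"wrong winners: {wrong_calls} of {len(actuals)}")
```

The key design point is that each year's forecasts use a constant and slope estimated only from the other year.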

So, what does this augur for the 2010 gubernatorial elections? To answer this, I used estimates of the relationship between polls and outcomes in the 2006 and 2008 gubernatorial elections to generate a set of predictions for 2010. A few things to note. First, these estimates are based on polls taken during the forty-five days preceding the election. This means that I don't have estimates for all states holding elections, though I assume I will before too long. Second, these predictions are not final and will change as new data come in. I'll try to update the predictions at least once a week. Finally, I present point estimates only. It goes without saying that predictions of close outcomes are far more likely than predictions of blowouts to call the wrong winner. But, at the end of the day, you have to call it one way or the other and the point estimate is the best guess.
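
To put a rough number on why close calls are riskier than blowouts, here is a small sketch. It rests on assumptions of my own that are not part of the model above: that forecast errors are roughly normal, and that their standard deviation is three points (a purely hypothetical value chosen for illustration). The function name is also my own.

```python
from math import erf, sqrt


def prob_wrong_winner(point_estimate, forecast_sd):
    """Chance the actual result lands on the other side of 50,
    ASSUMING normally distributed forecast errors."""
    z = abs(point_estimate - 50.0) / forecast_sd
    # Standard normal probability of an error larger than z
    # in the direction that flips the winner.
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))


ASSUMED_SD = 3.0  # hypothetical forecast error, in percentage points

for estimate in (50.5, 52.0, 55.0, 60.0):
    p = prob_wrong_winner(estimate, ASSUMED_SD)
    print(f"point estimate {estimate:.1f}: chance of wrong call {p:.2f}")
```

Under these assumptions, a forecast sitting right at 50 is a coin flip, and the chance of a wrong call shrinks quickly as the point estimate moves away from 50.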

Here are the predictions:


Based on predictions for the states for which I have data, the model predicts ten seats switching from Democrat to Republican and five seats switching from Republican to Democrat, for a net Republican gain of five governorships. The highlighted area simply indicates those states with the closest outcomes, where an error of just a couple of points could change the overall picture. Two states that haven't yet had polls in the forty-five-day pre-election window, Kansas and Tennessee, have earlier polls that also indicate a strong likelihood of a Democrat-to-Republican flip, making the picture even bleaker for the Democrats.