- Background
- Presidential
- State
- State Win Probabilities
- State win vote-shares
- Senate
- Senate Win Probabilities
- Senate win vote-shares
- Log scores of win predictions
- Summary tables
- RMSEs
- Brier scores
- Log scores
- See also
- External links
Statistically analyzing in R hundreds of predictions compiled for ~10 forecasters of the 2012 American Presidential election, and ranking them by Brier, RMSE, & log scores; the best overall performance seems to be by Drew Linzer and Wang & Holbrook, while Nate Silver appears somewhat over-rated and the famous Intrade prediction market turned in a disappointing overall performance.
In November 2012, I was hired by CFAR to compile an extensive dataset of pundits, modelers, hobbyists, and academics who had attempted to statistically forecast the 2012 American presidential race and other minor races; the results were interesting in that they contradicted the lionization of Nate Silver’s forecasts in The New York Times. This page is a full listing of the R source code I used to produce my analysis for the CFAR essay; notes on the derivation of each dataset are stored at 2012-election-gwern-notes.txt.
The essay itself lives at "Was Nate Silver the Most Accurate 2012 Election Pundit?".
Background
This election prediction judgment is divided into several sections dealing with different categories of predictions:
- the overall Presidential race predictions: probability of Obama victory, final electoral vote count, and percentage of popular vote
- the Presidential state-by-state predictions: the percentage Obama will take (vote share/margin/edge), as well as the probability he will win that state at all
- the Senate state-by-state predictions: similar, but normalized for the Democratic candidate
Few forecasters made predictions in all categories, and those who did make predictions did not always make their full predictions public. Note that all percentages are normalized in terms of the share going to Obama, Democrats, or in some cases, Independents/Greens. The Reality forecaster is the ground truth; these were all updated 23 November in what is hopefully a final update.
The point of these calculations is to extract Brier scores (for categorical predictions like the probability of an Obama victory) and RMSE sums (for continuous/quantitative predictions like vote share). Intrade prices were interpreted as straightforward probabilities without any correction for Intrade’s long-shot bias.¹
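As a rough illustration of what a Brier score measures, here is a minimal base-R sketch. The helper name `brier_by_hand` is made up for this example and is not part of the original analysis, which uses the `verification` package; the sketch assumes the usual definition of the score as the mean squared difference between forecast probabilities and 0/1 outcomes.

```r
# Illustrative helper (not from the original code): mean squared error between
# forecast probabilities and 0/1 outcomes. Lower is better; 0 is a perfect score.
brier_by_hand <- function(obs, pred) mean((pred - obs)^2)

# A coin-flip forecast of an event that happened
brier_by_hand(1, 0.5)

# Three forecasts scored together: two events happened, one did not
brier_by_hand(c(1, 0, 1), c(0.9, 0.2, 0.6))
```

The first call returns 0.25, which is why a 50% guess serves as the baseline that any useful forecaster should beat.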
Presidential
presidential <- read.csv("https://www.gwern.net/docs/2012-election-presidential.csv", row.names=1)
# Reality=2012 result; 2008=2008 results
presidential
                probability electoral popular
Reality              1.0000       332   50.79
2008                 1.0000       365   53.00
Nate Silver          0.9090       313   50.80
Drew Linzer          0.9900       332      NA
Simon Jackman        0.9140       332   50.80
DeSart               0.8862       303   51.37
Margin of Error      0.6800       303   51.50
Wang & Ferguson      1.0000       303   51.10
Intrade              0.6580       291   50.75
Josh Putnam              NA       332      NA
Unskewed Polls           NA       263   48.88
# probability can be scored as a Brier score; available in 'verification' library
install.packages("verification")
library(verification)
# handle lists & vectors for later
br <- function(obs, pred) brier(unlist(obs),
                                unlist(pred),
                                bins=FALSE)$bs # bins=FALSE avoids rounding
# convenience function
brp <- function(p) brier(presidential["Reality",]$probability,
                         presidential[p,]$probability,
                         bins=FALSE)$bs
lapply(rownames(presidential)[1:9], brp)
- Reality: 0
- 2008: 0
- Wang: 0
- Linzer: 0.0001
- Jackman: 0.007396
- Silver: 0.008281
- DeSart: 0.01295044
- Margin: 0.1024
- Intrade: 0.116964
- Random: 0.25 (50% guess is always 0.25)
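The labelled, sorted listing above can also be reproduced without extra packages. The following is a sketch, not the original code: the names `prob` and `scores` are illustrative, the probabilities are copied from the table printed earlier, and it assumes that the Brier score of a single forecast is simply its squared error against the outcome.

```r
# Probabilities of an Obama win, copied from the presidential table
prob <- c("Reality" = 1.0000, "2008" = 1.0000, "Nate Silver" = 0.9090,
          "Drew Linzer" = 0.9900, "Simon Jackman" = 0.9140, "DeSart" = 0.8862,
          "Margin of Error" = 0.6800, "Wang & Ferguson" = 1.0000,
          "Intrade" = 0.6580)

# Squared error of each forecast against the outcome (assumed equivalent to the
# single-forecast Brier score); the arithmetic is vectorized and keeps the names
scores <- sort((prob - prob[["Reality"]])^2)

round(scores, 8)
```

Working on a named vector keeps each forecaster's name attached to the score, which an unnamed `lapply` result does not.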
# To score electorals and populars, we use RMSE
rmse <- function(obs, pred) sqrt(mean((obs-pred)^2,na.rm=TRUE))
rpe <- function(p) rmse(presidential["Reality",]$electoral, presidential[p,]$electoral)
lapply(rownames(presidential), rpe)
- Reality: 0
- Linzer: 0
- Jackman: 0
- Putnam: 0
- Silver: 19
- DeSart: 29
- Margin: 29
- Wang: 29
- 2008: 33
- Intrade: 41
- Unskewed: 69
rpp <- function(p) rmse(presidential["Reality",]$popular, presidential[p,]$popular)
lapply(rownames(presidential)[c(1:9,11)], rpp)
- Reality: 0
- Jackman: 0.01
- Silver: 0.01
- Intrade: 0.04
- Wang: 0.31
- DeSart: 0.58
- Margin: 0.71
- Unskewed: 1.91
- 2008: 2.21
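Two properties of the `rmse` function are worth seeing in isolation, since each forecaster supplies only one electoral and one popular-vote number. The sketch below repeats the definition so that it runs on its own; the inputs are taken from the presidential table.

```r
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2, na.rm = TRUE))

# With a single observation, RMSE is just the absolute error:
rmse(332, 313)    # Nate Silver's electoral-vote forecast vs. the result
abs(332 - 313)

# A missing forecast leaves nothing to average, so the result is NaN, not 0
rmse(50.79, NA)
```

This is why forecasters without a popular-vote prediction, such as Drew Linzer, do not appear in the popular-vote listing.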
State
State Win Probabilities
# Reality=final 2012 result - 0 for Romney states, 1 for Obama states
# 2008=2008 state results (same as Reality, except Indiana & North Carolina, which Obama won in 2008 but lost in 2012)
statewin <- read.csv("https://www.gwern.net/docs/2012-election-statewin.csv", row.names=1)
statewin
al ak az ar ca co ct de
Reality 0.0000 0.000000 0.0000 0.0000 1.0000 1.00000 1.0000 1.00000
2008 0.0000 0.000000 0.0000 0.0000 1.0000 1.00000 1.0000 1.00000
Nate Silver 0.0000 0.000000 0.0200 0.0000 1.0000 0.80000 1.0000 1.00000
Drew Linzer 0.0000 0.000086 0.0000 0.0000 1.0000 0.98333 1.0000 0.98333
Margin of Error 0.0269 0.099800 0.4388 0.0451 0.9443 0.64710 0.9125 0.93770
Intrade 0.0000 0.000000 0.0600 0.0000 0.9500 0.55600 0.9900 0.96000
DeSart 0.0000 0.090000 0.0390 0.0000 1.0000 0.52300 0.9990 1.00000
Simon Jackman 0.0052 0.000000 0.0050 0.0000 1.0000 0.76520 1.0000 1.00000
Wang & Ferguson 0.0000 0.000000 0.0000 0.0000 1.0000 0.84000 1.0000 1.00000
Josh Putnam NA NA NA NA NA NA NA NA
Unskewed Polls NA NA NA NA NA NA NA NA
dc fl ga hi id il indiana ia ks
Reality 1.000 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 1.0000 0.00000
2008 1.000 1.0000 0.0000 1.0000 0.0000 1.0000 1.0000 1.0000 0.00000
Nate Silver 1.000 0.5000 0.0000 1.0000 0.0000 1.0000 0.0000 0.8400 0.00000
Drew Linzer NA 0.6040 0.0000 1.0000 0.0000 1.0000 0.0000 0.9966 0.03866
Margin of Error 1.000 0.4575 0.1972 0.9987 0.0086 0.9569 0.3273 0.6467 0.09710
Intrade 0.975 0.3300 0.0300 0.9750 0.0000 0.9890 0.0200 0.6630 0.00000
DeSart 1.000 0.4910 0.0300 1.0000 0.0000 1.0000 0.0020 0.7700 0.00000
Simon Jackman 1.000 0.5216 0.0014 1.0000 0.0000 1.0000 0.0000 0.8376 0.00000
Wang & Ferguson 1.000 0.5000 0.0000 1.0000 0.0000 1.0000 0.0000 0.8400 0.00000
Josh Putnam NA NA NA NA NA NA NA NA NA
Unskewed Polls NA NA NA NA NA NA NA NA NA
ky la me md ma mi mn ms
Reality 0.000000 0.0000 1.0000 1.00000 1.0000 1.0000 1.0000 0.00000000
2008 0.000000 0.0000 1.0000 1.00000 1.0000 1.0000 1.0000 0.00000000
Nate Silver 0.000000 0.0000 1.0000 1.00000 1.0000 0.9900 1.0000 0.00000000
Drew Linzer 0.000000 0.0000 1.0000 1.00000 1.0000 1.0000 1.0000 0.07866667
Margin of Error 0.019100 0.0442 0.8403 0.96837 0.8988 0.6837 0.7149 0.13470000
Intrade 0.000000 0.0000 0.9300 0.94000 0.9950 0.8840 0.8490 0.00000000
DeSart 0.000000 0.0000 0.9930 1.00000 1.0000 0.9350 0.9610 0.00000000
Simon Jackman 0.000004 0.0000 1.0000 1.00000 1.0000 0.9998 0.9992 0.00000000
Wang & Ferguson 0.000000 0.0200 1.0000 1.00000 1.0000 1.0000 1.0000 0.00000000
Josh Putnam NA NA NA NA NA NA NA NA
Unskewed Polls NA NA NA NA NA NA NA NA
mo mt ne nv nh nj nm ny
Reality 0.000000 0.0000 0.0000 1.0000000 1.0000 1.0000 1.0000 1.0000
2008 0.000000 0.0000 0.0000 1.0000000 1.0000 1.0000 1.0000 1.0000
Nate Silver 0.000000 0.0200 0.0000 0.9300000 0.8500 1.0000 0.9900 1.0000
Drew Linzer 0.000000 0.0000 0.0000 0.9993333 0.9980 1.0000 1.0000 1.0000
Margin of Error 0.447300 0.2436 0.0562 0.7710000 0.6886 0.8647 0.8579 0.9697
Intrade 0.050000 0.0500 0.0000 0.8370000 0.6490 0.9790 0.9390 0.9500
DeSart 0.052000 0.0080 0.0000 0.7680000 0.7560 0.9980 0.9740 1.0000
Simon Jackman 0.000004 0.0032 0.0000 0.9120000 0.8324 0.9998 0.9968 1.0000
Wang & Ferguson 0.000000 0.0000 0.0000 0.9900000 0.8400 1.0000 1.0000 1.0000
Josh Putnam NA NA NA NA NA NA NA NA
Unskewed Polls NA NA NA NA NA NA NA NA
nc nd oh ok or pa ri
Reality 0.00000000 0.0000 1.0000000 0.0000 1.0000000 1.0000 1.0000
2008 1.00000000 0.0000 1.0000000 0.0000 1.0000000 1.0000 1.0000
Nate Silver 0.26000000 0.0000 0.9100000 0.0000 1.0000000 0.9900 1.0000
Drew Linzer 0.08533333 0.0000 0.9986667 0.0000 0.9986667 1.0000 1.0000
Margin of Error 0.50030000 0.1284 0.6038000 0.0029 0.7886000 0.7562 0.9684
Intrade 0.23000000 0.0030 0.6550000 0.0010 0.9590000 0.8200 0.9500
DeSart 0.06600000 0.0000 0.7040000 0.0000 0.9430000 0.8810 1.0000
Simon Jackman 0.28120000 0.0000 0.9298000 0.0000 0.9726000 0.9910 1.0000
Wang & Ferguson 0.16000000 0.0000 0.9300000 0.0000 1.0000000 0.9300 1.0000
Josh Putnam NA NA NA NA NA NA NA
Unskewed Polls NA NA NA NA NA NA NA
sc sd tn tx ut vt va wa
Reality 0.0000000 0.0000 0.000000 0.0000 0.0000 1.0000 1.0000 1.0000
2008 0.0000000 0.0000 0.000000 0.0000 0.0000 1.0000 1.0000 1.0000
Nate Silver 0.0000000 0.0000 0.000000 0.0000 0.0000 1.0000 0.7900 1.0000
Drew Linzer 0.1386667 0.0000 0.000000 0.0000 0.0000 1.0000 0.9760 1.0000
Margin of Error 0.1345000 0.1665 0.053100 0.0545 0.0035 0.9846 0.5046 0.8473
Intrade 0.0400000 0.0500 0.020000 0.0200 0.0450 0.9800 0.5800 0.9750
DeSart 0.0030000 0.0010 0.000000 0.0000 0.0000 1.0000 1.0000 0.9980
Simon Jackman 0.1290000 0.0068 0.000004 0.0000 0.0000 1.0000 0.7840 1.0000
Wang & Ferguson 0.0000000 0.0000 0.000000 0.0000 0.0000 1.0000 0.8400 1.0000
Josh Putnam NA NA NA NA NA NA NA NA
Unskewed Polls NA NA NA NA NA NA NA NA
wv wi wy
Reality 0.000000000 1.0000 0.000000000
2008 0.000000000 1.0000 0.000000000
Nate Silver 0.000000000 0.9700 0.000000000
Drew Linzer 0.001333333 1.0000 0.000666667
Margin of Error 0.042700000 0.6448 0.006900000
Intrade 0.020000000 0.7460 0.000000000
DeSart 0.000000000 0.8560 0.000000000
Simon Jackman 0.005400000 0.9698 0.000000000
Wang & Ferguson 0.000000000 0.9900 0.000000000
Josh Putnam NA NA NA
Unskewed Polls NA NA NA
brstate <- function(p) br(statewin["Reality",], statewin[p,])
lapply(rownames(statewin)[1:9], brstate)
- Reality: 0
- Drew Linzer: 0.00384326
- Wang/Ferguson: 0.007615686
- Nate Silver: 0.00911372
- Simon Jackman: 0.00971369
- DeSart/Holbrook: 0.01605542
- Intrade: 0.02811906
- 2008: 0.03921569
- Margin of Error: 0.05075311
- random (50%) guesser: 0.25000000
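Some rows contain missing values (Drew Linzer has NA for dc), and the text does not say how `brier` treats them. One way to make the handling explicit is to score only the states where both the outcome and the forecast are present. The sketch below uses a hypothetical helper, `brier_complete`, and a deliberately tiny three-state subset copied from the table, so its result is not comparable to the full 51-column scores above.

```r
# Three columns (co, dc, fl) copied from the statewin table; illustrative subset only
reality <- c(co = 1, dc = 1, fl = 1)
linzer  <- c(co = 0.98333, dc = NA, fl = 0.6040)

# Hypothetical helper: Brier score over the pairs where both values are present
brier_complete <- function(obs, pred) {
  ok <- complete.cases(obs, pred)
  mean((pred[ok] - obs[ok])^2)
}

brier_complete(reality, linzer)
sum(is.na(linzer))   # number of states skipped for this forecaster
```

Reporting the number of skipped states alongside the score shows when a forecaster was judged on fewer predictions than the others.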