2016 will be a year of elections—culminating with the selection of our next president, and beginning with something even more important: the Baseball Hall of Fame. On January 6 at 6pm Eastern, we'll learn which players join the pantheon of the game's best—but that doesn't mean it has to be a surprise. A healthy proportion of the Hall of Fame election's 450 or so votes are known in advance, thanks to the dogged work of Ryan Thibodaux. Ryan's addictive BBHOF Tracker aggregates ballots as they are made public by the voters who cast them: the media members of the BBWAA. By Election Day, the Tracker has historically sampled as much as a third of the electorate—in essence, functioning as an "exit poll" for the referendum.
But we must be careful not to take this poll as gospel. Even the best polls carry margins of error, and Ryan will be the first to tell you that his spreadsheet represents merely data—not a projection. That's where I come in.
For the past several years, I've reweighted raw Hall of Fame data like the BBHOF Tracker's to arrive at scientific projections of the final results. Last year, my projections proved twice as accurate as the raw polls and even outperformed other prominent Hall of Fame forecasters, including one of my heroes, Tom Tango. My methodology simulates the work of political pollsters, who start by surveying the electorate but must weight these raw results using demographics, vote history, and other factors to get final, maximally accurate numbers. (This pollster "skewing" was what many Republicans, including the creator of UnskewedPolls.com, complained about in 2012 when raw data showed Romney ahead but smoothing the data gave Obama the lead—and pollsters' methods were validated when Obama, of course, won by the predicted margin.)
In the case of the Hall of Fame, a certain type of voter is more likely to "respond" to the exit poll by revealing their ballot for input in the Tracker. This median public voter is more stat-savvy, cares less about steroid use, and uses up more spots on the ballot. Editorializing a bit here, they also tend to more carefully and fairly consider their votes—a requirement that goes hand in hand with their obvious belief in the transparency of the process. Meanwhile, voters who choose to stay private tend to base their decisions on narrative, prefer stats like saves and RBI, and are more likely to invoke the character clause against PED users. They are more at peace with voting for just a few players using their own personal, subjective standards—often in patterns that seem random to the rest of us.
To account for the exit poll's oversampling of so-called "progressive" BBWAA members, we have to adjust the raw data for each player up or down. Comparing past exit polls with the final results, it's plain that candidates such as Tim Raines, Barry Bonds, and Curt Schilling are consistently overstated by the polls, while players like Lee Smith, Larry Walker, and Nomar Garciaparra are lowballed by them. I use these historical deviations to come up with an exit-poll "adjustment factor" for each player returning to the ballot (for first-time candidates, calculating the adjustment factor is more complicated—see below). Then, I simply add or subtract each player's adjustment factor to or from his percentage in the BBHOF Tracker to arrive at my projections.
Below are the current projections along with their underlying numbers. To reflect the new polling data that Ryan collects on a rolling basis, I'll update these projections daily on Twitter and in this Google spreadsheet until the results are announced. These projections—and the polls—get more accurate for each additional ballot released. UPDATE: These are my final projections, issued at 5:55pm on January 6.
For those of you who are interested in the exact methodology of these projections, read on. To calculate the adjustment factor for returning players, I took a straight average of each player's differential between public and private ballots over the last three years. For example, Mike Piazza's differential was −9.8% in 2015, −9.1% in 2014, and −3.8% in 2013, so his adjustment factor is −7.6%. To draw the distinction between public and private ballots in a way that best simulates the pre-announcement conditions we're working under, I only count ballots that were public before results were released. Therefore, I use Darren Viola's now-defunct HOF Ballot Collecting Gizmo rather than Ryan's spreadsheet for these historical numbers. (All historical exit-poll data can be found in this spreadsheet.) For players who have been on the ballot for only one or two years, I take the straight average of their public-private differentials for however long they've been on the ballot. Ergo, Mike Mussina's adjustment factor is the average of his −16.8% from 2015 and his −10.2% from 2014, and Gary Sheffield's is simply his 2015 differential of +3.0%.
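The averaging above is simple enough to sketch in a few lines of Python. The function name `adjustment_factor` and the dictionary structure are my own illustration, not part of the actual projection code; the differentials are the ones quoted in the text.

```python
# Adjustment factor = straight average of a returning player's
# public-private differentials (in percentage points) over up to
# the last three years, however many years are available.
historical_differentials = {
    "Mike Piazza": [-9.8, -9.1, -3.8],   # 2015, 2014, 2013
    "Mike Mussina": [-16.8, -10.2],      # only two years on the ballot
    "Gary Sheffield": [3.0],             # first appeared in 2015
}

def adjustment_factor(diffs):
    """Average the differentials for however many years exist."""
    return sum(diffs) / len(diffs)

for player, diffs in historical_differentials.items():
    print(f"{player}: {adjustment_factor(diffs):+.1f}%")
```

Run as written, this reproduces the figures in the text: −7.6% for Piazza, −13.5% for Mussina, and +3.0% for Sheffield.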
As explained above, I then add or subtract each player's adjustment factor to or from his percentage in the BBHOF Tracker to arrive at an estimate for how this year's private ballots will treat him. Then I combine the private and public counts proportionally based on how many public ballots are known and how many private ballots are expected. Because of last year's purge of the Hall of Fame voter rolls, lower turnout is expected this year: Hall watchers generally agree that about 450 of the now-475 eligible voters will cast ballots. Therefore, if there are 200 public ballots, my projections assume 250 private ballots will be cast. To take an example, if a player is at 30% among public ballots but I project him at 45% among private ones, I would combine those proportionally at a four-to-five ratio (i.e., 200/250) for a final vote projection of 38.3%. This is why, as public ballots become a greater and greater share of the total electorate, my projections get ever closer to the raw polls.
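The proportional blend can be written out the same way. This is a minimal sketch of the worked example above—200 public ballots, an assumed 450 total (hence 250 private), a player at 30% publicly and a projected 45% privately—with the function name `project_final` being my own label.

```python
def project_final(public_pct, private_pct, public_ballots, total_ballots=450):
    """Weight the public and private percentages by their ballot counts."""
    private_ballots = total_ballots - public_ballots
    weighted = public_pct * public_ballots + private_pct * private_ballots
    return weighted / total_ballots

# The example from the text: 30% public, projected 45% private.
print(f"{project_final(30.0, 45.0, public_ballots=200):.1f}%")  # 38.3%
```

Note how the public share dominates as `public_ballots` grows: with 400 of 450 ballots public, the same inputs would land at 31.7%, which is why the projections converge on the raw polls late in the cycle.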
This leaves the ballot's first-time candidates—Ken Griffey, Trevor Hoffman, Billy Wagner, Jim Edmonds, and several non-serious candidates—to reckon with. Because they don't have any vote history of their own to go on, I use the next best thing—the vote history of similar candidates. The polls also allow us to see what kind of voter best correlates with Hoffman voters, Wagner voters, etc., in a phenomenon similar to genetic linkage: if certain candidates are regularly paired up, they probably share the same type of supporter. Sam Miller identified three general strains of voter (think of them as political parties) in an analysis earlier this year; it's the same concept.
The returning candidate most closely correlated with Hoffman is Smith. This is not exactly a shock. The two candidates are extremely similar: they are both relief pitchers, and the best argument for their induction is their prodigious saves totals. The type of voter who would be swayed by saves, and who rejects sabermetric arguments that relief pitchers are not valuable because of how many fewer innings they pitch than starters, would be expected to vote eagerly for Hoffman and Smith in equal measure. Indeed, as of January 2, 88.4% of public Smith voters voted for Hoffman, while just 51.5% of public non-Smith voters did. Meanwhile, 43.2% of public Hoffman voters voted for Smith, while just 9.6% of public non-Hoffman voters did. That's as strong a correlation as you'll find this side of Bonds and Roger Clemens.
Rather than devise an adjustment factor for Hoffman, I calculate his estimated share of private ballots directly under the assumption that the same percentages of Smith voters and of non-Smith voters support Hoffman across the board. For instance, as of January 2, Smith's estimated performance on private ballots was 43.4%; 88.4% of those 43.4% are assumed to support Hoffman, and 51.5% of the remaining 56.6% do too—yielding a private-ballot haul of 67.5% for Hoffman. Once his private performance is calculated, I combine it with his public votes in the same way as every other candidate to get a final projection.
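The linkage calculation for Hoffman can be sketched as follows, using the January 2 figures from the text (Smith projected at 43.4% on private ballots, 88.4% of Smith voters backing Hoffman, 51.5% of non-Smith voters doing so). The function name `linked_private_share` is my own shorthand for the technique.

```python
def linked_private_share(anchor_private_pct, rate_with_anchor, rate_without_anchor):
    """Estimate a newcomer's private-ballot share from a linked candidate.

    Assumes the public-ballot support rates among the anchor's voters and
    non-voters carry over to private ballots. All arguments are percentages.
    """
    with_anchor = anchor_private_pct / 100  # share of private ballots naming the anchor
    return (with_anchor * rate_with_anchor
            + (1 - with_anchor) * rate_without_anchor)

hoffman_private = linked_private_share(43.4, 88.4, 51.5)
print(f"{hoffman_private:.1f}%")  # 67.5%
```

That 67.5% then gets blended with Hoffman's public percentage exactly as for any returning candidate.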
I follow the same logic with Wagner. The lefty is an odd case; he is a relief pitcher, which means it is hard to get new-age voters to take his candidacy seriously—but his Hall of Fame case, less reliant on saves and more on strikeouts and win probability added, is best appreciated with the use of advanced metrics. That leaves him in no-man's land; in political terms, he lacks a true base. He does not correlate especially well with Hoffman and Smith voters, who disdain his smaller number of saves. But he also does not correlate with stathead favorites like Schilling or Jeff Bagwell. It turns out that his closest relationship is an inverse one with three misfit sluggers: Mark McGwire, Sheffield, and Sammy Sosa. In fact, it's a perfect relationship: so far in public ballots, no one who has voted for one of these three men has also voted for Wagner. Because they are the largest sample size, I used McGwire voters and non-McGwire voters to calculate Wagner's popularity on private ballots, but the results are very similar if you use Sheffield or Sosa instead.
We come to a problem when we try to apply this technique to Griffey and Edmonds, however. So far, Griffey is the unanimous choice of public ballots—so a linkage analysis is useless. One hundred percent of everybody's voters voted for Griffey, and there are no non-Griffey voters whose preferences we can survey. Edmonds, meanwhile, has the opposite problem—he has scrounged up just four public votes so far, not a big enough sample size to extrapolate from. For these two, then, I had to calculate adjustment factors a different way: by looking back at past candidates to find a good analog.
Griffey is a well-liked, PED-free superstar with an undeniable statistical case for the Hall. How have those types of first-time candidates fared in the last few years? Randy Johnson's differential was −2.0% from public to private. Pedro Martínez's was −10.4%, and John Smoltz's was −5.5%. Greg Maddux's was −3.6%, Tom Glavine's was −5.9%, and Frank Thomas's was −8.3%. In 2013, his first year on the ballot, Craig Biggio's was −2.9%. Clearly, there are some voters who are simply ignorant, contrarian, or both. They cause Griffey-esque candidates to fall an average of 5.5 points—so his adjustment factor is −5.5%.
Edmonds, meanwhile, was a hard-nosed competitor with a gaudy reputation but stats that are much less obviously Hall-worthy. So far, the electorate appears to be treating him similarly to the glut of borderline candidates swimming against the tide to avoid the 5% elimination threshold. In their first years on the ballot, comparable players Garciaparra (+5.6%), Carlos Delgado (+3.7%), Sheffield (+3.0%), Jeff Kent (−0.9%), and Sosa (−1.4%) mostly gained votes from private balloters. I took their average of +2.0% as Edmonds's adjustment factor.
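The two analog-based factors are just the same straight average applied to comparable past first-timers. This snippet verifies the quoted figures; the variable names and list structure are my own illustration.

```python
# Analog differentials quoted in the text, in percentage points.
# Griffey comps: Johnson, Martínez, Smoltz, Maddux, Glavine, Thomas, Biggio.
griffey_comps = [-2.0, -10.4, -5.5, -3.6, -5.9, -8.3, -2.9]
# Edmonds comps: Garciaparra, Delgado, Sheffield, Kent, Sosa.
edmonds_comps = [5.6, 3.7, 3.0, -0.9, -1.4]

griffey_factor = sum(griffey_comps) / len(griffey_comps)
edmonds_factor = sum(edmonds_comps) / len(edmonds_comps)
print(f"Griffey: {griffey_factor:+.1f}%, Edmonds: {edmonds_factor:+.1f}%")
```

Averaging the seven Griffey comps gives −5.5%, and the five Edmonds comps give +2.0%, matching the adjustment factors above.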
Finally, I've judged Garret Anderson (who, yes, has actually gotten one public vote so far), Jason Kendall, and the other clear long shots to be non-competitive—that is, they won't get more than a handful of throwaway votes, as happens every year for the ballot's weakest links. (Aaron Boone got two votes last year!) Because there's little point in trying to predict whether Mike Lowell will get one vote or two, everyone in this category simply has an adjustment factor of zero.
These methods have served me well in the past. But this year is an exceptional one because of the unknown effect of the purge of the voter rolls. How much of a wrench will it throw into my work? We already know that the purge will drastically change turnout this year, and estimating turnout accurately is important to my methodology. The initial guess of 450 seems logical, but really we have no idea—and we don't have precedent to back that estimate up. It's also very possible that the purge fundamentally altered the difference between public and private voters. Many have speculated that conservative private voters were disproportionately purged—the retired baseball writers who no longer have a newspaper column (or are too old-fashioned to have a Twitter account) with which to share their ballots. This could mean that the remaining stock of private ballots resembles the public snapshot a lot more closely than in years past. On the other hand, we know of plenty of conservative, formerly public voters who were purged this year. Maybe the purge affected both public and private equally, and the adjustment factors will largely hold up.
I admit I'm unsure. But I know better than to construct a new methodology based entirely on theoretical suppositions rather than one that has been proven accurate in the past. I'm bracing myself for unpredictability, and, once known, the effects of the purge may cause me to tweak my methodology next year. But for now, I'm sticking with what I know.