On a recent trip into rural Massachusetts, I spotted multiple Donald Trump signs. The conservative mogul has held three rallies in the Bay State and drawn thousands of spectators each time. And in primary polls of the state, Trump leads by an even bigger margin—an average of 25 percentage points!—than he does in the rest of the country. Against all odds, Massachusetts loves Donald Trump.
This flies in the face of the popular view of Massachusetts as a liberal bastion, even for Republicans. For decades, Massachusetts Republicans have built a reputation as moderate, socially liberal, pro-business consensus-seekers. But while this may accurately describe members of the Massachusetts Republican establishment (such as Governor Charlie Baker), it’s an outdated way of looking at the Republican electorate in the state.
Since 2010, Republican candidates for president, senator, or governor have averaged 44.1% of the vote in Massachusetts—a more vocal minority than you might assume. More than one breed of Republican is needed to build a coalition of this size. Broadly speaking, the Massachusetts GOP today relies on three types of voters: fiscally conservative business elites, socially conservative blue-collar workers, and rural, anti-government culture warriors. These last two groups are Trump’s sweet spot.
The state’s financial elite—and like-minded, pro-business voters—were historically loyal Republicans. From 1920 through 1956, their predictability in presidential elections made Massachusetts an average of 2.1 points more Republican than the nation as a whole. But the election of one of their own in 1960, coupled with the rise of social conservatism from Barry Goldwater through George W. Bush, compelled many of these moderates to reconsider their affiliation.
Today, educated, upper-class Bay Staters are better considered independents and often split their tickets. Most are socially liberal and usually vote for Democrats in presidential elections as social issues have dominated the national stage. But they do occasionally vote for Republicans in state elections, where social issues less often drown out economic ones. These voters were largely responsible for Baker’s election in 2014.
As a coarse, bomb-throwing populist, Trump holds little appeal to these voters. But he benefits from the fact that the wealthy are now a smaller part of the state’s Republican base than ever before. The public faces of the Massachusetts Republican Party, from Baker to former Governor Mitt Romney to members of the state legislature, may remain pulled from this pool of elites, but they are increasingly feeling the heat from conservative activists.
Similarly, as the state GOP has bled establishment voters, its electorate has been distilled to a more hardline crowd. Bay Staters who identify as Republicans today are more likely to be working-class or rural voters who see the world as leaving them behind.
White, blue-collar workers in Boston and the Gateway Cities, along with their middle-class counterparts in the suburbs, have voted Democratic for generations thanks to their economic liberalism and association with organized labor. But this demographic, many of whom belong to the Catholic Church, can also be conservative on issues like abortion, gay rights, and immigration. While many remain loyal Democratic voters, others have defected to support Republicans as the Democratic Party begins to look less like them and more like the rising American electorate of minorities, millennials, progressives, and more. These voters explain the occasional collapse of Democrats in lower-income urban areas they have traditionally won, such as Worcester and Lowell—not coincidentally, the hosts of Trump's two biggest rallies in the state.
These voters' counterparts in rural areas constitute Massachusetts’s chapter of the Tea Party. They are both socially and economically conservative, preferring that government not interfere with their bucolic existence. But they are also culturally conservative, favoring gun rights, venerating the military, and disdaining urban elites of both parties. (These are the voters Scott Brown targeted with his famous pickup truck.) They disapprove of how quickly and in what direction the country is moving and long for a simpler America closer to the Jeffersonian ideal. This demographic is represented by wide swaths of historically Republican towns on the South Shore and in Central Massachusetts, including Tyngsborough, the site of Trump's third Massachusetts rally.
When these two demographics vote in Republican primaries, they back the candidates who speak to their anti-establishment values. Despite his background as a business elite, Trump has assumed that mantle in this election. He has earned a faithful following by railing against immigration, Obamacare, Common Core, lobbyists, and other symbols of elite control of government. His antagonistic style—calling political opponents “losers” and “idiots”—resonates with the anger these voters feel at an allegedly rotten political system. The informal language of his speeches (a study found that Trump speaks at an average of a fourth-grade level) also conforms to these relatively uneducated voting blocs. And his slogan, “Make America Great Again,” is also their most fervent wish.
Trump’s success is the canary in the coal mine for moderate Republicans in Massachusetts. While they can still win general elections with the help of independents, primary elections are decided by partisans, even in an open-primary system like Massachusetts has. And as moderates have shied away from identifying with the GOP, the party has become ideologically purer. The purer it gets, the better Trump will do.
Massachusetts has had a dearth of competitive Republican primaries since the national emergence of the anti-establishment right in 2009–2010, so the two camps have yet to meet on the battlefield. With a March 1 primary date in 2016, it remains uncertain whether the Republican field will still be competitive by the time Massachusetts votes. In the past, candidates like Trump who peaked too early have often faded in favor of the inevitable nominee by the time ballots are cast—so the Massachusetts primary may again be anti-climactic.
But Trump is a sign that old assumptions about the Massachusetts electorate no longer hold in 2016. The Republican base in the state has much more in common with the rest of the country than Massachusetts’s blue veneer lets on. Put another way, “I’m a Republican from Massachusetts” no longer necessarily means “I’m a Massachusetts Republican.”
Tuesday, January 26, 2016
How Trump Dominates in America's Bluest State
Sunday, January 10, 2016
State of the State Schedule 2016
President Obama has promised a "non-traditional" State of the Union address this January 12. Pshaw, say observers of state politics. POTUS wants non-traditional? How about a PowerPoint presentation? Ad-libbing? A barnstorming tour of the state? These are all tactics that marked State of the State speeches last year. These understudied orations give a window into the coming year in politics, and 2016 will be one of the busiest ever. Instead of tuning in to Obama's lame-duck address, spare a minute for the governors who are at the height of their governing powers. Here's something you won't hear from anyone in Congress: most of the lawmaking that affects you (from the Religious Freedom Restoration Act to automatic voter registration, from new gun-control laws to right-to-work) happens in the states. Check below to see when your governor will speak.
Alabama: February 2 at 6:30pm ET
Alaska: January 21 at 7pm AKT
Arizona: January 11 at 2pm MT
Arkansas: No speech in even-numbered years
California: January 21 at 10am PT
Colorado: January 14 at 11am MT
Connecticut: February 3 at noon ET
Delaware: January 21 at 2pm ET
Florida: January 12 at 11am ET
Georgia: January 13 at 11am ET
Hawaii: January 25 at 10am HAT
Idaho: January 11 at 1pm MT
Illinois: January 27 at noon CT (State of the State); February 17 at noon CT (budget address)
Indiana: January 12 at 7pm ET
Iowa: January 12 at 10am CT
Kansas: January 12 at 5:30pm CT
Kentucky: December 8 at 2pm ET (inaugural); January 26 at 7pm ET (State of the Commonwealth Budget Address)
Louisiana: January 11 at noon CT (inaugural); February 11 at 6:30pm CT (televised address); February 14 at 5pm CT (budget address); March 14 at 1pm CT (State of the State)
Maine: No speech in 2016
Maryland: February 3 at noon ET
Massachusetts: January 21 at 7pm ET
Michigan: January 19 at 7pm ET
Minnesota: March 9 at 7pm CT
Mississippi: January 12 at 11am CT (inaugural); January 26 at 5:30pm CT (State of the State)
Missouri: January 20 at 7pm CT
Montana: No speech in even-numbered years
Nebraska: January 14 at 10am CT
Nevada: No speech in even-numbered years
New Hampshire: February 4 at 1:30pm ET
New Jersey: January 12 at 3pm ET (State of the State); February 16 at 2pm ET (budget address)
New Mexico: January 19 at 1pm MT
New York: January 13 at 12:30pm ET
North Carolina: No speech in even-numbered years
North Dakota: No speech in even-numbered years
Ohio: April 6 at 7pm ET
Oklahoma: February 1 at 12:30pm CT
Oregon: April 8 at noon PT
Pennsylvania: February 9 at 11:30am ET (budget address)
Rhode Island: February 2 at 7pm ET
South Carolina: January 20 at 7pm ET
South Dakota: December 8 at 1pm CT (budget address); January 12 at 1pm CT (State of the State)
Tennessee: February 1 at 6pm CT
Texas: No speech in even-numbered years
Utah: January 27 at 6:30pm MT
Vermont: January 7 at 2pm ET
Virginia: January 13 at 7pm ET
Washington: January 12 at noon PT
West Virginia: January 13 at 7pm ET
Wisconsin: January 19 at 7pm CT
Wyoming: February 8 at 10am MT
National: January 12 at 9pm ET
Thursday, January 7, 2016
What We Learned From This Year's Hall of Fame Results
I lost the coin flip.
On Wednesday night, two players—Ken Griffey and Mike Piazza—were lucky enough to join the pantheon of baseball's elite. I had predicted that Jeff Bagwell, by literally the narrowest of margins (0.1 percentage points), would join them. At that margin, though, I knew it was a coin flip, and, ultimately, Bagwell fell short.
That miss soured a great year for my overall Hall of Fame projections. The mean and median errors of my projections were both just 1.5 percentage points, my best mark yet in four years of doing this and almost half the average error I experienced last year. I was also much more consistent this year, nailing eight players' percentages within one point and missing by more than three points on just one player.
Unfortunately, that one player—my biggest miss—was Bagwell: the one player whose fate was actually uncertain, and therefore the one for whom people needed Hall of Fame models the most. My off-kilter shot at him was also particularly glaring because, in a world that focuses mostly on the binary of "elected" and "not elected," I wound up on the wrong side. I can't help but feel like I failed when the one player I misled people on was the one they cared about the most.
Bagwell's showing really and truly surprised me, however. His drop from public ballots to private ballots was a whopping −11.4 percentage points—even bigger than his drop last year, which itself was uncharacteristically large for him (prior to 2015, Bagwell did about the same among public and private voters). This was even more shocking because of this year's purge of the most conservative voters from the BBWAA rolls. I was worried that my projections might prove inaccurate because they overstated drops; never in a million years did I imagine that I would understate them.
Yet I did, and not just for Bagwell. Roger Clemens and Barry Bonds also both dropped by more this year than last—again, odd, because conventional wisdom was that a lot of anti-steroid moralizers had fallen victim to the purge. Alan Trammell also dropped among private voters, despite gaining among this population last year, but I wasn't fooled; because my projections account for the last three years of public-private deviations, I correctly foresaw the direction, if not quite the magnitude, of his drop. Meanwhile, Fred McGriff did the opposite, gaining with private voters this year after suffering from them last year. (This demonstrates why, despite many suggestions that my adjustment factors rely more heavily on recent years' election results, I have stuck with a straight multi-year average.)
That said, many candidates were indeed helped by the purge. Most obviously, candidates like Bagwell, Tim Raines, Edgar Martínez, and Mike Mussina gained huge amounts of ground from last year's totals. But Mussina, Raines, and Curt Schilling were also among the players whose historically huge drops from public to private ballots were blunted a little bit this year. Their increasing acceptance with the private electorate (still a majority of voters) is crucial for their hopes of eventually being elected.
The moral of the story is that, after all that wondering we did about how the purge might affect the public-private differential, it really didn't make a huge difference. Public and private ballots ended up being purged about equally.
Other loose ends: I was worried about predicting the two new relief pitchers on the ballot, Trevor Hoffman and Billy Wagner. I needn't have been. Both ended up following the familiar reliever pattern (see Lee Smith's +11.4 public-private differential this year, which was actually smaller than expected!) of gaining votes with private voters. In fact, I should have been even more aggressive in predicting this, as they were two of the rare players my projections lowballed.
Thank you to everyone who followed along with me this Hall of Fame season, especially those who helped, encouraged, or shared my work. I want to give credit where credit is due: First, any contribution I make to the science of Hall of Fame forecasting is due entirely to Ryan Thibodaux, without whose BBHOF Tracker of public ballots none of this would be possible. I owe a lot of my record accuracy to Ryan, who this year uncovered a far greater percentage of ballots (thus reducing the possible error) than ever before. And I am hardly alone in putting out Hall of Fame forecasts every year; two other forecasters of exceptional skill include Ben Dilday and Tom Tango. Ben's median error of 1.5 percentage points this year matched my own, and Tom's was not far behind at 2.4 points, despite both basing their predictions off a smaller sample of public ballots. If you like my projections, give them some love too.
Labels: Accountability, Baseball, Hall of Fame, Number-Crunching, Predictions
Wednesday, January 6, 2016
2016 Hall of Fame: My Final Call
It's the moment of truth for Hall of Fame watchers. Tonight at 6pm Eastern, the results of this year's election will be announced. Exit polls of over 200 ballots, dutifully collected by Ryan Thibodaux, provide hope for some candidates and great suspense for others—but they can also mislead. As I explained last week, a polling "adjustment" is necessary to arrive at final projections of Hall vote totals.
For several years, my projections, based on historical deviation between public and private ballots, have been some of the most accurate estimates around. With 208 of an estimated 450 precincts reporting, I now issue my not-quite-final-but-close-enough predictions for 2016. (I say not quite final because Ryan's ballot tracker may add a handful more votes before 6pm, which I will add to my projections on this Google spreadsheet. Follow me on Twitter for real-time notifications.) In an extremely close election, I'm forecasting two players will be added to the baseball pantheon.
Ken Griffey Jr. and Mike Piazza are easy calls. At 100% and 86.1% of public ballots and projections of 97.0% and 82.0%, respectively, they have plenty of margin for error in case the polls for them are wrong. For a long time, I projected that they would be joined in the Class of 2016 by Jeff Bagwell, but he has been slipping steadily in the polls for the last several days. Just this morning, he passed under the 75% threshold in my projections (though he remains at 76.9% in Ryan's tracker). However, every poll has a margin of error, and at 74.4%, my Bagwell projection is easily within that range. It's no exaggeration to say that this is the closest Hall of Fame election we've witnessed in the exit-poll era. While my model does expect him to fall just short, more fundamentally it is akin to when a polling model says an election is 50% to 50%: it's too close to call.
Tim Raines is an easier call, in my opinion. Though he is at a similar 76.0% in the exit polls, he has historically lost much more ground than Bagwell once private ballots are accounted for, with drops of −13.4%, −13.3%, and −11.5% the last three years, yielding a projected adjustment factor this year of −12.7%. Bagwell, meanwhile, has seen deviations of −10.7%, −3.5%, and +0.4% the last three years. Many observers dismiss Bagwell's chances of election this year because they expect his vote totals to drop an amount similar to Raines's, as happened last year, but I think this is a hasty assumption. Bagwell was never anathema to private voters the way Raines was until last year, meaning we should probably treat that as an outlier. This is even more true because of the purge of many of the BBWAA's most conservative voters this year. I find it unlikely that Bagwell's drop among private voters will do anything but decrease from last year's figure after this reform.
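The arithmetic behind these adjustment factors can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the exact model: it assumes a player's adjustment factor is the straight average of his last three public-to-private deviations, and that the adjusted percentage is applied only to the ballots not yet made public before being blended with the public ones. The blending step and the names `adjustment_factor` and `project` are illustrative. Under those assumptions, the figures quoted in this post (208 public ballots out of an estimated 450, Bagwell at 76.9% in the tracker) reproduce the 74.4% Bagwell projection.

```python
from statistics import mean

PUBLIC_BALLOTS = 208   # ballots in the tracker, per the text
TOTAL_BALLOTS = 450    # estimated size of the electorate, per the text


def adjustment_factor(deviations):
    """Straight average of past public-to-private deviations (in points)."""
    return mean(deviations)


def project(public_pct, deviations, public=PUBLIC_BALLOTS, total=TOTAL_BALLOTS):
    """Blend the known public share with an estimated private share.

    Assumption: private voters are estimated at public_pct + adjustment.
    """
    private_pct = public_pct + adjustment_factor(deviations)
    private = total - public
    return (public * public_pct + private * private_pct) / total


raines_devs = [-13.4, -13.3, -11.5]   # last three years, per the text
bagwell_devs = [-10.7, -3.5, 0.4]     # last three years, per the text

print(round(adjustment_factor(raines_devs), 1))  # -12.7, as quoted in the text
print(round(project(76.9, bagwell_devs), 1))     # 74.4
print(round(project(76.0, raines_devs), 1))      # 69.2 under the same assumptions
```

Averaging all three years equally is what keeps one unusual year, such as Bagwell's −10.7, from dominating the estimate.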
The purge is already likely responsible for the huge increases that my model predicts for Bagwell (+18.7 percentage points), Raines (+14.2 points), and many others, especially Edgar Martínez (+19.4 points), Mike Mussina (+19.5 points), and Alan Trammell (+17.5 points). These would be among the biggest year-to-year improvements in Hall of Fame history; for them all to happen in the same year is no mere coincidence. Trammell is in his 15th and final year on the ballot, so he will drop off next year regardless, but this year is poised to put not only Bagwell and Raines, but Martínez (forecast at 46.4% in his seventh year), Mussina (forecast at 44.1% in his third year), Trevor Hoffman (forecast at 64.8% in his first year), and Curt Schilling (forecast at 54.5% in his fourth year) in great position for enshrinement sometime in the future. Despite wide speculation that many of the most virulently anti-PED voters were also purged, steroid poster boys Barry Bonds and Roger Clemens are expected to gain only 5.7 and 4.3 points, respectively, from last year's totals—important boosts, but not really out of the ordinary in the historical record.
At the bottom of the ballot, I project that two notable players will drop off the ballot by failing to reach 5% support: Nomar Garciaparra and Jim Edmonds. However, both are expected to have more backing among private ballots than among public ones, and so an invisible wave of support could save either one. When dealing with such small sample sizes (an estimated 23 votes are necessary for a player to survive onto next year's ballot), a small error can make a big difference. One of my personal favorite players, Billy Wagner, is also at some risk of dropping off, although I predict he will safely hit the cutoff with 8.9%. Other than the obvious, one of the things I'm most curious about tonight is how Wagner and his fellow ballot-rookie closer, Hoffman, fare compared to their standing in the polls. I wrestled with this a lot for my projections and ended up deciding that Hoffman will overperform by about five points, while Wagner's haul will remain steady.
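The vote counts behind these thresholds follow from simple arithmetic. Here is a minimal sketch, assuming the estimated 450 ballots cited in this post and that a threshold is met by the smallest whole number of votes at or above the given percentage. The function name `votes_needed` is illustrative.

```python
def votes_needed(threshold_pct, total_ballots):
    """Smallest whole number of votes that is at least threshold_pct percent."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-threshold_pct * total_ballots // 100)


ESTIMATED_BALLOTS = 450  # estimate used in the text

print(votes_needed(5, ESTIMATED_BALLOTS))   # 23 votes to stay on the ballot
print(votes_needed(75, ESTIMATED_BALLOTS))  # 338 votes for election
```

With a cutoff this small, a swing of just one or two ballots is enough to move a player across the 5% line.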
Below are my full projections. My full methodology can be read here. Good luck to all the candidates. NOTE: The numbers below were updated as of 5:55pm on January 6 to reflect all 213 ballots made public before the announcement. They no longer match the text above.
Labels: Baseball, Hall of Fame, Number-Crunching, Predictions
Saturday, January 2, 2016
How to Interpret the Polls of the 2016 Baseball Hall of Fame Election
2016 will be a year of elections—culminating with the selection of our next president, and beginning with something even more important: the Baseball Hall of Fame. On January 6 at 6pm Eastern, we'll learn which players join the pantheon of the game's best—but that doesn't mean it has to be a surprise. A healthy proportion of the Hall of Fame election's 450 votes or so are known in advance, thanks to the dogged work of Ryan Thibodaux. Ryan's addicting BBHOF Tracker aggregates ballots as they are published by their casters: the media members of the BBWAA. By Election Day, the Tracker has historically sampled as much as a third of the electorate—in essence, functioning as an "exit poll" for the referendum.
But we must be careful not to take this poll as gospel. Even the best polls carry margins of error, and Ryan will be the first to tell you that his spreadsheet represents merely data—not a projection. That's where I come in.
For the past several years, I've reweighted raw Hall of Fame data like the BBHOF Tracker's to arrive at scientific projections of the final results. Last year, my projections proved twice as accurate as the raw polls and even outperformed other prominent Hall of Fame forecasters, including one of my heroes, Tom Tango. My methodology simulates the work of political pollsters, who start by surveying the electorate but must weight these raw results using demographics, vote history, and other factors to get final, maximally accurate numbers. (This pollster "skewing" was what many Republicans, including the creator of UnskewedPolls.com, complained about in 2012 when raw data showed Romney ahead but smoothing the data gave Obama the lead—and pollsters' methods were validated when Obama, of course, won by the predicted margin.)
In the case of the Hall of Fame, a certain type of voter is more likely to "respond" to the exit poll by revealing their ballot for input in the Tracker. This median public voter is more stat-savvy, cares less about steroid use, and uses up more spots on the ballot. Editorializing a bit here, they also tend to more carefully and fairly consider their votes—a requirement that goes hand in hand with their obvious belief in the transparency of the process. Meanwhile, voters who choose to stay private tend to base their decisions on narrative, prefer stats like saves and RBI, and are more likely to invoke the character clause against PED users. They are more at peace with voting for just a few players using their own personal, subjective standards—often in patterns that seem random to the rest of us.
To account for the exit poll's oversampling of so-called "progressive" BBWAA members, we have to adjust the raw data for each player up or down. Comparing past exit polls with the final results, it's plain that candidates such as Tim Raines, Barry Bonds, and Curt Schilling are consistently overstated by the polls, while players like Lee Smith, Larry Walker, and Nomar Garciaparra are lowballed by them. I use these historical deviations to come up with an exit-poll "adjustment factor" for each player returning to the ballot (for first-time candidates, calculating the adjustment factor is more complicated; see below). Then, I add each player's adjustment factor, which may be negative, to his percentage in the BBHOF Tracker to estimate his support on private ballots, the key input to my projections.
Below are the current projections along with their underlying numbers. To reflect the new polling data that Ryan collects on a rolling basis, I'll update these projections daily on Twitter and in this Google spreadsheet until the results are announced. These projections—and the polls—get more accurate for each additional ballot released. UPDATE: These are my final projections, issued at 5:55pm on January 6.
For those of you who are interested in the exact methodology of these projections, read on. To calculate the adjustment factor for returning players, I took a straight average of each player's differential between public and private ballots over the last three years. For example, Mike Piazza's differential was −9.8% in 2015, −9.1% in 2014, and −3.8% in 2013, so his adjustment factor is −7.6%. One important point about the distinction between public and private ballots: to best simulate the pre-announcement conditions we're working under, I count as public only those ballots that were revealed before results were released. Therefore, I use Darren Viola's now-defunct HOF Ballot Collecting Gizmo rather than Ryan's spreadsheet for these historical numbers. (All historical exit-poll data can be found in this spreadsheet.) For players who have been on the ballot for only one or two years, I take the straight average of their public-private differentials for however long they've been on the ballot. Ergo, Mike Mussina's adjustment factor is the average of his −16.8% from 2015 and his −10.2% from 2014, or −13.5%, and Gary Sheffield's is simply his 2015 differential of +3.0%.
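For the code-minded, the averaging step might look something like the Python sketch below. The differentials are the ones quoted above; the dictionary layout and the function name `adjustment_factor` are illustrative assumptions, not the actual spreadsheet formulas.

```python
from statistics import mean

# Public-to-private differentials in percentage points, as quoted in the text.
# The data layout (player -> {year: differential}) is an illustrative assumption.
differentials = {
    "Mike Piazza": {2015: -9.8, 2014: -9.1, 2013: -3.8},
    "Mike Mussina": {2015: -16.8, 2014: -10.2},
    "Gary Sheffield": {2015: 3.0},
}

def adjustment_factor(by_year, max_years=3):
    """Straight average of a player's most recent differentials (up to three years)."""
    recent_years = sorted(by_year, reverse=True)[:max_years]
    return mean(by_year[year] for year in recent_years)

for player, by_year in differentials.items():
    print(f"{player}: {adjustment_factor(by_year):+.1f}")
# Mike Piazza: -7.6
# Mike Mussina: -13.5
# Gary Sheffield: +3.0
```

A player with fewer than three years on the ballot simply gets averaged over whatever years exist, which is why Sheffield's factor is just his lone 2015 figure.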
As explained above, I then add or subtract each player's adjustment factor to or from his percentage in the BBHOF Tracker to arrive at an estimate for how this year's private ballots will treat him. Then I combine the private and public counts proportionally based on how many public ballots are known and how many private ballots are expected. Because of last year's purge of the Hall of Fame voter rolls, lower turnout is expected this year: Hall watchers generally agree that about 450 of the now-475 eligible voters will cast ballots. Therefore, if there are 200 public ballots, my projections assume 250 private ballots will be cast. To take an example, if a player is at 30% among public ballots but I project him at 45% among private ones, I would combine those proportionally at a four-to-five ratio (i.e., 200/250) for a final vote projection of 38.3%. This is why, as public ballots become a greater and greater share of the total electorate, my projections get ever closer to the raw polls.
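Here is a minimal sketch of that blending step in Python, using the worked example above. The function name `project_final` is hypothetical, and the +15-point adjustment is simply what the 30%-to-45% example implies; the 450-ballot turnout is the estimate discussed in the text.

```python
def project_final(public_pct, adjustment, public_ballots, expected_turnout=450):
    """Blend known public support with estimated private support.

    Percentages are on a 0-100 scale; `adjustment` is in percentage points.
    """
    private_ballots = expected_turnout - public_ballots
    private_pct = public_pct + adjustment  # estimated support on private ballots
    weighted_sum = public_pct * public_ballots + private_pct * private_ballots
    return weighted_sum / expected_turnout

# 200 public ballots at 30%, 250 private ballots projected at 45%
print(round(project_final(30.0, 15.0, 200), 1))  # 38.3

# With 400 public ballots, the projection sits much closer to the raw 30%
print(round(project_final(30.0, 15.0, 400), 1))  # 31.7
```

The second call shows the convergence at work: the more ballots are public, the less weight the private estimate carries.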
This leaves the ballot's first-time candidates—Ken Griffey, Trevor Hoffman, Billy Wagner, Jim Edmonds, and several non-serious candidates—to reckon with. Because they don't have any vote history of their own to go off, I use the next best thing—the vote history of similar candidates. The polls also allow us to see what kind of voter best correlates with Hoffman voters, Wagner voters, etc., in a phenomenon similar to genetic linkage; if certain candidates are regularly paired up, they probably share the same type of supporter. Sam Miller identified three general strains of voter (think of them as political parties) in an analysis earlier this year; it's the same concept.
The returning candidate most closely correlated with Hoffman is Smith. This is not exactly a shock. The two candidates are extremely similar: they are both relief pitchers, and the best argument for their induction is their prodigious saves totals. The type of voter who would be swayed by saves, and who rejects sabermetric arguments that relief pitchers are not valuable because of how many fewer innings they pitch than starters, would be expected to vote eagerly for Hoffman and Smith in equal measure. Indeed, as of January 2, 88.4% of public Smith voters voted for Hoffman, while just 51.5% of public non-Smith voters did. Meanwhile, 43.2% of public Hoffman voters voted for Smith, while just 9.6% of public non-Hoffman voters did. That's as strong a correlation as you'll find this side of Bonds and Roger Clemens.
Rather than devise an adjustment factor for Hoffman, I calculate his estimated share of private ballots directly under the assumption that the same percentages of Smith voters and of non-Smith voters support Hoffman across the board. For instance, as of January 2, Smith's estimated performance on private ballots was 43.4%; 88.4% of those 43.4% are assumed to support Hoffman, and 51.5% of the remaining 56.6% do too—yielding a private-ballot haul of 67.5% for Hoffman. Once his private performance is calculated, I combine it with his public votes in the same way as every other candidate to get a final projection.
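The linkage arithmetic is a weighted average over Smith voters and non-Smith voters. A rough Python sketch, with a hypothetical function name and the January 2 figures from above plugged in:

```python
def linked_private_share(anchor_private_pct, pct_among_anchor_voters,
                         pct_among_non_anchor_voters):
    """Estimate a newcomer's private-ballot share from a linked returning candidate.

    Assumes, as the text does, that the newcomer's support rate among the anchor's
    voters and non-voters is the same on private ballots as on public ones.
    All inputs and the result are percentages on a 0-100 scale.
    """
    anchor_share = anchor_private_pct / 100
    return (anchor_share * pct_among_anchor_voters
            + (1 - anchor_share) * pct_among_non_anchor_voters)

# Smith at an estimated 43.4% of private ballots; Hoffman gets 88.4% of Smith
# voters and 51.5% of non-Smith voters on public ballots.
hoffman_private = linked_private_share(43.4, 88.4, 51.5)
print(round(hoffman_private, 1))  # 67.5
```

The same function would serve for any other linked pair by swapping in a different anchor candidate's numbers.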
I follow the same logic with Wagner. The lefty is an odd case: he is a relief pitcher, which means it is hard to get new-age voters to take his candidacy seriously, but his Hall of Fame case, less reliant on saves and more on strikeouts and win probability added, is best appreciated with the use of advanced metrics. That leaves him in no-man's-land; in political terms, he lacks a true base. He does not correlate especially well with Hoffman and Smith voters, who disdain his smaller number of saves. But he also does not correlate with stathead favorites like Schilling or Jeff Bagwell. It turns out that his closest relationship is an inverse one with three misfit sluggers: Mark McGwire, Sheffield, and Sammy Sosa. In fact, it's a perfect relationship: so far in public ballots, no one who has voted for any of these three men has also voted for Wagner. Because McGwire's voters make up the largest sample of the three, I used McGwire voters and non-McGwire voters to calculate Wagner's popularity on private ballots, but the results are very similar if you use Sheffield or Sosa instead.
We come to a problem when we try to apply this technique to Griffey and Edmonds, however. So far, Griffey is the unanimous choice of public ballots, so a linkage analysis is useless. One hundred percent of every candidate's voters voted for Griffey, and there are no non-Griffey voters whose preferences we can survey. Edmonds, meanwhile, has the opposite problem: he has scrounged up just four public votes so far, not a big enough sample to extrapolate from. For these two, then, I had to calculate adjustment factors a different way: by looking back at past candidates to find a good analog.
Griffey is a well-liked, PED-free superstar with an undeniable statistical case for the Hall. How have those types of first-time candidates fared in the last few years? Randy Johnson slipped 2.0 points from public to private. Pedro Martínez lost 10.4. John Smoltz's factor was −5.5%. Greg Maddux's was −3.6%, Tom Glavine's was −5.9%, and Frank Thomas's was −8.3%. In 2013, his first year on the ballot, Craig Biggio went down 2.9 points. Clearly, there are some voters who are simply ignorant, contrarian, or both. They cause Griffey-esque candidates to fall an average of 5.5 points, so his adjustment factor is −5.5%.
Edmonds, meanwhile, was a hard-nosed competitor with a gaudy reputation but stats that are much less obviously Hall-worthy. So far, the electorate appears to be treating him similarly to the glut of borderline candidates swimming against the tide to avoid the 5% elimination threshold. In their first years on the ballot, comparable players Garciaparra (+5.6%), Carlos Delgado (+3.7%), Sheffield (+3.0%), Jeff Kent (−0.9%), and Sosa (−1.4%) mostly gained votes from private balloters. I took their average of +2.0% as Edmonds's adjustment factor.
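Both analog-based factors are plain averages of the comparable players' first-year differentials. As a quick check of the arithmetic, here is a small Python sketch; the variable names are illustrative, and the numbers are the ones quoted for the Griffey-style and Edmonds-style comparables.

```python
from statistics import mean

# First-year public-to-private differentials (percentage points) of comparable
# past candidates, as quoted in the text.
griffey_analogs = {
    "Randy Johnson": -2.0, "Pedro Martínez": -10.4, "John Smoltz": -5.5,
    "Greg Maddux": -3.6, "Tom Glavine": -5.9, "Frank Thomas": -8.3,
    "Craig Biggio": -2.9,
}
edmonds_analogs = {
    "Nomar Garciaparra": 5.6, "Carlos Delgado": 3.7, "Gary Sheffield": 3.0,
    "Jeff Kent": -0.9, "Sammy Sosa": -1.4,
}

griffey_factor = mean(griffey_analogs.values())
edmonds_factor = mean(edmonds_analogs.values())

print(f"Griffey: {griffey_factor:+.1f}")  # Griffey: -5.5
print(f"Edmonds: {edmonds_factor:+.1f}")  # Edmonds: +2.0
```

The choice of which past candidates count as comparable is a judgment call, and swapping one name in or out can move these factors by a point or more.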
Finally, I've judged Garret Anderson (who, yes, has actually gotten one public vote so far), Jason Kendall, and the other clear long shots to be non-competitive—that is, they won't get more than a handful of throwaway votes, as happens every year for the ballot's weakest links. (Aaron Boone got two votes last year!) Because there's little point in trying to predict whether Mike Lowell will get one vote or two, everyone in this category simply has an adjustment factor of zero.
These methods have served me well in the past. But this year is an exceptional one because of the unknown effect of the purge of the voter rolls. How much of a wrench will it throw into my work? We already know that the purge will drastically change turnout this year, and estimating turnout accurately is important to my methodology. The initial guess of 450 seems logical, but really we have no idea, and we don't have precedent to back that estimate up. It's also very possible that the purge fundamentally altered the difference between public and private voters. Many have speculated that conservative private voters were disproportionately purged: the retired baseball writers who no longer have a newspaper column (or are too old-fashioned to have a Twitter account) with which to share their ballots. This could mean that the remaining stock of private ballots resembles the public snapshot a lot more closely than in years past. On the other hand, we know of plenty of conservative, formerly public voters who were purged this year. Maybe the purge affected both public and private voters equally, and the adjustment factors will largely hold up.
I admit I'm unsure. But I know better than to construct a new methodology based entirely on theoretical suppositions rather than one that has been proven accurate in the past. I'm bracing myself for unpredictability, and, once known, the effects of the purge may cause me to tweak my methodology next year. But for now, I'm sticking with what I know.
Labels: Baseball, Hall of Fame, Number-Crunching, Predictions