Monday, October 29, 2012

In Defense of Nate Silver

It's Nate Silver's job to analyze the news, so it must have come as quite a shock to him today to find that he had become the news. While criticism of Silver has been out there for a long time, its most recent form has cut straight at the heart of Silver's analysis and represents the same type of anti-intellectual fear that has followed trailblazers like him around for centuries. In a surprisingly acidic POLITICO article, Dylan Byers makes Joe Scarborough's case against Silver and his data-driven polling analyses:
"So should Mitt Romney win on Nov. 6, it's difficult to see how people can continue to put faith in the predictions of someone who has never given that candidate anything higher than a 41 percent chance of winning (way back on June 2) and — one week from the election — gives him a one-in-four chance, even as the polls have him almost neck-and-neck with the incumbent."
Critiques of this ilk betray an inability to even speak intelligently on the subject of statistics, let alone a leg to stand on when presenting a counterargument to the findings of Silver's trademark Electoral College–predicting model. As I write this, Silver and his model give Barack Obama a 74.6% chance of victory on November 6. That number is very prominently labeled on Silver's website as "chance of winning." There's not much ambiguity in that. It should be obvious to anyone looking at that figure that what it means is that, in the judgment of the model, Mitt Romney has a 25.4% chance of winning the presidency.

But Byers, in the passage quoted above, clearly misses that point. Nowhere does it say that those 74.6%-to-25.4% figures are a prediction that Obama will win or that Romney will lose. They are an attempt to take a snapshot of the data and figure out the odds. As Silver told Byers in the POLITICO article, there is still a significant chance that Romney wins: specifically, about a one-in-four chance. If Romney wins, the model was not necessarily wrong. Indeed, in roughly one out of every four of the model's simulated elections (Silver runs 10,001 simulations per day), Romney did win, and it's not a contradiction to say so while still handicapping Obama as the favorite.
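For readers who like to see the arithmetic, here is a minimal sketch of what "a 25.4% chance" looks like across 10,001 simulated elections. This is emphatically not Silver's model, which works from polls and state-by-state outcomes; it is just a weighted coin flipped 10,001 times, and the function name and seed are my own inventions for illustration.

```python
import random

def count_underdog_wins(p_underdog, n_runs, seed=0):
    """Flip a weighted coin n_runs times; count how often the underdog wins.

    A toy stand-in for a forecasting model (an assumption for illustration):
    each simulated election is decided by a single draw with a fixed probability.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(1 for _ in range(n_runs) if rng.random() < p_underdog)

N_RUNS = 10_001    # simulations per day, as cited in the post
P_ROMNEY = 0.254   # Romney's chance of winning, as cited in the post

romney_wins = count_underdog_wins(P_ROMNEY, N_RUNS)
romney_share = romney_wins / N_RUNS

print(f"Romney wins {romney_wins} of {N_RUNS} simulated elections "
      f"({romney_share:.1%})")
```

The underdog wins thousands of those simulated elections while still being the underdog. Any one real election is just a single draw from that pile.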

Scarborough and, apparently, Byers seem to have a problem with this, but they don't seem to understand that this is the scientifically responsible way of doing this sort of thing. There is an academic discipline known as statistics, and its practitioners have been doing this a whole lot longer than any of us. Silver and others trained in this fickle art adhere to time-tested tactics such as the scientific method, gathering as-large-as-possible sample sizes, and acknowledging and even embracing the possibility of error.

In a world of post-debate insta-polls and Senate race rankings that are either Lean Democrat or Lean Republican, we as a society place a huge emphasis on "calling" states, elections, World Series, you name it. Audiences want instant gratification, and pundits give it to them with iron-clad predictions that they finalize and stick to come hell or high water.

What makes Nate Silver so unique—and so valuable—is that he resists that entirely (and yet still manages to be popular; imagine that!), favoring instead a scientifically responsible spectrum. The core tenet of this method lies in the difference between a 49% chance of an Obama win and a 51% chance of an Obama win. For most pundits, those are opposite predictions. On a spectrum, they're virtually identical. Given that it only takes a two-percentage-point swing to make up that difference, that's the right way to think about it.

Likewise, a spectrum always leaves room for some doubt. Even very safe predictions have a small chance of not happening, and a probability spectrum is honest about that fact, setting 99% or 99.9% odds for a very likely event. In other words, a good scientist always leaves room for the possibility that anything from extreme X to extreme Y will occur; the trick of creating a useful spectrum is knowing where to fix the "tipping point" between "lean X" and "lean Y," not picking one or the other. The beauty of a good probability spectrum is that it allows for every possibility. That's because the chance always exists, however small, that something extremely unlikely (e.g., a Romney landslide) will happen. In that sense, spectra like Silver's model will always be accurate.

And maybe that's the problem; skeptics see Silver's model's tolerant spectrum as wishy-washy—an attempt to take credit for being accurate no matter what the outcome. Science has one word for these people: "Tough." We have no choice but to accept this little ambiguity in our lives, because we have no way of ever being certain about anything. I understand that that is unsettling for many people, but that's what being a scientist—or even just being intellectually curious—is all about.

It also doesn't help matters that the exact figures and contours of a probability spectrum are impossible to prove. No one can ever say for sure that, on October 29, Barack Obama had a 74.6% chance of winning the race, even if Romney does win in that landslide. All we will know is the binary outcome: did Obama win or not, and by how much. It takes a much broader body of work to "prove" (to the extent anything can be proved) that those odds were correct—a body of work that, sadly, we'll never have. (You'd need the 2012 election to duplicate itself in future elections exactly the same way through October 29 a few hundred times, then see who won in each of those cases. In laboratories, these types of experiments are possible. Not in political science, where this is only the 57th presidential election in American history.) The best Silver—or anyone mortal in the whole wide world—can do is make an educated guess based on the data we do have. You may criticize which data—which polls or which economic variables—get plugged into Silver's model; you may not ignore science or the discipline of statistics.
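To make the "broader body of work" idea concrete, here is a rough sketch of how one might check a forecaster's odds if many comparable forecasts did exist. Everything in it is made up for illustration: the forecasts are random numbers, the outcomes are drawn so that the imaginary forecaster is right about the odds by construction, and `calibration_table` is a hypothetical helper, not anything Silver uses.

```python
import random
from collections import defaultdict

def calibration_table(forecasts, outcomes, n_bins=10):
    """Group forecasts into probability bins and report how often each bin's
    events actually happened. Returns {bin_index: (count, observed_frequency)}."""
    bins = defaultdict(lambda: [0, 0])  # bin index -> [count, hits]
    for p, happened in zip(forecasts, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[b][0] += 1
        bins[b][1] += int(happened)
    return {b: (count, hits / count) for b, (count, hits) in sorted(bins.items())}

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical data: 20,000 forecasts, each event occurring with exactly
# the stated probability (a perfectly calibrated forecaster by construction).
forecasts = [rng.random() for _ in range(20_000)]
outcomes = [rng.random() < p for p in forecasts]

table = calibration_table(forecasts, outcomes)
for b, (count, freq) in table.items():
    print(f"forecast {b * 10:3d}-{b * 10 + 10:3d}%: {count:5d} events, "
          f"happened {freq:.1%} of the time")
```

With thousands of events per bin, the events given roughly 70-80% odds come true roughly 70-80% of the time, and that is the only sense in which odds can be "proved" right. A single election gives you one event in one bin, which tells you almost nothing either way.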

Yet people do. People rely on their "gut" more than on the data in many more fields than just politics, and Silver has been dealing with them his whole life. As an early employee of Baseball Prospectus, Silver invented the PECOTA system and was an early figure in baseball's sabermetrics. He and co-Moneyball-ers tried to bring a rational, data-driven approach to predicting baseball the same way he has done in politics—and met with the same uninformed ridicule.

Baseball is full of the same "anti-statheads" that have come out of the woodwork in politics recently. You know them as the people who think of pitchers' wins as still a valuable statistic. They're the ones that denigrate WAR by saying that a better measurement of skill is actually how many wins you generate above a replacement level. They believe in momentum in baseball, in "clutch" hitting, and in the idea of lineup protection just because their experiences have led them to.

(Note: I'm painting with an extremely broad brush. In fact, I would like to see more rigorous statistical study on each of those last three. And you can indeed have a reasonable argument with other baseball experts or fans about those things—as long as the argument is empirical and grounded in data and facts, not "general impressions.")

The anti-Silverites in politics we see today are the descendants of the meanest versions of that baseball old guard: the old-timey scout who believes stats and innovation have nothing to offer him; the longtime columnist who bullies and mocks statisticians as "eggheads" or "binder boys." These people are as closed-minded as Nate's probability spectrum strives to be open-minded. The best analysis, and the best predictions, will inevitably come from viewing all available data and considering them holistically. As HardballTalk head blogger Craig Calcaterra says, quite astutely I think, if you worked in any field other than baseball and stubbornly ignored new information and new technology in your job, you'd be fired. Any field other than baseball or politics, I guess.

Maybe it's my wishful thinking, but it seems to me that those people in baseball are, fortunately, becoming more and more marginalized. Unfortunately, though, that's what makes those critics in politics much more dangerous—they are actually "important" pundits who are taken seriously. Indeed, in baseball, people can be as ignorant as they like, but the only real damage they're doing is taking up column inches and maybe, just maybe, encouraging a stupid trade to go down. In the powerful field of politics, ignorance can have a real effect on policies or the next leaders of the United States. They're playing with fire.

That's all the more reason to make sure Silver's voice of reason isn't drowned out. Unfortunately, Nate has caught onto the fact that many people in politics are jerks, and it may hasten his "retirement" from the field of political forecasting. (This most recent incident can't have helped.) But with Nate gone, unlike in baseball, the statistics-ignorant crowd will have won out, and political observers and the viewing public will go on thinking that the tools of ignorance are an acceptable forecasting model for elections.

I happen to roughly agree with Nate's prediction on the outcome of the presidential race, but losing his crystal ball is not even close to the reason his departure would sting. Rather, it's the loss of the reasonable, data-driven approach that he represents and has brought to the fields of baseball, politics, and others. Silver stresses that prediction is an imperfect science that he's just trying to make sense of, not solve. He is a student of the science of prediction, not a prognosticator per se. As he likes to say, it's not about which predictions are right, but rather which are less wrong. Instead of trying to eliminate error like those pundits and their iron-clad predictions, he truly does embrace it and see its value in helping improve subsequent forecasts. He's enhancing the study of predictions and helping us understand how to make them better—a goal that's bigger than elections, in some cases even helping to save lives.

Anyone who has given Silver's New York Times blog a more than cursory read recognizes all this, because Nate goes to pains to point it out (undoubtedly stung by ignorant would-be statisticians before). This isn't weakness, or being "unmanly," as it was distastefully put this week. It's realism and nuance, two traits that are essential for level-headed people in any field—only when it comes to predictors, they're basic qualifications.
