These ponderings become relevant every year around this time, when America votes on whom to send to the All-Star Game. Especially for the Final Vote, a strong regional base of support is as important to winning an All-Star berth as it is to winning a GOP primary. MLB's county-by-county results maps confirm that geography is destiny; the candidates with ties to the nation's most populous areas always seem to win. Is it really any wonder that this year's Final Vote winners both hail from Chicagoland, or that last year's had the Deep South and an entire country to themselves?
This got me wondering: what are the most powerful voting blocs in baseball? Which teams own the most turf, or hold sway over the most people? We know this generally, of course, thanks to measures like the Harris poll. But to my knowledge, no one has ever undertaken the ambitious project of quantifying how many millions of fans each team has. What team has the largest fan base, and how large is it?
Ultimately, the question is unanswerable—at least until the U.S. Census starts asking about baseball fandom. But we can approximate using two data sets that I know are out there: Facebook data and polling data.
Facebook data simply means people who have "liked" a given Major League team on Facebook. Facebook provided this data to the New York Times, where the Upshot created an amazing tool showing fandom by geography. Their map includes the top three teams, by percentage of MLB team "likes," by county and even down to zipcode. It's a beautiful and rich data set, but it's not perfect for our purposes.
- The public-facing data posted on the Times website, at least, only provides the top three teams for each geography, leaving potentially hundreds of thousands of fans uncounted.
- The map doesn't tell us how many baseball fans live in each county or zipcode—just the percentage of total baseball fans there that swear allegiance to X team and Y team.
- An easy solution would be to multiply these percentages by each county or zipcode's population. That would assume everyone in the country is a baseball fan, however. As much as this should be true, it sadly isn't.
- We could always scale the population figures down to 37% (a.k.a. the percentage of adults who told Harris that they are baseball fans). However, not all counties are created equal. Suffolk County in Massachusetts has about the same population as Oklahoma County in Oklahoma, but there are almost certainly more baseball fans in Suffolk. In short, deriving absolute numbers from the Upshot map requires a healthy dose of speculation.
- In addition, Facebook data can be unreliable. Not everyone is on Facebook—it might skew to a younger demographic. People also don't always "like" the things they like, and many people will "like" a zillion things that they don't even like all that much.
- Finally, and most importantly for our purposes, it would just be too damn hard to apply the data to this project. There are 3,141 counties in the United States, and it would take forever to manually multiply each county's Facebook data from the Upshot map by its population. I've contacted the Times to see if they have the data in exportable form but have yet to hear back.
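For the curious, the back-of-the-envelope arithmetic described in the bullets above (population, times the 37% Harris fan rate, times each team's share of "likes") might look something like this. It's only a rough sketch: the `CountyLikes` structure, the function names, and the example county are all made up for illustration, and the flat 37% rate is exactly the shaky assumption flagged above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

# Share of adults who told Harris they are baseball fans (figure from the text).
NATIONAL_FAN_RATE = 0.37


@dataclass
class CountyLikes:
    """Hypothetical record: one county's population and top-three 'like' shares."""
    name: str
    population: int
    like_shares: Dict[str, float]  # team -> share of MLB team "likes" in the county


def estimate_fans(county: CountyLikes, fan_rate: float = NATIONAL_FAN_RATE) -> Dict[str, int]:
    """Estimate fans per team as population x fan rate x share of likes."""
    baseball_fans = county.population * fan_rate
    return {team: round(baseball_fans * share) for team, share in county.like_shares.items()}


def total_by_team(counties: Iterable[CountyLikes]) -> Dict[str, int]:
    """Sum the per-county estimates into one running total per team."""
    totals: Dict[str, int] = {}
    for county in counties:
        for team, fans in estimate_fans(county).items():
            totals[team] = totals.get(team, 0) + fans
    return totals


# Placeholder numbers, invented purely to show the arithmetic (not real data).
example = CountyLikes("Example County", 100_000,
                      {"Team A": 0.50, "Team B": 0.25, "Team C": 0.10})
print(estimate_fans(example))
```

Note that the shares for the example county add up to only 85%. The remaining 15% are the fans of teams outside the top three, who simply vanish from the total, which is the first problem listed above.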
| Team | Estimated fans |
|---|---:|
| Boston Red Sox | 9,694,711 |
| New York Yankees | 8,062,618 |
| St. Louis Cardinals | 5,023,469 |
| San Francisco Giants | 4,836,995 |
| Los Angeles Dodgers | 3,334,120 |
| Los Angeles Angels | 2,828,442 |
| Kansas City Royals | 2,204,291 |
| Chicago White Sox | 2,088,561 |
| San Diego Padres | 1,459,699 |
| Tampa Bay Rays | 1,173,900 |
| New York Mets | 643,554 |
| Toronto Blue Jays | 0 |
Take these numbers for what they are—an incomplete answer to an unanswerable question, and a project that will always be a work in progress. Obviously, 18 states remain unpolled as to their MLB preferences, including rather important ones like Georgia and New York. They will seriously alter the numbers above, such as boosting the Mets' abysmal total and, most likely, launching the Yankees and Braves past the Red Sox and Cubs into a battle for America's most electorally powerful fan base. The 32 states we have polling data for are shaded in black:
As you can see, the Mariners, Orioles, Yankees, Mets, and Braves are going to be underrepresented in the current numbers. The Red Sox and Nationals probably are too, given the absence of several New England states and two-thirds of the DMV. The Phillies are likewise probably feeling the absence of New Jersey and Delaware. However, it's bad news for teams like the Rays, Marlins, Brewers, A's, Padres, and others: there are not a lot of places left for them to accumulate more fans.
Some other flaws with the polling approach:
- Polls look only at people who are registered to vote (and, in this case, only at people who DID vote in Mississippi, Virginia, and West Virginia, which are even smaller universes). This is a majority of people in the country but by no means all of them. Millions of people not registered to vote are likely baseball fans; their preferences aren't included, and we wouldn't know how to extrapolate them anyway. It's very possible that, because of the kind of person who registers to vote vs. doesn't, their tastes are materially different from those polled.
- These polls only survey voters in the United States. This means foreign fans go uncounted, including—crucially—Canadian fans of the Toronto Blue Jays. (This is also a problem with the Upshot's baseball map—which limits itself to the US—although not necessarily with Facebook data inherently.)
- A poll can only provide eight or so possible choices to the question, "What is your favorite MLB team?" before the question becomes too long and loses people. That means some fans' preferences won't be counted, although it's not as bad as the three-team limit in the Facebook data. Using the necessary discretion in choosing which teams to ask about also runs the risk of missing, say, a hidden pocket of Orioles fans in Minnesota.
- Conversely, PPP almost always asks about the four teams with major national fan bases: the Cubs, Red Sox, Yankees, and Braves. This explains why they are so far ahead; if we were able to ask about all 30 teams, each one would gain a sprinkling of a few thousand to a few tens of thousands of fans in each state—enough to add up. Put another way, we really don't know what to do with the "Unknown/Non-Fan" group.
- One advantage to polling vs. Facebook is that it lets people say they don't have a favorite baseball team, presumably because they are not fans of the sport. However, very few people actually take this option in the survey, certainly fewer than the 63% we would expect. Therefore, these polls are probably counting people as fans who are only casual partisans or don't even care about baseball. A Bostonian may consider himself pro-Red Sox even if he doesn't really care for the sport because, hey, why shouldn't he want them to do well? There's thus a concern that these numbers are higher across the board than the real number of "true" fans.
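To put the last concern in numbers, here is a rough sketch of how a state poll turns into fan counts, plus one possible (and entirely speculative) correction that shrinks every team proportionally until fans make up only 37% of respondents. The function names and the example poll are invented for illustration, and the proportional rescaling is my own assumption about how you might deflate the totals. It is not how the numbers in the table were produced.

```python
from typing import Dict

# Share of adults who told Harris they are baseball fans (figure from the text).
EXPECTED_FAN_RATE = 0.37


def fans_from_poll(voters: int, shares: Dict[str, float]) -> Dict[str, int]:
    """Convert poll shares into head counts: voters x share for each team."""
    return {team: round(voters * share) for team, share in shares.items()}


def deflate_to_fan_rate(shares: Dict[str, float],
                        fan_rate: float = EXPECTED_FAN_RATE) -> Dict[str, float]:
    """Shrink all team shares by one common factor so they sum to fan_rate.

    Assumes casual partisans are spread evenly across teams, which is a guess.
    Shares that already sum to fan_rate or less are returned unchanged.
    """
    named = sum(shares.values())
    if named <= fan_rate:
        return dict(shares)
    factor = fan_rate / named
    return {team: share * factor for team, share in shares.items()}


# Placeholder poll, invented for illustration: 80% of respondents name a team.
voters = 1_000_000
poll = {"Team A": 0.40, "Team B": 0.30, "Team C": 0.10}

raw = fans_from_poll(voters, poll)
adjusted = fans_from_poll(voters, deflate_to_fan_rate(poll))
print(raw)
print(adjusted)
```

In this made-up state, 80% of respondents pick a team even though we'd expect only 37% to be fans, so every team's count gets cut by more than half. The ranking of the teams doesn't change, only the absolute totals do.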