Wednesday, April 02, 2008


I was delighted by this study rendering baseball managers' statistics as faces in yesterday's Science Times. It's a demented project:
While reams of categorical data can be imposing and hard to parse, translating the differences among them into facial characteristics can communicate distinctions with striking clarity. By turning rates of bunting, stealing and pinch-hitting into hair sizes, nose shapes and smile widths, Dr. Wang used a kind of statistical Mr. Potato Head to portray the spectrum of managerial characteristics in a way that intrigued even the skippers themselves.

The picture of Tony La Russa is particularly striking. Terry Francona looks very much like the league average. The article notes that you can see resemblances between managers and their protégés; hence Willie Randolph resembles Joe Torre to some extent. What I want to see is a comparison of how those faces change from season to season: Willie Randolph's wide smile was certainly a function of Jose Reyes's stolen base attempts last year (and thus it appears frozen in time even after the September collapse). How much do the habits being measured change over time depending on who's playing and who's having a good year? Viewed that way, some of the resemblances between protégés and mentors might be less striking in certain years. Anyway, it's a charming project.

I also had a good time reading the alternative history of baseball article from Sunday's paper in which two statisticians modeled 10,000 ways that the seasons could have turned out differently to see where Joe DiMaggio's 56-game hitting streak fit in:
In a fit of scientific skepticism, we decided to calculate how unlikely Joltin’ Joe’s achievement really was. Using a comprehensive collection of baseball statistics from 1871 to 2005, we simulated the entire history of baseball 10,000 times in a computer. In essence, we programmed the computer to construct an enormous set of parallel baseball universes, all with the same players but subject to the vagaries of chance in each one.

Here’s how it works. Think of baseball players’ performances at bat as being like coin tosses. Hitting streaks are like runs of many heads in a row. Suppose a hypothetical player named Joe Coin had a 50-50 chance of getting at least one hit per game, and suppose that he played 154 games during the 1941 season. We could learn something about Coin’s chances of having a 56-game hitting streak in 1941 by flipping a real coin 154 times, recording the series of heads and tails, and observing what his longest streak of heads happened to be.

Our simulations did something very much like this, except instead of a coin, we used random numbers generated by a computer. Also, instead of assuming that a player has a 50 percent chance of hitting successfully in each game, we used baseball statistics to calculate each player’s odds, as determined by his actual batting performance in a given year.
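The coin-flip model the authors describe can be sketched in a few lines of Python. The 50-50 hit probability and 154-game season come from their hypothetical "Joe Coin" example; the function names and the choice of seed are mine, and this is only a toy version of their approach, not their actual code:

```python
import random

def longest_streak(outcomes):
    """Length of the longest run of consecutive True values (games with a hit)."""
    best = cur = 0
    for hit in outcomes:
        cur = cur + 1 if hit else 0
        best = max(best, cur)
    return best

def simulate_seasons(p_hit, games=154, n_sims=10_000, seed=42):
    """Simulate n_sims seasons of Bernoulli 'hit in this game?' trials;
    return the longest hitting streak observed in each season."""
    rng = random.Random(seed)
    return [
        longest_streak(rng.random() < p_hit for _ in range(games))
        for _ in range(n_sims)
    ]

# Joe Coin: a 50-50 chance of at least one hit per game, 154 games.
streaks = simulate_seasons(p_hit=0.5)
share_56 = sum(s >= 56 for s in streaks) / len(streaks)
```

With a true coin, a 56-game streak essentially never shows up; the real simulations replaced the flat 50 percent with each player's actual game-by-game hit probability for that season, which is what lets long streaks emerge somewhere in the parallel histories.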

Arbesman and Strogatz determined that all sorts of streaks were possible in these alternative histories. It's an utterly charming article, and I was struck by the two letters published about the study, both of which insisted on shrinking the scale of analysis down to the single season or the individual (the first letter writer even asks whether zooming in to the level of DNA would be more or less appropriate).

Such objections aren't new or surprising, but they highlighted for me how hard it can be to compare scales of analysis when responding to a weird project like this one. The letter writers advocate units of analysis (the individual, the single season, the day of the streak, the weather or park conditions on that day) that the statisticians deliberately zoomed out from: working at a larger scale allows different kinds of speculation than those smaller scales permit. The claims made in baseball biographies about a player's individual psychology or personal superstitions are just as untestable (or unbelievable), but most people are more used to conceptualizing the human scale than a scale encompassing 10,000 alternative scenarios. I suspect imaginative work on this latter scale is more accessible to chess or poker players, who are accustomed to conceiving of larger sets of alternative choices and outcomes. How does fantasy baseball let people move among these different levels (assessing player productivity over several seasons, comparing players, picking individual team members, and so on)?


Blogger Michael Mirer on Wed Apr 09, 03:43:00 PM:
Gould's piece suggests that for a streak like DiMaggio's to ever be statistically likely again MLB would have to have multiple career .400 and .350 hitters. So this finding seems way different. We're outside of my expertise here, but I have a hard time believing that all the parallel baseball universes would have been that much different from the actual baseball universe.
Blogger Ben on Wed Apr 09, 09:56:00 PM:
The piece is a classic example of a headline directing the interpretation of a story. What's remarkable about the streak is not that it happened ever, but that it happened so late, at a time when baseball was so competitive that one player couldn't be so much better than everyone else. It looks like it's overwhelmingly, hugely unlikely that it should happen post-Ruth -- from their charts, something like 95% unlikely. In other words, after another 50 or 60 years of baseball, there's only a likelihood in the ballpark of 5% -- higher than people normally assume, but still pretty small -- that anyone will beat DiMaggio's record.