Archive for the ‘Sabermetrics’ Category

The Hardball Times 2008

Thursday, February 7th, 2008

I finally picked up The Hardball Times Annual 2008 in early January and I breezed through the analysis section in a night or two. I’m writing this now because I wasn’t planning on writing a “review” of it - my writing style, if you will, isn’t one that’s able to sum up a fine, 350-page book or add any additional insights in a short essay. That said, I figured I’d jot down some of my thoughts here on some of the articles, now that I have this site/blog. There probably isn’t much here for anyone who’s already read the book or has even rudimentary knowledge of sabermetrics, but I figured maybe I could coax a few newbies into giving the book a chance. By the way, you can pick it up on HBT’s site or, of course, through Amazon.

Anyway, what I was most interested in was the analysis section. Tom Tango has a couple of back-to-back articles using his “with or without you” concept. Basically, there are certain parts of the game that have been tough to evaluate with the available data - first basemen saving bad throws, catchers’ fielding, and so on. Something like passed balls for a catcher is heavily dependent on the pitcher out on the mound, so you can’t just look at raw totals in evaluating a catcher’s performance in this area. As the title (With or Without You) implies, he looks at how a catcher does with a certain pitcher and compares him to all other catchers who caught that pitcher. And then he does this with all possible combinations to get an overall +/- passed balls (or whatever the stat in question is). Tango’s next article uses a similar methodology to evaluate fielding, and specifically Derek Jeter. Let’s use a quick example (I’m just making this up):

Pedro Martinez on the mound
Nomar, when on the field, converts 12% of balls in play into outs
The rest of Pedro’s shortstops convert 11.6% of balls in play into outs

So Nomar is better than his peers with Pedro on the mound. Now, with one combination this might not be that valuable, but add them all up and you have quite a few comparisons. He also breaks it down by park, base configuration, etc. (all at Jeter’s request ; ) A must read, indeed.
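For anyone who wants to see the arithmetic, here is a minimal sketch of that kind of “with or without you” comparison in Python. To be clear, this is my own toy version and not Tango’s actual method: the function name `wowy`, the tuple layout, the numbers, and the choice to weight each pitcher by the player’s own balls in play are all assumptions made for illustration.

```python
def wowy(samples):
    """Weighted 'with or without you' difference in out-conversion rate.

    samples: list of (bip_with, outs_with, bip_without, outs_without),
    one tuple per pitcher. 'with' means the player was on the field,
    'without' means any other player at the position behind that pitcher.
    """
    total_weight = 0.0
    weighted_diff = 0.0
    for bip_with, outs_with, bip_without, outs_without in samples:
        if bip_with == 0 or bip_without == 0:
            continue  # no comparison possible for this pitcher
        rate_with = outs_with / bip_with
        rate_without = outs_without / bip_without
        # Assumption: weight each pitcher by the player's own balls in play.
        weighted_diff += bip_with * (rate_with - rate_without)
        total_weight += bip_with
    return weighted_diff / total_weight if total_weight else 0.0


# Made-up numbers: (BIP with, outs with, BIP without, outs without)
samples = [
    (500, 60, 1000, 116),  # pitcher A: 12.0% with the player vs 11.6% without
    (300, 39, 600, 72),    # pitcher B: 13.0% with the player vs 12.0% without
]
diff = wowy(samples)
print(f"{diff:+.5f} outs per ball in play relative to peers")
```

A positive result means the player converted more balls in play into outs than the other guys did behind the same pitchers. A real version would need a more careful weighting scheme and would also split by park, base configuration, and so on.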

John Walsh has a great piece entitled “The Origin of the Platoon Advantage.” In the second part of the article, Walsh utilizes the PITCHf/x data, which has been a tremendous boon to many researchers over the past year. Anyway, he uses this new pitch tracking technology to classify pitches and then look at platoon splits for each pitch. For instance, when the batter has the platoon advantage on a slider, his average on balls in play is .340. When the pitcher has the advantage … it’s .304. I’ll let you read the article for all the gory details and the conclusions, but it’s certainly an interesting study and a great area for future research.

Another article I’ll briefly talk about is Greg Rybarczyk’s “Of Home Runs and Free Agents.” The first half of the article is a detailed report on all the home runs of 2007, using Greg’s (yes, I think we’ll just go with Greg) Hit Tracker system, which models the flight of a baseball to find each homer’s “true” distance. The second part, however, is even more interesting, at least to me. He now uses Hit Tracker to not only analyze home runs, but all balls in play (in this case, for Torii Hunter and Andruw Jones). On one table you can see the relationship between BABiP and speed off bat. As you might expect, a higher SOB, in general, produces a higher BABiP. This type of stuff - the PITCHf/x data, Hit Tracker, etc. - is really the future of sabermetrics. If you could model every ball hit in play by a hitter (which he does), you could take out the fielding aspects, the parks, and so on, to get a much clearer picture of how a hitter performed in a given sample of games. It could lead to knowledge of when to play a shift, when a player may be injured, when a hitter got unlucky, and numerous other things.

Just so it doesn’t appear like I’m looking at things through rose-colored glasses, I was not particularly impressed with Bill James’ article. For being perhaps the most influential sabermetrician ever, James certainly takes enough crap - and I certainly don’t feel like someone who is entitled to dish any out. But his article on clutch hitting is simply not up to the high standards of THT or Bill James (there just isn’t much to it). That said, in his 7 elements of clutch, he does bring up at least 3 interesting, largely unexplored topics (at least to my knowledge) - the opposition, the standings, and the calendar. Stats like WPA (and FanGraphs’ clutchiness) have thus far omitted things like that, and there’s certainly lots of room for research there. Take two homers - both are walk-off grand slams in the same exact situation (base, score, etc.). However, one is hit in September with the team two games down in the standings. The other is in July for a team 15 games back. Which one is more clutch?

Anyway, this is just 4 or 5 of the 10 analysis essays. There are also sections on history, the 2007 season, and more commentary from some of the best bloggers/writers on the ‘net … a bunch of stuff I haven’t even read yet. Also, the back of the book is filled with statistics, both basic and advanced. They have batted ball statistics on all players, stats like GPA, FIP, RZR, as well as month by month team stats (and a million other things).

I don’t know exactly what all of the old annuals used to look like. I’ve picked up BP’s for the past few years, and this is certainly right there with it, if not a couple of notches better (of course, depending on what you’re looking for). My opinion doesn’t mean much, but I certainly recommend it for anybody who happens to be new to sabermetrics. Even if you don’t think you’ll like sabermetrics or this book, give it a try. When you boil it down, it’s just baseball through a different lens.

Sabermetric wiki

Thursday, January 31st, 2008

Tango Tiger has created a wiki page for all things sabermetrics. If you know what you’re doing, go make some pages … that’s the point of the wiki setup, I believe. If not, bookmark the site and go back often when you have a question about something. It could certainly become an invaluable resource, and it has the advantage of having everything all on one page (on a specific subject), rather than just a bunch of various links as I’ve tried to compile here.

Home runs and attendance

Thursday, January 31st, 2008

There are numerous places that do a much (read: MUCH) better job reviewing and discussing various research that is published daily than I ever could (like, for instance, The Book Blog, Baseball Think Factory, Sabermetric Research, etc.). With that said, I figured since the main idea of this site is to link to other research, I may as well (on occasion) keep up with some of the recent stuff here on the blog. Remember, while the rest of this site is generally devoid of my opinion (thankfully for you), the blog will not be. That’s not to imply that my opinion is important - it definitely isn’t.

And yes, in a technical sense, everything linked here is a product of my opinion - but you know what I mean.

Anyway, with that out of the way, David Gassko looks at home runs and attendance over at HBT. He finds that since 1970 both home runs and attendance have gone up (correlation coefficient = .39). Of course, he also correctly notes that correlation does not imply causation. The relationship between attendance and home runs could be mostly coincidental. Take, for instance, the fact that hit by pitch has gradually gone up since 1970. Let’s hypothetically say the correlation coefficient between attendance and HBP is .70. Surely, that’s a pretty good relationship, but it’s most likely not important (i.e., probably coincidental).
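The coincidental-correlation point is easy to see with toy numbers. In the sketch below, both series are invented and simply drift upward over time; neither has anything to do with the other, yet the Pearson correlation still comes out high. None of these values are Gassko’s data, and the variable names are just placeholders.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented yearly values that both happen to trend upward
attendance = [20, 22, 21, 25, 27, 26, 30, 31]    # say, thousands per game
hit_by_pitch = [30, 29, 34, 33, 38, 41, 40, 45]  # say, per team-season

r = pearson(attendance, hit_by_pitch)
print(f"r = {r:.2f}")
```

Any two things that rise over the same stretch of years will look related this way, which is why a league-wide trend on its own doesn’t tell you much.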

So, Gassko goes on to figure that the issue needs to be looked at from a team by team level. Furthermore, you have to account for different variables that may cloud the results - like winning teams (often times teams that hit many homers) drawing high attendance figures because of the … wins (and, thus, not necessarily because of the home runs). He finds that a home run adds an extra ~2,000 fans and that it’s significant at the .001 level.
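Here is a rough sketch of why controlling for wins matters. This is not Gassko’s model or data: the team-seasons are invented, and attendance is built by hand so that each win is worth exactly 200 fans and each home run exactly 20. The “control” is done by stripping the wins effect out of both home runs and attendance and then regressing what’s left, which is one standard way to hold a variable constant.

```python
def slope_intercept(xs, ys):
    """Ordinary least squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def residuals(xs, ys):
    """What is left of ys after removing the linear effect of xs."""
    slope, intercept = slope_intercept(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Invented team-seasons; attendance is constructed, not observed
wins = [70, 75, 80, 85, 90, 95]
home_runs = [150, 140, 170, 160, 200, 180]
attendance = [10000 + 200 * w + 20 * h for w, h in zip(wins, home_runs)]

# Naive: attendance on home runs alone (wins are mixed into the estimate)
naive, _ = slope_intercept(home_runs, attendance)

# Controlled: remove the wins effect from both sides, then regress
hr_resid = residuals(wins, home_runs)
att_resid = residuals(wins, attendance)
controlled, _ = slope_intercept(hr_resid, att_resid)

print(f"naive: {naive:.1f} fans per HR, controlled: {controlled:.1f} fans per HR")
```

Because the better teams in this fake data also hit more homers, the naive estimate gives home runs credit for fans who really showed up for the wins. The controlled estimate recovers the 20 I baked in. A real study would also need significance tests and more variables than just wins.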

The implications are interesting. If two players are essentially equal value-wise, you’d probably want to take the one with more home runs because of the possibility of increased attendance. However, if you’re going to look beyond actual value on the field, I think you then have to consider other things like fan appeal, community involvement, and things of that nature (which I’m sure teams often times do consider). Still, since we already know that extra home runs add to attendance, home run power is particularly valuable. “Fan appeal” is nice, but it may not actually lead to more fans attending ball games (or more profit). I suppose that’s for another study.

Goldmine

Thursday, January 24th, 2008

No, I’m not talking about the upcoming Bill James book - I’m talking about Tango’s old blog on Primer that he’s archived on his site. For someone who didn’t start reading this stuff until 2004(ish) - and didn’t really find Baseball Primer until it was Baseball Think Factory - a lot of it is still new to me. Anyway, I’ve stumbled across these archives a time or two, but never really found my way back, for whatever reason.

I believe something happened with Primer and the links broke, but Tango has his blog and all the discussion on his site, tangotiger.net (check the link above). There are something like 350 archived posts there with many comments from some of the titans of sabermetrics. For someone who is new to sabermetrics, I’m going to guess it’s an essential starting place, perhaps after you’ve been introduced to some basic concepts. For someone like myself, who is still relatively new to sabermetrics and often doesn’t understand what the heck’s going on (and also surfs random baseball blogs at like 3 in the morning), it’s truly a goldmine of information and great discussion.

In terms of incorporating things like this into the archives I’m trying to make here - well, that’s kind of difficult. I’ll certainly try to extract some of the best threads, but there’s no practical way of tracking down every relevant word that’s been written about sabermetrics over the past 15 or so years (and even beyond that, in print, of course). At some point, the quantity of links would hurt the quality of the site, as anyone would likely become overwhelmed. Anyway, on that note, I’m still very interested in any suggestions you may have, if you happen to stumble upon this place. For now, though, I’ve got some reading to do.