January 31, 2013
Yesterday I tweeted out a mercurial link about my “nerd-project”, but today I wanted to share it with you all more formally.
Before that, a warning: this has almost nothing to do with Cleveland sports, so turn back now if that’s the particular tunnel you like looking through.
Basically, this thing started when I was writing a piece a few weeks back. I got tired of looking up stats on several different sites to get all the data I wanted. More often than not, I was toggling between a player’s Baseball-Reference page and his corresponding Fangraphs page. Sometimes, I was even averaging the two sites’ similar (though slightly different) versions of wins above replacement (WAR) to get a fuller picture of the player’s value. This became tiresome.
So I decided to create a tool that would average the two versions of WAR for me. I’d rip down a bunch of data and combine it on a locally installed version of MySQL, and I could have the tool give me a quick and dirty version of a combined wins above replacement. Since Fangraphs’ version is called fWAR and B-R’s version is called rWAR, I decided to call mine “jWAR”. Get it? The “j” stands, quite obviously, for “judicious and well-informed decision to combine other previously discussed versions”. Or “Jon”. Either way.
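For anyone who wants to see the idea in code, here is a minimal Python sketch of the averaging step. It is not the actual tool (that lives in MySQL and Excel); the names `jwar` and `combine` and the `(player, year)` keys are made up for illustration, and it assumes a plain, unweighted average of the two values.

```python
def jwar(fwar: float, rwar: float) -> float:
    """Plain average of the Fangraphs and Baseball-Reference WAR values."""
    return (fwar + rwar) / 2


def combine(fwar_by_key: dict, rwar_by_key: dict) -> dict:
    """Average the two sources for every (player, year) key found in both.

    Keys that appear in only one source are skipped in this sketch; how the
    real workbook treats them is not something this example assumes.
    """
    shared = fwar_by_key.keys() & rwar_by_key.keys()
    return {key: jwar(fwar_by_key[key], rwar_by_key[key]) for key in shared}


if __name__ == "__main__":
    # Placeholder numbers, not real stats.
    fangraphs = {("Player A", 2012): 4.0}
    bbref = {("Player A", 2012): 2.0, ("Player B", 2012): 1.0}
    print(combine(fangraphs, bbref))
```

Keying on player and year keeps the average season-by-season, which matches how both sites report WAR.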
But of course I couldn’t stop there. Baseball-Reference has cool stats like OPS+ and ERA+, which Fangraphs doesn’t house. And Fangraphs has one of my all-time favorite hitting stats called “weighted on-base average” (wOBA) that Baseball-Reference doesn’t have. If I was going to combine all this data, why not go all the way?
And if you’re going to go all the way, why not throw in some pretty graphs and tables to help you easily visualize it all? A no-brainer, right? At least that’s what I told my wife as I buried myself in my databases last weekend.
Anyway, that led me to create a fun little dashboard. And since I did all that work, I thought I might as well share it with you guys. Partially because I’ll be using this tool going forward, and partially in case any of you would ever find something like this to be useful in your own research. There’s really nothing new here, but I really enjoyed having all this data in one place. Maybe you will too.
Some ground rules:
1) Both fWAR and rWAR include a player’s “position player” component as well as his “pitching component”. For non-pitchers, this doesn’t matter (unless you count Dandy Marte as a “non-pitcher”). For pitchers, it means they get their oWAR from their at bats (which will be tiny, but still, I wanted it this way) as well as their normal pitching component. Please keep this in mind.
2) I included only the years 2008-2012. Why? Because that seemed like how far back I’d regularly want to look at this stuff. You want me to go back further? I want a unicorn farm.
3) That means that ANY PLAYER who had an at bat or threw a pitch during that timeframe should be in this file.
4) I’ve locked the structure of the workbook with a password.
5) You’ll need Excel 2007 or later to use the file; it’s about a 5.5 MB download.
6) We’ve stored it (for right now at least) at the bottom of the “ABOUT” section of the website, in case you ever need to get back to it.
7) Feel free to do some QA. I’ve yet to find any big problems, but I’m not suggesting it is perfect by any means. When you find mistakes, shoot them to me at jon (at) waitingfornextyear (dot) com and I’ll check it out.
So have fun and let me know what you think.
29 Comments
Regarding Nick Swisher… Why does fWAR consistently value him higher and what is so different about the calculations that lead to over a 2 win difference in 2011?
I haven’t looked particularly at Swisher’s case, but for most position players, the difference between the two systems will be largely how they calculate the defensive component. For whatever reason, UZR (used by Fangraphs) rates Swisher considerably better defensively than Dewan’s Defensive Runs Saved (used by B-R). That will typically be the difference, and that’s why I like averaging the two, since both systems are prone to small sample size issues.
For pitchers, the difference will be more pronounced. Fangraphs basically uses FIP (so HR, Ks, BBs) whereas B-R uses a R/9 with a defensive and park adjustment. Those approaches are vastly different, and again, I think looking at both can be instructive.
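A rough Python sketch of the two pitching inputs described above may help show why they diverge. The FIP formula is the commonly published one (it also counts hit batters alongside walks); the additive constant changes by league and season, so the default of 3.10 here is only a placeholder. The R/9 function is the raw rate only, without the defensive and park adjustments B-R layers on top, and neither function is a full WAR calculation.

```python
def fip(hr: int, bb: int, hbp: int, k: int, ip: float, constant: float = 3.10) -> float:
    """Fielding Independent Pitching from home runs, walks, hit batters, strikeouts.

    `ip` is innings pitched as a true decimal (66 2/3 innings is 66.667, not 66.2).
    `constant` is league- and season-specific; 3.10 is a placeholder value.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def ra9(runs: int, ip: float) -> float:
    """Runs allowed per nine innings, before any park or defense adjustment."""
    return 9 * runs / ip


if __name__ == "__main__":
    # Placeholder stat line, not a real pitcher.
    print(fip(hr=20, bb=50, hbp=5, k=200, ip=200.0))
    print(ra9(runs=90, ip=200.0))
```

A pitcher who strikes out a lot of hitters but gives up plenty of runs on balls in play will look much better by the first number than the second, which is the kind of gap the averaged figure smooths over.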
When I open it, why does it ask for my full name, date of birth, address, social security number, mother’s maiden name and name of first pet?
Good stuff Jon. Can I negotiate your employment contract when Billy Beane comes calling?
it is the Prince of Nabujari. do not worry, he will keep your secrets safe and give you $10mil once he is able to free his assets from government control.
I love how Brantley is the name that initially popped up upon opening the XLS with his steep upwards curve on full display!
Jon, thank you. I also usually have B-R and FG open and go back and forth between them. It’s going to take some explaining when I start using jWAR in common conversation though 🙂
Not an accident :). We’re optimists around here.
Wish we could have gotten Stanton this offseason. Neat tool you created. Would be nice to be able to filter by team, or sort by last name instead of first name.
Be honest, you wet yourself when you read this didn’t you? 😉
This is a stat geek’s wet dream!
Oh man I could have POSTERIZED you big time but I gave you a pass. You are welcome!
TD’s brainwashing no doubt.
as you can see, Raburn is off the charts – THE WRONG WAY!
that wording was carefully constructed. that is all.
The innuendos abound!
Thanks.
Now, about Russell Branyan… how does he not get a PA last year after OPS+’s of 373 and 252 in 2010 and ’11, respectively? Of course, B-R shows OPS+’s of 124 and 84 in those two years, so maybe there’s a mistake in your calculation. 🙂
(And, yes, I welcome all the heat I will get for looking at Branyan’s page, but, what can I say? I have a healthy sense of humor.)
Yep. This did it. All of the fun has officially been sucked out of baseball. I feel like I just spent 2 hours in JoAnn Fabrics with my mom. Need . . . candy . . .
Hmm. My version has it correct (at the 124 and 84 that you cite). I had to break some database connections before uploading to the site, but that really shouldn’t have affected it. Is everyone else getting these numbers for the Love Muscle as well?
I almost added “team” as an attribute to this, but you can imagine that over a five year sample that would’ve been difficult to do, since guys change teams all the time. (interestingly (or not) this made the BR stats much harder to pull, since they break up players who were traded mid-year into TeamA, TeamB, and TOT)
As for the names, I just used them the way they came out of the respective sites.
Which is why, as has been pointed out to me already, I’m having some trouble with suffixes. Both Ken Griffey Jr and Gary Matthews Jr have some issues. I’ll work on it when I get a chance.
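Both cleanup problems mentioned here (the TeamA/TeamB/TOT split and the suffix mismatch) can be sketched in a few lines of Python. This is only an illustration under assumptions: the row layout with `player`, `year` and `team` fields and the helper names `season_rows` and `name_key` are hypothetical, and the suffix fix assumes the mismatch comes from the two sites writing "Jr" differently or one of them omitting it.

```python
import re

# Trailing generational suffix, with optional comma before and period after.
_SUFFIX = re.compile(r",?\s+(jr|sr|ii|iii|iv)\.?$", re.IGNORECASE)


def name_key(name: str) -> str:
    """Build a join key: collapse whitespace, drop a trailing suffix, lowercase."""
    cleaned = " ".join(name.split())
    return _SUFFIX.sub("", cleaned).lower()


def season_rows(rows: list) -> list:
    """Keep one row per (player, year): the TOT row if present, else the team row."""
    chosen = {}
    for row in rows:
        key = (row["player"], row["year"])
        # A TOT row always wins; otherwise only the first row seen is kept.
        if row["team"] == "TOT" or key not in chosen:
            chosen[key] = row
    return list(chosen.values())


if __name__ == "__main__":
    print(name_key("Ken Griffey Jr."), "|", name_key("Ken Griffey"))
    rows = [
        {"player": "Player A", "year": 2010, "team": "AAA"},
        {"player": "Player A", "year": 2010, "team": "BBB"},
        {"player": "Player A", "year": 2010, "team": "TOT"},
    ]
    print(season_rows(rows))
```

One caveat with dropping suffixes: a father and son with the same name would collapse into one key, so a real fix would want a second field (such as a site player ID) to tell them apart.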
Calling it an average of B-R and FG (which is what it really is) would probably be the best way (for now). It will probably have a different name if some national schmuck gets a hold of it, and would still spit on Cleveland (some have it as a favorite hobby like ESPN). As a fellow stats geek, I gotta try this out.
He’s no Brent Lillibridge that’s for sure!
Jon, I do have a bit of an issue: How would you categorize players on the line? Example: jWAR for Asdrubal Cabrera was 3.0 for 2012. You have it as average for some reason. IMO, it would fall under ‘good everyday player’.
I’m running Excel for a Mac and get a message that not everything may work, so maybe it’s on my end.
sure that’s the best way, but it’s more fun to just tell people jWAR and tell them it’s more comprehensive than either fWAR or rWAR (because by construction it is).
I get the correct numbers.
Also, I now believe that Mark Shapiro is a Mac user.
Those descriptors are really just made up lines in the sand anyway. I forget exactly how I set the conditional formats there, but I wouldn’t put too much stock in sticking too closely to them. To me, a 3 win player is an everyday starter on a decent team or a good player on a bad team. In Asdrubal’s case….well, you make a good argument for the second option there.
Either way, don’t go too overboard with that. After all, I’m pretty sure ‘crumtrulescent’ isn’t even a word 🙂
No, it’s really good, man. Just had a brain fart on a few players since it has been so long since last season. So does it automatically update as well?
Ubaldo’s is quite depressing.