And the beat goes on. First, it was Ben Alamar, ESPN’s Director of Sports Analytics, who created quite the kerfuffle with his initial Kyrie Irving post on Dec. 22. Now, Jeremias Engelmann, the creator of ESPN’s Real Plus-Minus statistic, has continued the lightning-rod topic with a follow-up post on Wednesday.
The two posts have centered on a main topic: That Irving, the soon-to-be 25-year-old Cleveland Cavaliers star, has plateaued since his historic rookie season and does not rank among the very top superstars in the NBA. Both articles from the former NBA front office employees relied upon that Real Plus-Minus statistic heavily featured on ESPN’s website. And because of the ongoing debate, WFNY felt it was appropriate to do a deeper dive into the metric and what it means.
RPM is another in a long line of statistics that try to capture the full value of players and everything they do in their sport. In the MLB community, we see things like WAR, wRC+ and more, whereas the NBA community has cycled through PER, VORP, WS/48 and many others. Each of these statistics has its own positives and negatives. But it seems necessary to take a step back and look at how RPM came to be in the context of this ongoing Kyrie Irving debate.
According to the ESPN definition, Real Plus-Minus attempts to “isolate the unique plus-minus impact of each NBA player by adjusting for the effects of each teammate and opposing player. RPM estimates how many points each player adds or subtracts, on average, to his team’s net scoring margin for each 100 possessions played. The RPM model also yields separate ratings for the player’s impact on both ends of the court: offensive RPM (ORPM) and defensive RPM (DRPM).”
RPM was unveiled with great fanfare by ESPN in April 2014. It was designed mainly by Engelmann, who used to work for the Phoenix Suns and the University of Kansas men’s basketball team. The introductory post and additional assistance were provided by Steve Ilardi, a clinical psychologist at Kansas and a long-time collaborator with Engelmann. Because of the backing from The Worldwide Leader, RPM seems to have become the most “mainstream” of this more advanced breed of basketball metrics. There is a segment of writers who lean on it heavily in their analysis and when publishing pieces.
But there is also a large amount of pushback against RPM. The exact formula is not available to the public, which makes it difficult to understand and trust for many. While the formula itself is very confusing, the context in which it should be used can also get murky. Those who created RPM and use it frequently tell people not to use it as a ranking, but it is occasionally presented as such. On occasion, RPM contains enough outliers to question its overall validity.
WFNYer Jeff Nomina asked me to answer some questions about RPM from a fan’s perspective:
Jeff: So, Jakey Stats, RPM seems to be part of the language of the NBA these days. Do you find it to be a valuable tool when evaluating players?
Jacob: So, my first comment there would be that RPM has value. It represents the advanced nature with which we talk about statistics and basketball in 2016-17. But people, of course, often misrepresent the origin of the statistic, its purpose, and how best to discuss it.
I think RPM does a pretty good job, sort of like John Hollinger’s PER statistic, of spewing out the league’s top players as one might expect. LeBron James and Steph Curry have dominated the rankings year-over-year. Russell Westbrook and Chris Paul have been right there, too. This is always a great spot-check for any new metric and an immediate litmus test for one that shouldn’t be trusted at all.
I enjoy how RPM has equal offensive and defensive components. In any sport, defensive value can be incredibly tricky to identify. Heck, we always debate how offensive value falls into place between shot-creation, shooting efficiency, assists, etc. But because RPM is based on team differential when the player was on the court, I think it does a solid job of taking “value” to a new level and actually splitting the game in half as it should be.
RPM also can be good at finding hidden treasures whose values don’t show up in traditional box score counting stats. Jonathan Tjarks had a great post in spring 2014 about how RPM seemed to overrate Dallas Mavericks backup Jae Crowder. Many of the warnings from that post still ring true and I’ll get to them later. But in the nearly three years since, Crowder has emerged as a legitimately good player in the league and was rewarded with a five-year $35 million deal.
So, despite the known downsides that I’ll list out below, I’m not saying to ignore ESPN’s Real Plus Minus. It epitomizes the evolution of basketball analytics and the progress of statistics in sports. I think it still has value in the way we talk about the NBA today. I’d mostly just encourage a healthy dose of skepticism toward RPM – and any new all-knowing, all-powerful metric you see out there on the interwebs.
Jeff: We don’t know all of the ingredients that go into RPM, but do we know enough that you feel confident the numbers are providing the insight that is intended?
Jacob: For the most part, yes. For a great overview on Real Plus-Minus, I’d encourage you to read Alex Schuhman’s post from when the stat was unveiled in spring 2014. He detailed the evolution of analytical thinking from traditional plus-minus on the box score to basic Adjusted Plus Minus (APM) to Regularized Adjusted Plus Minus (RAPM) to ESPN’s final product. Many other sites had versions of these stats well before ESPN went all-out with its own edition. Schuhman explained in clear layman’s terms what the statistic actually represents and several common errors in writing about it.
Austin Clemens and Daniel Myers also had good explainers on the topic of plus-minus statistics. One topic they both covered was collinearity. This is the statistical term for when two predictive variables (i.e. the value of two players) are highly correlated. In basketball terms, it shows up when two players’ minutes overlap heavily, such as when they play together quite frequently. The term “Regularized” in RAPM typically indicates that the stat was created using ridge regression, which aims to minimize the potential errors from collinearity.
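To make the collinearity point concrete, here is a minimal Python sketch. It is not ESPN’s formula, which is not public. It is a toy regression on made-up stints, where the player labels, the "true" values, the noise level and the penalty of 50 are all assumptions chosen for illustration. Real plus-minus models also track all ten players on the floor, which this toy version skips.

```python
import numpy as np

def fit_plus_minus(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y. With lam = 0 this is ordinary least squares."""
    n_players = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_players), X.T @ y)

rng = np.random.default_rng(0)
n_stints = 400

# Hypothetical on-court indicators (1 = on the floor during the stint).
# Players A and B share the floor in all but 8 stints, so they are nearly collinear.
a = rng.integers(0, 2, n_stints).astype(float)
b = a.copy()
flipped = rng.choice(n_stints, size=8, replace=False)
b[flipped] = 1 - b[flipped]
c = rng.integers(0, 2, n_stints).astype(float)
X = np.column_stack([a, b, c])

# Made-up "true" impacts plus noisy stint margins.
true_beta = np.array([4.0, 1.0, 0.0])
y = X @ true_beta + rng.normal(0, 12, n_stints)

ols = fit_plus_minus(X, y, lam=0.0)     # unregularized, like basic APM
ridge = fit_plus_minus(X, y, lam=50.0)  # ridge penalty, the "R" in RAPM

print("OLS estimates:  ", np.round(ols, 2))
print("Ridge estimates:", np.round(ridge, 2))
```

With so few stints separating A from B, the unregularized fit has very little information for splitting credit between them. The ridge penalty pulls every estimate toward zero, which trades some bias for steadier numbers.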
However, there still remains an occasional problem of small sample sizes when attempting to isolate the value of a single variable, especially when dealing with just one season of data (or less). This is why many RAPMs often are based on several seasons of data. And why almost all will have various box-score and/or age components to them, in order to balance out the small sample sizes. This transforms the stat to something more substantive about predicting future results, as opposed to just describing the past.
But alas, since 2014-15, ESPN’s RPM has focused solely on single-season data. This makes RPM solely a descriptive statistic about the season at hand. And when we’re dealing with a half-season of data or less, we’re probably just getting a lot of the box-score components skewing the data in a certain direction.
Another notable decision made by ESPN is not to present confidence intervals or error bars alongside its RPM leaderboard. For the average consumer of basketball news and information, this perhaps makes sense for simplicity. But without knowing the potential error of the stat, it’s portrayed as if it were a simple formula (like a shooting percentage or PER), as opposed to a regression-based projection with a certain range of what we think might be accurate at a moment’s notice.
So when we have a full season’s worth of data – which, of course, we only have at the end of the season – then I think RPM is pretty good at doing its stated job. But it lacks powerful predictive value because it is focused solely on the single season at hand. And the ongoing sample size issue is too great for me to ignore when presented with the metric without any additional context or margin for error.
Jeff: Thanks for mathing all the math for those of us who maybe don’t understand how the sausage is made. It seems that RPM is useful in a sense that it can confirm something you’ve noticed with the eye test, but is likely not accurate enough to cite as a hard metric. Do you have any examples of RPM being wonky and maybe misrepresenting what is actually happening?
Jacob: Totally. And it goes hand in hand with my comments above about sample size and box score components. Let’s first take a look at the case of Trevor Ariza, the Houston Rockets forward (who would’ve been a Cavalier if LeBron James had not returned in July 2014). As of Wednesday night, Ariza had a +4.07 RPM (+2.23 on offense, +1.84 on defense). That looks pretty favorable in the entire league.
However, Ariza had played 1,303 of his 1,361 minutes during the 2016-17 season alongside top MVP candidate James Harden. So in Houston’s first 40 games, there was only a sample size of 58 Ariza minutes without Harden. Harden had only played 161 minutes without Ariza. So the success of each player, alone and together, was obviously intertwined with the success of the entire Houston team.
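The overlap is easier to appreciate as a share of minutes. The short Python sketch below uses only the minute totals quoted above. The function name is hypothetical, and Harden’s total is derived by adding his 161 minutes without Ariza to the 1,303 shared minutes.

```python
def share_with_teammate(total_minutes, minutes_together):
    """Fraction of a player's minutes spent alongside a given teammate."""
    return minutes_together / total_minutes

minutes_together = 1303      # Ariza and Harden on the floor together
ariza_total = 1361           # Ariza's total minutes
harden_without_ariza = 161   # Harden's minutes without Ariza

ariza_without_harden = ariza_total - minutes_together
harden_total = minutes_together + harden_without_ariza  # derived, not quoted

ariza_share = share_with_teammate(ariza_total, minutes_together)
harden_share = share_with_teammate(harden_total, minutes_together)

print(f"Ariza minutes without Harden: {ariza_without_harden}")
print(f"Ariza's share of minutes with Harden: {ariza_share:.1%}")
print(f"Harden's share of minutes with Ariza: {harden_share:.1%}")
```

Roughly 96 percent of Ariza’s minutes came next to Harden, so almost everything a plus-minus model knows about Ariza this season is also something it knows about Harden.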
During the previous two seasons, Ariza’s RPM was +1.58 and +0.57. Harden’s RPM has gone from +8.50 to +4.39 to now +6.15. So I think RPM is giving perhaps a bit too much credit here to Ariza that maybe should belong to Harden instead. Any metric would have a difficult time assigning the credit properly based on the minute distribution in this example. You’d be basing too much of the differentiation between the two players on those small sample sizes and/or box-score components only.
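This is also where the missing error bars would help. As a rough sketch only: ESPN does not publish a standard error for RPM, so the value of 1.5 below is purely hypothetical, as are the function names. The two estimates are the current Ariza and Harden figures quoted above.

```python
def interval(estimate, std_error, z=1.96):
    """Return the (low, high) bounds of a rough 95% interval around an estimate."""
    margin = z * std_error
    return estimate - margin, estimate + margin

def intervals_overlap(first, second):
    """True if two (low, high) ranges share any values."""
    return first[0] <= second[1] and second[0] <= first[1]

ASSUMED_STD_ERROR = 1.5  # hypothetical; ESPN does not publish this

ariza = interval(4.07, ASSUMED_STD_ERROR)
harden = interval(6.15, ASSUMED_STD_ERROR)

print(f"Ariza:  {ariza[0]:+.2f} to {ariza[1]:+.2f}")
print(f"Harden: {harden[0]:+.2f} to {harden[1]:+.2f}")
print("Ranges overlap:", intervals_overlap(ariza, harden))
```

Under that assumed error, the two ranges overlap, and a leaderboard that shows only the point estimates hides that. A different standard error would change the picture, which is exactly why it would be useful to see one.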
By nature of the single-season metric, RPM also can vary wildly from season to season – as I demonstrated in this public Tableau dashboard. Washington’s Bradley Beal is a great example here. His RPM has gone from +2.48 to -2.24 to +2.31 over the past three seasons, as of Wednesday night. Victor Oladipo, of Orlando and now Oklahoma City, has bounced around from -0.31 to +2.59 to +0.15. Miami’s Goran Dragic has bounced around from -0.34 to +2.12 to -0.69. LA’s Lou Williams has gone from +2.52 to -1.80 to +1.73.
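A quick way to size up those swings is to compute each player’s largest season-to-season change. The Python sketch below uses only the RPM figures listed above. The dictionary and function names are hypothetical.

```python
# Single-season RPM for the past three seasons, oldest first, as quoted above.
rpm_by_player = {
    "Bradley Beal": [2.48, -2.24, 2.31],
    "Victor Oladipo": [-0.31, 2.59, 0.15],
    "Goran Dragic": [-0.34, 2.12, -0.69],
    "Lou Williams": [2.52, -1.80, 1.73],
}

def largest_swing(seasons):
    """Largest absolute change between consecutive seasons."""
    return max(abs(later - earlier) for earlier, later in zip(seasons, seasons[1:]))

swings = {player: largest_swing(values) for player, values in rpm_by_player.items()}

for player, swing in sorted(swings.items(), key=lambda item: item[1], reverse=True):
    print(f"{player}: largest one-year change of {swing:.2f} points per 100 possessions")
```

For scale, every one of these four players had a one-year change bigger than the 2.08-point gap between Ariza’s +4.07 and Harden’s +6.15 this season.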
I don’t think it’s cherry-picking to point out that RPM doesn’t agree on individual players very much from year to year. It happens in basketball too where context changes, players change and it’s difficult to ever get a “true” sense of how valuable a player might be. So I think RPM as a snapshot at a moment in time can be somewhat misleading in terms of what it actually represents. And it again leads to some cause for skepticism.
Jeff: For whatever reason, Kyrie Irving appears to be of intense focus for the RPM crowd. What is it about Kyrie that makes him such a focus? Does RPM put an emphasis on a player’s position and ranking within that context? Whenever Kyrie is discussed within the RPM community, his assist rate seems to be a focus, losing context that Kyrie is not the primary ball handler for Cleveland.
Jacob: Yes, you should absolutely look at position to get a context of how an individual player’s RPM compares around the league. On average, point guards have the highest ORPM. Centers have the highest DRPM. Shooting guards have the worst RPM. Overall, guards are high on offense and low on defense. It’s plausible this relates to the underlying box score components behind the statistic. I don’t know for certain. But you should absolutely be comparing within context.
For some context: Kyrie Irving has a -1.92 DRPM in the 2016-17 season so far. It was -2.68 in 2015-16 and -0.89 in the 2014-15 season. Overall, not great in the framework of the single-season stats we have available. But note that the average DRPM for a rotation point guard has been around -1.0 during this sample size. And replacement level is set at -2.35 league-wide, per Alex Schuhman’s explainer. So while the positional rankings or the pure negative numbers might look bad on the surface, they should always be treated with care.
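The comparison above can be spelled out in a few lines of Python. This is a sketch under the numbers quoted in this piece: the -1.0 positional average is the approximate figure given above, the -2.35 replacement level comes from Schuhman’s explainer, and the variable names are hypothetical.

```python
POINT_GUARD_AVG_DRPM = -1.0      # approximate average for a rotation point guard
REPLACEMENT_LEVEL_DRPM = -2.35   # league-wide replacement level, per the explainer

irving_drpm = {
    "2014-15": -0.89,
    "2015-16": -2.68,
    "2016-17": -1.92,
}

# Re-express each season relative to the two baselines instead of zero.
vs_position = {s: round(v - POINT_GUARD_AVG_DRPM, 2) for s, v in irving_drpm.items()}
vs_replacement = {s: round(v - REPLACEMENT_LEVEL_DRPM, 2) for s, v in irving_drpm.items()}

for season in irving_drpm:
    print(
        f"{season}: {vs_position[season]:+.2f} vs. average point guard, "
        f"{vs_replacement[season]:+.2f} vs. replacement level"
    )
```

Measured against the positional average, the raw negatives shrink considerably, and in two of the three seasons Irving’s DRPM sits above replacement level.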
I think Kyrie Irving is and will remain a hot-button topic for analysis. I said the same about Andrew Wiggins in a recent WFNY post, too. Players who have a scoring-focused role within the context of their team are notoriously difficult to analyze properly. That’s not necessarily a knock on either player to say they have a “role” on their team. And Irving and Wiggins are in different stratospheres from each other as of the 2016-17 NBA. Yet neither is called upon by their team to produce big rebound numbers, big assist numbers, or the like. Yes, sure, doing such things might pad their stats and make them look better in several topics. But for basketball purposes, that’s not their “role.”
Irving is an easy target, as Allen Iverson or even Kobe Bryant would be if they were playing their prime basketball years amid today’s wave of analytical measures. None of those three guards particularly produced big numbers besides points per game or had to play lockdown defense for 82 games a year. Their efficiency numbers left much to be desired, as oft occurs with smaller guards who carry a large scoring responsibility.
All three were and are superstars, as well, with signature shoes, All-Star Game votes galore and adoring fans worldwide. There will always be counter-skeptics who then question if such stars are truly deserving of this adulation. And so I think, naturally, it becomes a near-Sisyphean task to accurately please all parties with a reasonable assessment of their on-court value.
Jeff: From the outside, it appears that RPM is a difficult tool to use in passing basketball discussions. It isn’t exact enough to use for hard comparisons, and has enough outliers to question guys who might pop out. This is exacerbated by the makers and users of RPM using it in ways that they specifically say it should not be used (rankings, etc.). How would you recommend using RPM correctly? Would you cite the exact RPM figures? Use it more as a percentile? Just say whether a player provided positive or negative value?
Jacob: First and foremost: Please, please stop using it as a ranking. (Yes, I’m even talking to you, Ben Alamar). As I mentioned above, the numbers depend on position and a comparison to replacement value that I think most are simply uninformed about. You should always take a look at how far away the numbers are spread out from each other, as opposed to just the pure rankings.
When a player has a higher or lower RPM than you expect, dive into why that might be the case. Is the ranking misleading, as opposed to the raw number itself? Is the player maybe closer to some players at his own position and/or with similar roles? Does the player play a higher-than-typical proportion of his minutes with certain teammate(s)? How does the player look in other advanced statistics? What do these various numbers tell us about our own biases and the potential gaps within analytics?
If you just expect RPM to say everything in one number, then you’re using it wrong. It helps to portray some degree of value on the basketball court. But it’s not and never will be the end-all, be-all for a discussion about a player. It’s extremely condescending to smart, informed basketball fans when anyone just begins and ends an argument with, “Well look at the RPM.”
Jeff: We’ve seen sites like Baseball Reference (bWAR), Fangraphs (fWAR, wRC+), and Pro Football Focus create metrics that have become part of the language when discussing MLB or the NFL. ESPN released their QBR metric, i.e. “total quarterback rating,” to use in their NFL coverage in 2011 and more recently RPM for the NBA. They also often discuss their Basketball Power Index or Football Power Index. Do you think ESPN is looking to “own” this space as these other metrics have become a common language in the sports world?
Jacob: ESPN is in the television, content and advertising business. If investing in an analytics staff, rolling out these new metrics, developing www.espn.com/analytics and even bringing along Nate Silver’s FiveThirtyEight.com didn’t make sense from a financial perspective, then I don’t think they’d be doing so. I believe many if not all of these ESPN employees are incredibly smart at the sports they cover and best practices for sports analytics. But nuanced, detailed analytical deep-dives and internet-driven clicks don’t often go hand-in-hand.
This likely then leads to the short-form, hot-take-worthy, clickbaity and headline-pandering nonsense that can come from the umbrella of ESPN and analytics. I don’t think FPI, BPI, RPM and QBR are nonsense on their own. I think they’re among the very best statistics we have to measure player value and team strength in their respective sports. But I think ESPN can often be too quick to tout the holier-than-thou significance of its own internal creations, as opposed to adding caveats and detailed definitions in every passing mention.
This makes sense for many reasons. If ESPN simply nuanced the hell out of the stats every time they were presented on TV or online, then they’d be admitting how they’re not necessarily that useful as all-in-one estimates of sport value. For simplicity purposes and ease-of-understanding for the reader, they then leave out some of the gory details and explanations. Again, ESPN is not in the business of academia or statistics as a whole. That’s not the end game here.