Football analytics Q&A with data expert Trey Causey

The New York Times' Fourth Down Bot

Ever since the Cleveland Browns hired Paul DePodesta on Tuesday, the Internet has been buzzing with descriptions and stories of football analytics. For many, the spread of sports analytics from the Moneyball days of the Oakland Athletics to the Daryl Morey-ball of the Houston Rockets, and now to the NFL, was seemingly inevitable.

But what really is football analytics? What does it mean? How can it help the perennially bad Cleveland Browns? I’ve wrestled with those questions in my head over the past 48 hours. And while I may be the Internet’s foremost sports analytics cataloguer — a bold proclamation, to be sure — I didn’t feel qualified enough to answer those questions myself.

If I were to curate some type of primer on football analytics for the Cleveland sports audience, the bulk of the post would have been citing individuals like Bill Barnwell, Brian Burke, Chris Brown, Chase Stuart, Bill Connelly, Benjamin Morris, et al. Instead of doing that, I felt the best thing to do was just to ask someone more qualified than me and have them answer some questions. And so I asked Trey Causey.

Causey writes online at The Spread, a website he founded that is dedicated to data science and sports. He once wrote an entire post on a blueprint for an analytical NFL franchise. He’s also one of the current minds behind the New York Times’ 4th Down Bot. I met Trey at last year’s Sloan Sports Analytics Conference and we’ve kept in touch since. He’s one of the top thought-leaders in the online sports analytics community, and he just happens to concentrate on football.

Jacob Rosen: So let’s just start from the basics. To you, what is football analytics? I often cite this Sports Analytics Blog post of various definitions of “analytics” from sports industry leaders. I’m curious for your thoughts on football, specifically, and how that might differ.

Trey Causey: Football analytics is the rigorous application of data and the scientific method to football to inform decisions and maximize the chances of winning as many games as possible. This could be via player selection (in the draft, in free agency, on game day), in-game decision-making (fourth downs, two-point conversions, etc.), or in the design and analysis of plays. Football analytics is distinct from football “statistics” or football “data,” which I see as just collections of observations about the game which may or may not be useful to anyone.

I think a lot of people equate “analytics” with “numbers,” but that’s simply not the case. Counting things for the sake of counting things doesn’t help anyone. Similarly, building the most advanced statistical model possible with no use case in mind isn’t helpful, either. Analytics is the intersection of counting / modeling and implementation. I think it’s important that football analysts be familiar with concepts like cognitive biases to recognize when how one interprets a data set might be clouded by human error. Why do coaches go for it less on fourth down than the numbers say they should? This isn’t purely a data question, it’s a question of risk aversion, the game theoretic structure of the labor market for coaches, and the psychological implications of both.

JR: Within the context of implementation and football-specific decision-making, one of the main criticisms from Browns fans has been that someone who has worked in baseball for two decades might not be that informed in football. You’ve obviously dabbled a bit in some different sports, but it seems like you’ve done the bulk of your sports work in football. How do you think an individual can be successful applying analytical skillsets across different sports?

TC: I think the general set of analytical skills is sport-agnostic, as it’s more of a way of thinking about how to solve problems with data than about specific subject matter expertise. Those skills travel across domains — it’s not uncommon in academia for researchers who are experts in a field to collaborate with statisticians who are better at analysis than they are. I’ve had many conversations with people who work in a variety of sports and the “baseball knowledge won’t travel to football” doesn’t seem like an opinion many serious sports analysts would hold. There’s much more in common than not across various sports. Further, cross-pollination of ideas from different substantive areas is how new, good ideas come about.

JR: For many people, when they think of “football analytics” they’re immediately turned off. They’ll complain that the game is too complicated for the use of analytics like we’ve seen in baseball or basketball. How would you respond to that line of thinking?

TC: My day job is working as a data scientist in tech. I’ve worked for one of the biggest tech firms in the world, medium-sized firms, and startups. Every single firm has faced tremendously complicated problems with thousands of variables and up to billions of users. The number of times that these firms said that their problems were too complicated for analytics? Zero. Quite the opposite — the thinking is that the problems are so complicated that you must use analytics in order to succeed. The idea that football is somehow special in this regard is preposterous on its face and wouldn’t be taken seriously in almost any other industry.

One way that football is unique compared to other sports is in the small number of games played each year. This means that small advantages gained via analytics are more difficult to detect. Further, small samples are characterized by high variance, so you have to consistently apply analytical methods and be lucky in order to reap the benefits of analytics. Of course, this is true in every sport. There aren’t any champions that don’t benefit from luck along the way. That doesn’t make them any less successful or less of a champion.
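To make the small-sample point concrete, here is a minimal sketch that is not part of the interview. It assumes, purely for illustration, that games are independent and that analytics lifts a team's per-game win probability from 0.50 to 0.55; both numbers and the function names are hypothetical. It uses the exact binomial distribution to ask how often each team still finishes 8-8 or worse over a 16-game season.

```python
from math import comb

def win_distribution(p, games=16):
    """Exact binomial probability of each win total (0..games),
    assuming every game is independent with win probability p."""
    return [comb(games, k) * p**k * (1 - p) ** (games - k)
            for k in range(games + 1)]

def prob_at_most(p, wins, games=16):
    """Probability of finishing with `wins` or fewer victories."""
    return sum(win_distribution(p, games)[: wins + 1])

# Hypothetical numbers: a coin-flip team vs. one with a 5-point edge per game.
baseline = prob_at_most(0.50, 8)
improved = prob_at_most(0.55, 8)

print(f"P(8 wins or fewer | p=0.50) = {baseline:.3f}")
print(f"P(8 wins or fewer | p=0.55) = {improved:.3f}")
```

Under these assumptions the better team still ends up at .500 or worse in more than 40 percent of seasons, which is why a real edge can take years to show up in the standings.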

JR: One of your more popular posts from the past year was when you commented on the sorry state of football analytics. Certainly, the public discourse on the topic hasn’t really improved like it has in the baseball and basketball communities. Do you have any more hope for this now with some of these moves? What do you think comes next?

TC: It’s way too early to tell based on these moves. I think the Browns are taking a very interesting approach and certainly seem to be moving in a certain direction that, from the outside looking in, is analytical. However, I don’t think that a few hires by a single team indicate any kind of shift or move me away from the argument that I made in that post. Many teams already have analytics staffs in place — how they use the advice from those staffs is key. If they ignore those staffs, it doesn’t really matter who they hire. If they don’t give the staffs enough time to make a difference, again, it doesn’t matter. Adopting an analytical approach on the one hand and trying to reconcile it with a “win now” mentality on the other is likely to fail.

JR: While the state of analytics within football front offices is still in its infancy, how do you think the public discourse can change for the better? When there are only 16 games, a lot of mainstream narratives will ride on just one game or one play. To a Browns fan or a Browns journalist, what more do you think they could learn or discuss or write about in order to be more involved in the analytics world?

TC: The only way for public discourse to change for the better is for more analytics people to find a voice in the media, to have the media take them seriously, and to have more teams publicly embrace analytics. It doesn’t help that there aren’t a lot of pro-analytics people in mainstream football media. It’s a much broader issue than just in football; thinking about statistics and probabilities is very hard to do and the brain isn’t really equipped to do so without a lot of practice. Of course, there are lots of smart people who write very well about analytical topics, but you have to seek them out. Mike Lopez, Ben Alamar, and Seth Partnow are all good examples of people who are great at writing about analytics without overpromising or dismissing people who aren’t convinced.

Many thanks to Trey for answering my questions. Make sure to follow @treycausey on Twitter. 

  • Pat Leonard

    This is perfect… really well stated by Causey and good questions to get those points across, Jacob. This should really help a lot of the folks who think that what the Browns are doing is some sort of wizardry and tomfoolery with numbers. As someone who works in big data for a healthcare software company, I can tell you that my industry and just about every other industry you can think of is currently benefiting from analytics. We actually just launched our own analytics team a few months ago and are already reaping the benefits. It would be absurd to think that the Browns and other NFL teams should not do the same.

  • MrCleaveland

    Terrific job, Jacob, by both you and Causey. That was explained to us novices very well and without any puzzling jargon.

    I’m looking forward to the time when we start getting some real specifics on this stuff as it applies to game day decisions, draft day decisions, etc.

    However, I’m afraid that since these will be “trade secrets,” the Browns aren’t likely to release too many specifics to the public. But maybe some spies will leak some to us.

  • markn95

    I enjoyed this article, particularly the argument that the complexity inherent in football argues FOR analytics rather than against it. The only red flag raised in my mind is the notion that coaches don’t “go for it” on 4th down as often as the statistics (excuse me, analytics) say they should. As a fan, that makes sense to me. But I can’t help but imagine a potential HC candidate reading this and bristling. It’s one thing for fans like me to play Monday Morning QB on sports talk radio; it’s another to have someone in the front office second-guessing you with reams of analytical data to back it up. Either we’ll wind up with a perfectly secure head coach who can tolerate stuff like that (are there any guys like that out there?) or someone with no other options who will listen to the analytics guys and do what they say. The pessimist in me says the latter is far more likely.

    And, not to be snarky, can anyone imagine a scenario where a coach decides not to go for it on 4th down during a game and Paul DePodesta, say, texts him to do it?

  • Ezzie Goldish

    Agreed fully on the complexities of football and the need for analytics because of that complexity. That said, as he notes, between the few games, the differences between plays and players, and measuring outcomes vs. expectations, it becomes incredibly hard to implement any suggestions and see what makes them work or not work. In other words, the “why” in the NFL is so much harder to gauge.

  • The_Real_Shamrock

    Informative as always.

    As it pertains to the Cleveland Browns: The proof will be in the pudding. I never buy anything that is a concept so this will be one party I’ll gladly be late for!

  • Abe

    I’ve been waiting for Jacob to weigh in. The links are indispensable. Thank you! In WFNY Jacob we trust!

  • tigersbrowns2

    good read … i think analytics should help in gathering every little bit of info possible on a player to help make better decisions … the only thing they can’t measure is heart & what’s in-between a player’s ears.

  • nj0

    The changes that analytics might bring (like how we think about punting) won’t happen overnight. Not to sound too dramatic, but what we’re talking about is akin to the Copernican Revolution: a paradigm shift that challenges accepted orthodoxy and the establishment. And those types of things take a LONG time to become accepted.

    I think a good equivalent to “going for it on 4th down” is the starting pitcher in baseball. Every team uses them, a strategy which is provably ineffective, yet even the most cutting-edge teams refuse to adopt something new. Bill James has a good piece where he argues that things are slowly changing and will one day be radically different.

    “In one hundred years, the 3-3-3 rotation, or some variation on it, will be the norm in professional baseball. Pitchers will be trained to make shorter appearances every third day. The concept of the pitching staff will undergo a dramatic change. Statistics such as ‘wins’ and ‘saves’ will become antiquated, replaced by other statistics that can better measure this new strategy.”

    http://www.billjamesonline.com/article1004/

  • nj0

    Why not? Teams already use the Wonderlic. And there’s clearly data that suggests how dedicated a player will be. Granted, the information isn’t perfect, but it’s out there.

  • Natedawg86

    3-3-3 rotation works for some, but some pitchers give up more in the first inning and then put up zeros until the 6th or 7th.

  • nj0

    How many? How often? Over how long of a sample? You can win some Holdem hands with 2-7 but that doesn’t make it a hand you should generally be playing.

  • Eric G

    Also to this point, it will take a LONG time to get the data to derive a “less risky” answer (as mentioned in the answers above, due to the limited number of games), and even longer to implement, even if the play-caller is fully on board with the analytic discoveries (an example of the opposite coming from Brian Billick last night wp.me/p14QSB-9XsD) – numbers have to prove the absence of risk, everyone has to be on board, and the situation has to be perfectly aligned to the analytic discovery. It’s not, “go for it on every 4th down.” It’s not even, “go for it on every 4th and 1.” Theoretically, it will be more along the lines of, “go for it on every 4th and 1 when you’re down by 3 or more and you’re past the 50 yard line and only 8 minutes remain.”

  • Garry_Owen

    I confess to being one of those “literally” bored to absolute death (as if there is another, less absolute form of the event) by sports analytics, but I am honestly curious how the whole thing works for football. In baseball, it seems to make sense, as each player basically faces the same event, events, or series of events through the course of a game and season with a basically limited number of outcomes: at bats means pitches means attempts means hits, outs, walks, etc. There’s not much deviation. The football field is more chaotic. Organized chaos. How does analytics measure those things?

    But, I guess here is where I confess that my curiosity only goes so deep. If anyone tries to explain it to me, I will “literally” gouge out my eyes with a spoon.

  • scripty

    Bill Belichick saw that bombs down the sidelines had low completion rates and higher INT rates. Bombs right down the middle worked much more often b/c there were fewer variables involved (QB only had to get length of pass accurate, less directional challenge).

    Bill wanted all bombs thrown down the middle for NE. Teams weren’t throwing bombs down the middle b/c they felt safeties would get to more of them. NE changed its routes to simplify. NE became the most successful team throwing long routes due to this change.

    /analytics, caveman level

  • Garry_Owen

    Okay. That’s good info. And I only got the spoon halfway under my left eye. Keep it caveman.

    It still seems to me that there are just far too many variables on the football field for analytics to have the broad application and impact that it enjoys in baseball.

  • Garry_Owen

    But . . . it still might be.

  • Garry_Owen

    Rooney rule, satisfied.

  • scripty

    If you draft better b/c of analytics, it touches every single play

  • Garry_Owen

    Sure, but that still seems very “cart, horse” to me. If football playing is analytically unquantifiable (at least accurately), then how do you draft well using analytics?

    I’ll just trust that there’s actually a trustworthy, accurate way to do it, and hopefully enjoy the product on the field.

  • RGB

    I understand Dweeb-O-Metrics has become a useful tool, and I hope it pans out for the sake of our beloved Brownies.
    But, until it does, I reserve the right to post appropriate comments representative of the Team Snark company line.

  • Pat Leonard

    Pshhh, what has science ever done for us? “TV OFF!”

    https://pbs.twimg.com/media/CINm7LLWsAAqWvc.png

  • Pat Leonard

    I think it depends on what scope you apply. So you narrow down to things like, “How successful are screen passes for getting a first down on third and long” or “Does arm length correlate to completion percentages”. (And this is my own weak attempt to come up with questions that teams would want to answer). When it comes to plays, there are thousands to use as sample sets.

  • scripty

    draft well using analytics:

    Could be as simple as a few things:
    Say no Pro Bowl DE has come after the 45th pick, and you want an impact DE: don’t take that player in the 3rd round; consider re-aligning your priorities;
    Realize 40-yard times are useless for O-linemen, place priority on the 10-yard dash
    Realize 40-yard times are not indicative of high-end LB rates, use the shuttle drill and 10-yard dash
    It’s simply finding trends, testing the sample data, and then applying the data when making choices or strategy.

    People want to make some of this out as “QBs born in July that played for an SEC school are best taken between 30th and 50th overall”. Mainly you are trying to find data drivers and act on them before the wider market catches on and the value of your choice disappears.

  • Greg Popelka

    Intelligence testing was introduced in Cleveland by Paul Brown.

    And it bears repeating that if he were around, he’d have been on the front edge of this.

  • nj0

    Agreed.

    And it’s not like analytics is just about determining the “correct” play call in every situation. It’s about making informed decisions based on reliable data.

    For all the hoopla surrounding Moneyball and sabermetrics, analytics have had a much smaller impact on in-game decisions when compared to how they’ve changed player evaluation/development and contracts/roster building.

  • nj0

    And it’s worth noting that teams currently use PLENTY of quantified measures to influence who they draft. His forty time is too slow for a WR. His Wonderlic isn’t high enough to play center. His arms aren’t long enough to play tackle. Etc. How valuable any of these are in evaluating players is debatable, but that’s part of what an analytical approach would try to determine.

    It’s the Browns asking – why do we do the things that we do the way that we do them?

  • mgbode

    I love how the truly smartest people don’t need to “sound” smart by using big words and jargon. They are able to portray it so most can at least get a feel for what they are stating. Causey is excellent in this regard.

  • Pat Leonard

    Yep! If the sample size for something you want to find out is too small, you just simply don’t do that study and you move on to the next hypothesis. On a macro level it sounds daunting and nebulous, but it’s really not. Are long arms really an indicator of the best tackles? Do they not matter at all for guards/centers? Do quarterbacks with the ability to run the football pick up 4th downs of 3 yards or less more often than quarterbacks who are purely pocket passers? These are analytical questions that can be answered because the sample size is large enough. The question is really “are you asking the right questions to get answers that lead to filling your biggest needs.”

  • Red State Socialism

    Many teams are benefiting without the use of math gimmicks. What do you even do with a 51-49 decision? It has no practical value.

  • Red State Socialism

    Can’t the other team also calculate the “correct” play of their opponent?

  • Pat Leonard

    Math gimmicks 🙂 I would doubt that a 51-49 majority would result in any specific action. I’m assuming there is some confidence level that needs to be reached.
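The "confidence level" idea in the comment above can be sketched in a few lines. This is only an illustration under stated assumptions: the counts are made up, the 95 percent level and the 50 percent threshold are arbitrary choices, and the function names are hypothetical rather than anything a team is known to use. It computes a Wilson score interval for an observed success rate and only calls an option "clearly favored" when the whole interval sits above the threshold.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a binomial success rate."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / trials
                    + z**2 / (4 * trials**2)) / denom
    return center - half, center + half

def clearly_favored(successes, trials, threshold=0.5, confidence=0.95):
    """True only if the interval's lower bound is above the threshold."""
    low, _ = wilson_interval(successes, trials, confidence)
    return low > threshold

# Hypothetical samples: a 51-49 split vs. a 65-35 split over 100 plays.
print(clearly_favored(51, 100))
print(clearly_favored(65, 100))
```

With these made-up counts, the 51-49 split does not clear the bar while the 65-35 split does, which matches the intuition that a near coin flip should not drive a decision on its own.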

  • Pat Leonard

    In theory yes, if they have the same information at the start and understand how to apply it. In that scenario, it would only remain a competitive advantage for a while. That’s true of anything though. The wildcat formation was only a competitive advantage for a while, but then everyone figured out how to do it and how to defend it.