Statistical Analysis. Humor. Knicks.

Saturday, December 20, 2014

A Layman’s Guide to Advanced NBA Statistics

This guide is intended for those who are interested in modern basketball statistics. To make it more accessible, I’ve decided to forgo the formulas and numbers. At times fans and journalists alike struggle to use stats when it comes to basketball. Often their interpretation is inadequate because they don’t have the right stats to explain what is happening on the court. Even worse is when stats are used improperly to arrive at the wrong conclusion.

Over the past few years basketball statisticians have learned a lot about the game. While most of it is based on the same stats you would see in box scores, the findings go far beyond traditional stats. Evaluation on the team level is the most reliable aspect of basketball statistical analysis. In other words, we’re very sure what factors lead a team to victory. Although statisticians aren’t exactly sure how player stats equate to wins, there are many ways to evaluate individuals better than the classical stats do.

Team Stats

What You Need to Know
When looking at team stats it’s important to understand that some teams play faster than others, which skews their per game stats. Faster paced teams get more chances to score per game simply because they have more possessions. It’s similar to two NFL RBs, both with 1000 yards rushing, but one had 300 attempts and the other only 200 attempts. In this case it’s not enough to know the totals. Instead you have to account for the difference in the number of opportunities. The same applies to team stats.
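This guide skips formulas, but for the curious, the running back comparison can be put into a few lines. This is only a sketch, and the function and variable names are made up for illustration.

```python
def per_opportunity(total: float, opportunities: int) -> float:
    """Return production per opportunity (a rate), not a raw total."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return total / opportunities

# The two running backs from the analogy: same yardage, different workloads.
workhorse = per_opportunity(1000, 300)  # yards per carry on 300 attempts
efficient = per_opportunity(1000, 200)  # yards per carry on 200 attempts

print(f"{workhorse:.2f} vs {efficient:.2f} yards per attempt")
```

Same total, very different rate. That gap is what per game team stats hide.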

So in lieu of viewing how a team performs per game, we calculate how a team does per possession. What’s a possession? A possession ends when a team gives the ball to the other team, usually through a score, a turnover, or a missed shot recovered by the defense. By using points per possession, we’re looking at how many points a team scores when they have the ball on offense. This is called offensive efficiency or offensive rating, and is measured in points per 100 possessions. Basically offensive efficiency answers the question “if this team had the ball 100 times, how many points would it score?” Similarly we can rate defenses by calculating how many points a team allows per possession, called defensive efficiency or defensive rating.
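For anyone who wants to see the arithmetic, here is a rough sketch of points per 100 possessions. The possession estimate is an assumption on my part: it is a commonly used box score approximation, not something defined in this guide, and the 0.44 free throw weight is a convention rather than an exact value. The two teams are hypothetical.

```python
def estimate_possessions(fga, orb, tov, fta, ft_weight=0.44):
    """Rough box score estimate of possessions (assumed approximation).

    Shot attempts and turnovers use up possessions, offensive rebounds
    extend them, and only a fraction of free throw attempts end one.
    """
    return fga - orb + tov + ft_weight * fta


def points_per_100(points, possessions):
    """Offensive (or defensive) efficiency: points per 100 possessions."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * points / possessions


# Two hypothetical teams that both score 100 points in a game.
fast_team = points_per_100(100, 100)  # plays at a quick pace
slow_team = points_per_100(100, 88)   # same points on fewer possessions

print(f"fast: {fast_team:.1f}, slow: {slow_team:.1f} points per 100 possessions")
```

Per game the two teams look identical. Per possession the slower team has the better offense.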

But it doesn’t stop there. We can break down what aspects of the game contribute to those rankings. Offense (or defense) is broken down into 4 crucial factors: shooting, turnovers, rebounding, and free throws. Shooting is by far the most important factor and is best measured by eFG%, which is a better version of FG% (see “Shooting” below). Next come turnovers and rebounding, which are about equal to each other but less valuable than shooting percentage. Like points, turnovers are measured per possession (how many times you cough the ball up when you have it). Rebounding is measured by percentage of missed shots recovered. This is so teams that shoot poorly (and have lots of misses to recover) are judged on an even footing with teams that can shoot. Last and least is free throw shooting. This is measured by free throws made per field goal attempt.
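As a sketch of how the four factors might be computed from a box score, assuming the usual definitions: the offensive rebounding rate below uses offensive rebounds divided by offensive rebounds plus the opponent's defensive rebounds as a stand-in for "missed shots recovered", which is my assumption. All input numbers are hypothetical.

```python
def four_factors(fgm, fg3m, fga, tov, possessions, orb, opp_drb, ftm):
    """Return the four factors for one team's offense as a dict of rates."""
    return {
        # Shooting: a made three counts as 1.5 made field goals.
        "efg_pct": (fgm + 0.5 * fg3m) / fga,
        # Turnovers: how often a possession ends in a giveaway.
        "tov_per_poss": tov / possessions,
        # Rebounding: share of available offensive rebounds recovered.
        "orb_pct": orb / (orb + opp_drb),
        # Free throws: makes per field goal attempt.
        "ft_rate": ftm / fga,
    }


# Hypothetical single-game box score.
factors = four_factors(fgm=38, fg3m=8, fga=80, tov=14,
                       possessions=95, orb=11, opp_drb=30, ftm=18)

for name, value in factors.items():
    print(f"{name}: {value:.3f}")
```

The same function works for defense if you feed it the opponent's numbers instead.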

In 50 Words or Less
Throw away points per game for team stats. Instead use offensive efficiency (or defensive efficiency), which is basically how many points a team would score in 100 possessions. Team stats are broken into four factors: shooting, rebounding, turnovers, and free throws. You can find these stats on basketball-reference (search for “Points Per 100 Possessions” and “Four Factors” on the team pages) and my stat site.

Examples Why
In 2006, Portland ranked 18th in points allowed per game, which means they should have been slightly worse than average. However they finished a paltry 21-61 that year. Their defense wasn’t adequately measured by points allowed per game, because they played at the league’s third slowest pace. Ranked by defensive efficiency they were 29th, which would make their 21 win season more understandable. Of course there’s the 1991 Denver Nuggets.

More Please
Dean Oliver (Points Per Possessions): http://www.rawbw.com/~deano/helpscrn/rtgs.html
Dean Oliver (Four Factors): http://www.rawbw.com/~deano/articles/20040601_roboscout.htm
Kevin Pelton: http://www.nba.com/sonics/news/factors050127.html
Basketball-Reference: http://www.basketball-reference.com/about/factors.html

Player Stats

What You Need to Know
Without a doubt per minute stats are more important than per game stats. This is because per minute stats make valid comparisons between players of varying minutes. Using per game stats in the NBA is like using hits/game in MLB. In 2007 Michael Young averaged 1.29 hits/game to David Ortiz’ 1.22, but Young’s batting average was only .315 to Ortiz’ .332. Young had more hits because he had more at bats (639 to 549), not because he was a better contact hitter. Similarly you might find that one basketball player has better per game stats, but if he had more minutes then the comparison is invalid. Only per minute stats will clarify which player is truly better in a category.

Per-minute stats are commonly expressed per 40 minutes. This is because it’s easier to visualize 2.3 blk/40 min than 0.0575 blk/min. Measuring basketball stats per 40 minutes is similar to measuring earned runs per 9 IP in baseball (ERA). One thing to note: unlike ERA in baseball, basketball players’ per-minute stats stay the same regardless of their playing time. So while baseball relievers have lower ERAs than starters, the same is not true in basketball. Additionally this doesn’t mean a player should play 40 minutes, just as using ERA doesn’t mean that a pitcher should pitch a full 9 innings. It’s just a fair way to compare players.
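The conversion itself is simple. Here is a minimal sketch, with a hypothetical starter and reserve whose numbers are invented for illustration.

```python
def per_40(stat_total, minutes):
    """Scale a counting stat to a per 40 minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return 40.0 * stat_total / minutes


# The blocks figure from the text: 0.0575 blocks per minute.
blocks_per_40 = per_40(0.0575, 1)

# Hypothetical players: per game blocks and minutes per game.
starter = per_40(2.0, 36)  # more blocks per game, but far more minutes
reserve = per_40(1.2, 18)  # fewer blocks per game in half the minutes

print(f"{blocks_per_40:.1f} blk/40")
print(f"starter {starter:.2f} vs reserve {reserve:.2f} blk/40")
```

Per game the starter looks like the better shot blocker. Per 40 minutes the reserve comes out ahead.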

In 50 Words or Less
Throw out a player’s per game stats, and look at per-minute stats instead. Per minute stats are usually measured per 40 minutes. Study, after study, after study shows a player’s per minute production to stay the same despite how many minutes they play. You can find them at basketball-reference for historical data, or my stat page for the current season.

Examples Why
Some examples of players that had good per minute numbers, but poor per game numbers due to a lack of playing time: Ben Wallace, Jermaine O’Neal, Gerald Wallace, and Michael Redd. Throw in a point guard, and that’s a pretty good team.

More Please
Kevin Pelton’s Stat Primer: http://www.nba.com/sonics/news/stats101.html
The Basketball Notebook’s Primer: http://basketballnotebook.blogspot.com/2005/12/basketball-notebook-stats-primer.html

Shooting

Another stat that should be replaced is FG%. Why? Field goal percentage doesn’t account for the scoring bonus in a three point shot, which is a lower percentage shot. Sharp shooter Kyle Korver’s career FG% (as of 2007) is a lowly 41.3%. If FG% rates a good shooter like Korver so poorly, then it’s obviously not a good stat to use. So replace FG% with eFG% (effective field goal percentage), which compensates for the extra point in a three point shot. Korver’s eFG% is a more robust 53.6%.
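To see why eFG% is fairer, here is a small sketch using the standard definition, where a made three counts as one and a half made field goals. The two shooters are hypothetical, not real players.

```python
def fg_pct(fgm, fga):
    """Traditional field goal percentage."""
    return fgm / fga


def efg_pct(fgm, fg3m, fga):
    """Effective field goal percentage: threes count as 1.5 makes."""
    return (fgm + 0.5 * fg3m) / fga


# Both hypothetical players take 10 shots and score 12 points.
# The shooter makes 4 threes; the big man makes 6 twos.
shooter_fg, shooter_efg = fg_pct(4, 10), efg_pct(4, 4, 10)
big_fg, big_efg = fg_pct(6, 10), efg_pct(6, 0, 10)

print(f"shooter: FG% {shooter_fg:.1%}, eFG% {shooter_efg:.1%}")
print(f"big man: FG% {big_fg:.1%}, eFG% {big_efg:.1%}")
```

FG% says the big man is the far better shooter, while eFG% rates them the same, which matches the points they actually scored.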

But eFG% isn’t the only statistic used to measure a shooter. True Shooting Percentage (TS%) accounts not only for three pointers, but free throws made as well. For instance a player that hits a layup, gets fouled, and hits the extra point is more valuable than the guy that just sinks a jumper. To compare players with respect to their total scoring contribution, this is the stat to use.
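Here is a sketch of TS% using the commonly published formula, points divided by twice the sum of field goal attempts and a weighted count of free throw attempts. The 0.44 weight is the usual convention and should be read as an assumption, and the two plays below mirror the layup and jumper example.

```python
def true_shooting_pct(points, fga, fta, ft_weight=0.44):
    """True shooting percentage: points per scoring attempt, halved."""
    attempts = fga + ft_weight * fta
    if attempts <= 0:
        raise ValueError("need at least one scoring attempt")
    return points / (2.0 * attempts)


# Layup, foul, and the made free throw: 3 points on 1 FGA and 1 FTA.
and_one = true_shooting_pct(points=3, fga=1, fta=1)

# A made jumper with no foul: 2 points on 1 FGA.
jumper = true_shooting_pct(points=2, fga=1, fta=0)

print(f"and-one: {and_one:.3f}, jumper: {jumper:.3f}")
```

The and-one play comes out higher than the plain jumper, which is exactly the extra value FG% and eFG% miss.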

In 50 Words or Less
Field goal percentage (FG%) should be replaced by eFG% or TS%. Effective field goal percentage (eFG%) compensates properly for three pointers, while true shooting percentage (TS%) compensates for three pointers and free throws.

Examples Why
Well I used Kyle Korver above, but otherwise you can look at any player that takes a large number of three pointers or gets (and converts) a lot of free throws. Players like Kevin Martin, Jason Kapono, Manu Ginobili, and Shawn Marion come to mind as players who are misrepresented by FG%.

More please
Kevin Pelton’s Stat Primer: http://www.nba.com/sonics/news/stats101.html
The Basketball Notebook’s Primer: http://basketballnotebook.blogspot.com/2005/12/basketball-notebook-stats-primer.html

Overall Player Value

As I mentioned earlier, it’s not clear exactly how to calculate a player’s worth. However there are 3 main stats that have attempted to give a single number to represent a player’s total contribution. The first and most prevalent is Player Efficiency Rating (PER). Created by John Hollinger, it attempts to add up the good things, subtract the bad things, and account for team pace and minutes played. It’s normalized to 15, which means the average player in the league has a PER of 15. The league’s best players are around 30, while the worst are in the single digits. Following Hollinger is economist Dave Berri (and friends) who came up with Wins Produced and its cousin Win Score. Unlike Hollinger, who chose his equation, Berri and co. statistically derived what factors went into Wins Produced.

But both stats have their weaknesses. According to Wins Produced, PER tends to overrate players that score a lot of points, but do so inefficiently (poor shooting numbers). Meanwhile PER says that Wins Produced overrates strong rebounders that score infrequently. Additionally since they both rely on box score stats, neither captures actions that occur outside of the stat sheet. For instance Bruce Bowen plays tough defense and forces Kobe Bryant to take a bad shot that Tim Duncan rebounds. The stat sheet will record Duncan’s rebound and Kobe’s missed shot, but Bowen doesn’t get any credit for his defense.

One stat that does capture Bowen’s effort is plus/minus. Currently kept by Roland Beech, +/- comes in a few different flavors. Among the most popular are offensive and defensive +/-, which measure how a team does with the player on the court. Also Roland Rating and net +/- attempt to evaluate a player’s value. However plus/minus doesn’t just capture the individual’s effort, it captures the value of his teammates as well. When Bowen and Duncan prevent the Lakers from scoring not only do they get credit for the effort, everyone else on the court gets the credit as well.
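As a simplified sketch of the idea behind defensive +/-, here is an on/off comparison of points allowed per 100 possessions. This is my own illustration, not the exact method used by any stat site, and the numbers are hypothetical. A negative result means the team allows fewer points with the player on the court.

```python
def per_100(points, possessions):
    """Points per 100 possessions."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * points / possessions


def defensive_on_off(allowed_on, poss_on, allowed_off, poss_off):
    """Points allowed per 100 with the player on court minus off court."""
    return per_100(allowed_on, poss_on) - per_100(allowed_off, poss_off)


# Hypothetical season totals for one defender's team.
diff = defensive_on_off(allowed_on=1950, poss_on=2000,
                        allowed_off=1020, poss_off=1000)

print(f"defensive on/off: {diff:+.1f} points per 100 possessions")
```

Note that nothing in the calculation separates the defender from the four teammates on the court with him, which is the weakness described above.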

In 50 Words or Less
Trying to capture a player’s total worth using a single number isn’t highly reliable. But if you need to use one, you can try PER, Wins Produced, or +/-. Each has its strengths and weaknesses and is only good for beginning a discussion, not ending one.

Examples Why
The biggest hole in statistical analysis is defensive stats. Blocks, rebounds, and steals aren’t enough to tell the whole story of what happens on defense. Players who excel in this area of the court usually have strong defensive +/-, like Bruce Bowen (-9.6). However these numbers tend to fluctuate based on the strength of the team. A player that spends a lot of time on the court with strong defensive players will have his defensive +/- inflated.

More please
Kevin Pelton’s Stat Primer: http://www.nba.com/sonics/news/stats101.html
What is PER?: http://sports.espn.go.com/nba/columns/story?id=2850240
Dave Berri’s Site: http://dberri.wordpress.com/2006/05/21/simple-models-of-player-performance/
Roland Rating: http://www.82games.com/rolandratings0405.htm
Adjusted +/-: http://www.82games.com/ilardi1.htm
Online & Downloadable +/- stats: http://basketballvalue.com/index.php

16 comments on “A Layman’s Guide to Advanced NBA Statistics”

  1. Owen

    Excellent post. Very excellent post.

    Obviously I love the WOW, but I think the big plausible advance basketball statisticians could make, a giant leap forward so to speak, would be to get ts% listed on NBA and College telecasts.

    Re Bruce Bowen, I agree with you that +/- should capture his off box score abilities. The WOW numbers would suggest that you could replace him with a below average small forward and the Spurs wouldn’t notice the difference. And that certainly flies in the face of conventional wisdom.

    But the story +/- tells gets a bit confusing when you look back. Bowen’s defensive impact last year was enormous: he made the Spurs 9.6 points better on defense. The year before, though, the Spurs were actually 2.7 points WORSE on defense with Bowen. The year before that he made the defense 1.5 points better, not really statistically significant. The year before that, 4.5 points better on defense. So the numbers vary pretty wildly, especially given his reputation for being a dominant defender.

    One of the important questions, which different stat systems seem to give different answers to, is how consistent are basketball players year to year? Berri would say very consistent, more consistent than in any other sport. PER would say pretty consistent. +/- would say quite inconsistent. That’s a really important point to consider in our role as armchair GMs. Will a player’s performance change if he signs with a new team? Do players really improve? Is it possible for Eddy Curry to make the jump to being an above average center? (lol, couldn’t get through a post without bashing Curry…)

    In fact, that probably is the most important question basketball statistics can answer, at least from the perspective of improving the Knicks…

    Finally, you might want to throw in a comment about adjusted +/-, saw this article over at 82games, looks like they are going to be pushing that more strongly…

    http://www.82games.com/ilardi1.htm

  2. erik

    thanks. i used to be a big basketball fan as a kid, as an adult i’ve been more into baseball. i love the stuff the Hardball Times and Baseball Prospectus come out with. it gives you an understanding of the game that makes you feel like an insider. i didn’t think it was possible in basketball, but this is great stuff. i will be able to enjoy the game with a whole new appreciation i didn’t think possible this year.

  3. Lpmatt

    I would love to see some 1st team/2nd team stats if Isiah makes mass subs when the starters are playing like crap, like he did some games last year. Like maybe Zach, Eddy, Q, Jamal and Steph are first team, then if 3 of them are off the court then the team playing is considered the 2nd team, say maybe Steph, Zach, Malik, Lee and Nate, which is the 5 that I really liked during the Boston game.

    I really do feel that the best 5 for the Knicks is Nate, Steph, Lee, Balkman and Zach.

  4. dave crockett

    Owen, a few years back before Dunleavy took the Clippers job he used eFG% on his telecasts. If memory serves he may have even trumpeted TS%. I was hoping it would catch on, but it didn’t.

    I’m with you, I think even the stereotypical “Joe Six-Pack” fan would incorporate it. I suppose the big obstacle is the basketball versions of Joe Morgan. I’ve heard intelligent people who can’t get past the per 40 thing.

  5. Kevin Pelton

    Dunleavy invented eFG% back during his playing days, shortly after the three-point line came into existence. However, his term for it is True Shooting Percentage.

  6. Alejandro

    We could also add something regarding rebounding. Good rebounding teams aren’t the ones that get more boards per game, but the ones that get the highest percentage of available rebounds. Fast-paced games usually lead to more shots and therefore more rebounds, but that doesn’t make those teams better rebounding ones.

  7. Owen

    Lol, is that THE Kevin Pelton? You are a stat geek when that gets you excited I guess…

    What happened to the comment in the left column? Is that gone forever? Am I going to be driven to an rss feed for comments?

  8. Mike Goodman

    “… Study, after study, after study shows a player’s per minute production to stay the same despite how many minutes they play…”

    This is getting preachy. An equal number of studies has shown that players who play better get more minutes. Need I add, “duh”.

  9. Thomas B.

    I kept my promise by reading the Layman’s guide. So, what did I learn? Actually, I learned quite a bit (I think).

    One of the problems I have had this off season is figuring out who should start at SF for the Knicks next year. Based on the number of times I cursed at Q and JJ this year I thought it should not be either of them. I thought it should be either Balkman or Chandler, but which one? I figured let the numbers tell the tale. So I used Mike K.’s excellent stat page (Chap Stick please) to figure out which of our small forwards should be starting. I don’t have numbers for Gallinari, so he got left out.

    I decided to compare per 40 minute stats because “Study, after study, after study shows a player’s per minute production to stay the same despite how many minutes they play.” (More Chap Stick). However Mike, I think per 40 minute stats can’t be viewed in a vacuum. Otherwise, you get freakish numbers from a person who clearly can’t maintain that pace, e.g. Jerome James. I also wanted to look at things like Usage-r and eFG% together because I think that a player’s ability to create his own shot is only valuable if he can do it efficiently. Finally, I wanted to look at a stat system outside of Mike’s, just to give some perspective. I used the Roland rating at 82games.com.

    Q:

    I found that Q should not be starting. Q had the second lowest PER of any Knick SF (8.61) and the second lowest eFG (42.1). Those are not good numbers from a SF who is known as an offensive player. Q’s Roland rating of -6.4 is the lowest of all Knick small forwards. Basically, Q was the least valuable SF to the Knicks. When you consider the number of minutes Q plays, you realize that he is dead weight. When you combine him with Curry, it drags the team to a near standstill. So the numbers support that Q should not start. I’m not sure whether D’Antoni will bench Q or try to let him recapture his career season of 2004-05 when he had a Roland rating of +5.7.

    JJ:
    Jared was even worse than Q. He had the lowest PER at 8.01, the lowest eFG at 40.7, and the lowest Usage rate 12.1. Well one might say, “Yeah, but JJ is not an offensive player. He rebounds and plays D.” Does he do that well enough to justify the anemic offensive numbers? The stats say “No.” Though JJ played similar minutes to Chandler and Balkman, his Rebounding rate is lower. He also sports the highest turnover rate of any Knick small forward. He had the highest Roland rating at -3.3.

    Chandler:

    Many people, me included, were excited about Chandler’s play at the close of the season. Chandler had the highest PER at 11.84, which is still below average but not bad when compared to all rookies. He was in the top 25 of all rookies for PER and had a higher PER than Yi Jianlian, Glen Davis, Nick Young, and Javaris Crittenton. Chandler had the highest Usage rate at 17.9 and the second highest eFG% and tFG% (48). But according to the Roland rating, Chandler was only marginally more valuable to the team than Q (-5.3 compared to -6.4). I’m not sure how to explain that.

    Balkman:

    Balkman sports the 2nd highest PER at 11.61. He has the highest eFG% (49.2) and highest TS% (49.2). He also has the highest FT/FG rate (22). Balkman does not create his own shot (11.7 USGr) but he is a more efficient scorer than the other SF’s. Balkman’s Points per 40 min is only 2 points lower than Q’s but Balkman brings much more efficient scoring and the highest rebounding rate of any Knick SF (12.9).

    So who should start? My vote would be for Balkman because I love the efficiency and rebounding. But given that D’Antoni’s style of play values players that can create their own shot, Chandler may be the best choice as he has the highest USGr and second highest shooting efficiency numbers.

    The numbers don’t justify allowing Q and JJ to play at all. So, do I get it now?

  10. caleb

    Nice roundup, Thomas.

    If I understand Roland ratings right (not saying I do), it’s largely a “total production” measure as opposed to a strictly “efficiency” measure. So, a guy who plays few minutes, i.e. Chandler, will not rack up a high Roland rating.

    I am also a big Balkman fan — I think his value is even higher than the numbers, because he is a good on-ball defender with potential to be a great one — and that is only a small factor in the numbers (via plus-minus). I actually don’t think there’s an issue of fitting in D’Antoni’s system — as far as I can tell, the real premise is getting off quick shots before the defense is set. (i.e. — the polar opposite of “The Quick,” with Randolph holding it for 10 seconds, or waiting until Curry is triple-teamed to pass him the ball). You get quick shots by running and passing, not by leaving it to individual players to create their shot.

    I think Chandler could be a good starter, too, eventually, but the sample size is very small and he is still extremely raw.

  11. Thomas B.

    Thanks Caleb.

    Do you think GMs consider per 40 numbers and PERs before they pursue a player? I went back and looked at Jared’s production for the season prior to his signing with the Knicks (2005-06). JJ’s numbers don’t at all justify the contract Isiah gave him (and don’t let me get started on James’ numbers based on the three seasons prior to his windfall). JJ’s net production was a -3.9, meaning he produced less than the opponent. His on/off court production was a push. His per 40 minute numbers… 10 ppg, 7.7 rebs, and 2.9 assists. How did this justify nearly tripling his salary? It does not!!

    82games.com, in addition to the Roland ratings, provides a “fair salary” which represents the salary earned based on production and minutes played. In 2005-06 JJ’s fair salary was 2.09 million and his actual salary was 2.4 million. So according to 82games.com, JJ was not even earning his keep at 2.4 million. So Isiah either ignored the numbers or failed to look into them because he nearly tripled JJ’s salary with no reason to think that JJ could even earn half that amount.

    Also for those of you who thought the Knicks should have drafted a point, according to 82games.com the SF position was the least productive position last year (lowest PER). SF also gave up the highest PER (19.5) to opponents. Most of that is due to Q and JJ, who played the majority of SF minutes and have an average PER of 8.32. Balkman and Chandler average a PER of 11.7. I blame Isiah for this. I think he felt he needed to push JJ and Q because those were the big money players he brought in to play SF. He would have been better served by playing the players he drafted.

    I like this stat stuff.

Comments are closed.