Computers vs. Scouts

With 16 days left before the draft, teams are busily working out prospects, haggling with agents and breaking down game tape. Could there be an easier way? Last week we were talking about computer rating systems that purport to identify the best draft prospects, without the messy work of actually watching games, administering brain profile tests and trekking through rundown former Soviet airports.  The two systems that have garnered the most attention were designed by Erich Doerr, a David Berri disciple, and ESPN’s John Hollinger. A good summation and explanation of the systems was posted here last year:

 http://www.knickerblogger.net/index.php/2007/06/26/draft-analysis-by-the-numbers/

The short version: Doerr’s “PAWS” (Pace-Adjusted Win Score) rating looks solely at college game statistics, ranking players using Berri’s Win Score metric. He adjusts for strength of schedule (40 points against Kansas means more than 40 points against Helen Keller). Doerr’s posts are not easy reading, but it appears he simply takes the best-ranked players and assumes that those are the best NBA prospects.
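
For readers who want to see the underlying metric, here’s a minimal sketch of the raw Win Score calculation that PAWS builds on; the pace, position, and strength-of-schedule adjustments are Doerr’s own and aren’t reproduced here, and the stat line in the example is made up.

```python
def win_score(pts, reb, stl, blk, ast, fga, fta, tov, pf):
    """Berri's raw Win Score from box-score totals: full or half credit
    for the good stuff, charges for the possessions and fouls used."""
    return (pts + reb + stl + 0.5 * blk + 0.5 * ast
            - fga - 0.5 * fta - tov - 0.5 * pf)

# Example: a hypothetical college season line (totals, not per game)
print(win_score(pts=620, reb=310, stl=45, blk=30, ast=80,
                fga=450, fta=180, tov=75, pf=85))
```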

Hollinger starts with a similar approach, using his PER (Player Efficiency Rating), calculated with college game statistics.  Unlike Doerr, he makes a number of adjustments.  In essence, this puts Hollinger closer to mainstream draft gurus. He finds that players who are tall for their position tend to do better in the pros. He also finds that certain statistics – like steals – are specific markers of athleticism. Unlike Doerr, Hollinger also takes age into account. An 18-year-old prospect with the same numbers as a 22-year-old prospect (or even a 19-year-old) gets a significantly higher rating.

Even the creators of these systems would say that they are only a tool, and very much a work-in-progress.  A smart drafter might use them to identify promising players to whom he or she hadn’t paid much attention, or to raise red flags about prominent players who might be overrated.  Here’s a comparison of what the Hollinger & Doerr computers spit out last year, along with the actual draft order.  Since both systems rate only college players, for the sake of side-by-side comparison I left Yi Jianlian and Marco Belinelli off the actual draft list.  That’s right, these methods won’t tell you who’s tearing up the Italian League, or whether O.J. Mayo’s high school career was more impressive than that of LeBron James.

Oh, and Nick Fazekas? He only played 269 minutes as a rookie, but per 40 minutes he scored 15.9 points and had 13.2 rebounds, along with a TS% of 58.2 and a rebound rate better than David Lee or Zach Randolph.

     Hollinger              Doerr                  Actual 2007 Draft
 1.  Kevin Durant           Nick Fazekas           Greg Oden
 2.  Greg Oden              Kevin Durant           Kevin Durant
 3.  Mike Conley, Jr.       Al Horford             Al Horford
 4.  Thaddeus Young         Greg Oden              Mike Conley, Jr.
 5.  Brandan Wright         Joakim Noah            Jeff Green
 6.  Al Horford             Jared Dudley           Corey Brewer
 7.  Nick Fazekas           Jason Smith            Brandan Wright
 8.  Josh McRoberts         Morris Almond          Joakim Noah
 9.  Rodney Stuckey         Julian Wright          Spencer Hawes
10.  Jared Dudley           Brandan Wright         Acie Law IV
11.  Joakim Noah            Rodney Stuckey         Thaddeus Young
12.  Glen Davis             Al Thornton            Julian Wright
13.  Sean Williams          Mike Conley, Jr.       Al Thornton
14.  Jeff Green             Glen Davis             Rodney Stuckey
15.  Kyle Visser            Daequan Cook           Nick Young
16.  Herbert Hill           Marcus Williams        Sean Williams
17.  Javaris Crittenton     Jeff Green             Javaris Crittenton
18.  Wilson Chandler        Sean Williams          Jason Smith
19.  Julian Wright          Corey Brewer           Daequan Cook
20.  Daequan Cook           Derrick Byars          Jared Dudley

2/29 Two Quick Links

NYSUN: Vandeweghe Would Succeed Only if Isiah Isn’t Around

[Vandeweghe’s] record does come with some warts. He served as general manager of the Nuggets from 2001 through 2006, helping to rebuild Denver from a lottery team into a playoff contender. The key deal was, not surprisingly, a trade with the Knicks — he got Marcus Camby and the rights to big man Nene from New York in return for Antonio McDyess. He also made a solid move when he signed point guard Andre Miller to a free-agent deal.
However, the rest of his résumé looks spottier. He gave up three first-round picks in the sign-and-trade deal with New Jersey for Kenyon Martin, and Martin’s seven-year, $91 million contract has been one of the league’s worst values. He also passed on Amare Stoudemire in the 2002 draft … twice. One of them was the Nene choice, and the other was all-time bust Nikoloz Tskitishvili.
That said, if he’s hired by the Knicks his biggest move will be choosing the next coach … or rather, that’s what it should be. If he’s stuck with Isiah, he probably won’t accomplish much.
Nonetheless, it would offer a very slight glimmer of hope that perhaps things might get less awful. He’d presumably have the power to start trading the many misshapen pieces of this roster. And one hopes, at least, he’d have Dolan’s commitment to a genuine rebuilding project as opposed to the slapdash quick fix Isiah tried when he took over.
But it’s puzzling that Dolan can’t realize the huge public relations boost he’d get from cutting the cord with Isiah entirely. The fan base would be rejuvenated, to the point that they’d actually be willing to sit tight and support the team through the inevitable multi-year rebuilding job.

Diminishing Returns and the Value of Offensive and Defensive Rebounds
More Diminishing Returns

In some ways I think this study provides stronger evidence for the impact of diminishing returns on defensive rebounding than my previous post. The charts allow one to easily see the effects of diminishing returns, and by looking at the rebounding of all the players in each lineup, the issues brought up by coaches potentially pairing good rebounders with poor rebounders are largely eliminated.

The specific marginal values found (0.8 for offensive rebounds and 0.3 for defensive rebounds) are also interesting. These match closely with how John Hollinger’s PER weights offensive rebounds relative to defensive rebounds (ORB are weighted by the league DRB%, which is around 0.7, and DRB are weighted by the league ORB%, which is around 0.3). And again, these values suggest that Dave Berri’s Wins Produced greatly overvalues players with high defensive rebounding percentages and undervalues players with low defensive rebounding percentages, because the system assumes that each player DRB contributes a full DRB on the team level. Alternative Win Score (or AWS), the variation on Wins Produced suggested by Dan Rosenbaum in his paper “The Pot Calling the Kettle Black,” weights ORB at 0.7 and DRB at 0.3. While these values are based on an assumption and not backed by evidence (just like Berri’s assumption that both should be weighted at 1 is not backed by any evidence), the evidence from the study I have done here (and Cherokee_ACB’s study) suggests that AWS (and PER) may be a lot closer to the mark on rebounding than Wins Produced.
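
To make the disagreement concrete, here’s a small sketch of how much rebounding credit each weighting scheme hands out; the player lines are hypothetical, and only the 1.0/1.0 versus 0.7/0.3 weights come from the discussion above.

```python
# Hypothetical per-game rebounding lines (not real player data)
players = {
    "rebounding specialist": {"orb": 3.5, "drb": 9.5},
    "average big":           {"orb": 2.0, "drb": 5.0},
}

def reb_credit(orb, drb, orb_weight, drb_weight):
    """Rebounding credit under a given pair of weights."""
    return orb_weight * orb + drb_weight * drb

for name, line in players.items():
    wp_style  = reb_credit(line["orb"], line["drb"], 1.0, 1.0)  # Wins Produced assumption
    aws_style = reb_credit(line["orb"], line["drb"], 0.7, 0.3)  # AWS / PER-style weights
    print(f"{name}: WP-style credit {wp_style:.1f}, AWS-style credit {aws_style:.1f}")
```

The specialist’s edge over the average big shrinks considerably under the 0.7/0.3 weights, which is exactly the diminishing-returns point being made above.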

Is This Worse Than Any Isiah Trade?

It is now official: Shaquille O’Neal has been dumped (er, traded) to the Phoenix Suns in exchange for Shawn Marion and Marcus Banks. I think we all, more or less, agree that this is a horrible trade for the Suns, trading the better, younger player on a team with the best record in the Western Conference for an older, worse player who, as a kicker, is not just injury prone, but currently injured.

What I wonder, though, is whether this is such a bad trade that it’s even worse than any Isiah trade.

A Layman’s Guide to Advanced NBA Statistics

This guide is intended for those who are interested in modern basketball statistics. In order to make it more accessible, I’ve decided to forgo the formulas and numbers. Fans and journalists alike struggle at times to use stats when it comes to basketball. Often enough, their interpretation is inadequate because they don’t have the right stats to explain what is happening on the court. Even worse is when stats are used improperly to arrive at the wrong conclusion.

Over the past few years basketball statisticians have learned a lot about the game. While most of it is based on the same stats you would see in boxscores, the findings go far beyond traditional stats. Evaluation on the team level is the most reliable aspect of basketball statistical analysis. In other words, we’re very sure what factors lead a team to victory. Although statisticians aren’t exactly sure how player stats equate to wins, there are many better ways to evaluate individuals than the classic stats.

Team Stats

What You Need to Know
When looking at team stats it’s important to understand that some teams play faster than others, which skews their per game stats. Faster paced teams will get more chances to score per game, solely because they have more opportunities. It’s similar to two NFL RBs, both with 1000 yards rushing, but one had 300 attempts and the other only 200. In that case it’s not enough to know the totals; you have to account for the difference in the number of opportunities. The same applies to team stats.

So instead of looking at how a team performs per game, we calculate how a team does per possession. What’s a possession? A possession ends when a team gives the ball to the other team, usually through a score, a turnover, or a missed shot recovered by the defense. By using points per possession, we’re looking at how many points a team scores when it has the ball on offense. This is called offensive efficiency or offensive rating, and is measured in points per 100 possessions. Basically offensive efficiency answers the question “if this team had the ball 100 times, how many points would it score?” Similarly we can rate defenses by calculating how many points a team allows per possession, called defensive efficiency or defensive rating.
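
For the curious, here’s a rough sketch of the calculation; the possession formula below is the commonly used box-score estimate rather than an exact count, and the team totals in the example are invented.

```python
def possessions(fga, orb, tov, fta):
    """Common box-score estimate of possessions; 0.44 approximates the
    share of free throw attempts that end a possession."""
    return fga - orb + tov + 0.44 * fta

def offensive_rating(pts, fga, orb, tov, fta):
    """Points scored per 100 possessions (offensive efficiency)."""
    return 100.0 * pts / possessions(fga, orb, tov, fta)

# Hypothetical team season totals
print(round(offensive_rating(pts=8100, fga=6700, orb=900, tov=1150, fta=2000), 1))  # ~103.4
```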

But it doesn’t stop there. We can break down which aspects of the game contribute to those ratings. Offense (or defense) is broken down into four crucial factors: shooting, turnovers, rebounding, and free throws. Shooting is by far the most important factor and is best measured by eFG%, which is a better version of FG% (see “Shooting” below). Next come turnovers and rebounding, which are about equal to each other, but less valuable than shooting percentage. Like points, turnovers are measured per possession (how many times you cough the ball up when you have it). Rebounding is measured by the percentage of missed shots recovered. This is so teams that shoot poorly (and have lots of misses to recover) are judged on an even platform with teams that can shoot. Last and least is free throw shooting, measured by free throws made per field goal attempt.
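
Here is a quick sketch of how the four factors fall out of ordinary box-score totals; the possession estimate and the variable names are mine, and the inputs in any example you run through it would be hypothetical.

```python
def four_factors(fgm, fg3m, fga, ftm, fta, tov, orb, opp_drb):
    """The four factors as described above. Inputs are team totals;
    opp_drb is the opponent's defensive rebound total (needed for ORB%)."""
    efg = (fgm + 0.5 * fg3m) / fga                  # shooting
    tov_pct = tov / (fga - orb + tov + 0.44 * fta)  # turnovers per possession
    orb_pct = orb / (orb + opp_drb)                 # share of own misses recovered
    ft_rate = ftm / fga                             # free throws made per FGA
    return efg, tov_pct, orb_pct, ft_rate
```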

In 50 Words or Less
Throw away points per game for team stats. Instead use offensive efficiency (or defensive efficiency), which is basically how many points a team would score in 100 possessions. Team stats break down into four factors: shooting, rebounding, turnovers, and free throws. You can find these stats on basketball-reference (search for “Points Per 100 Possessions” and “Four Factors” on the team pages) and my stat site.

Examples Why
In 2006, Portland ranked 18th in points allowed per game, which suggests they should have been only slightly worse than average. However, they finished a paltry 21-61 that year. Their defense wasn’t adequately measured by points allowed per game, because they played at the league’s third slowest pace. Ranked by defensive efficiency they were 29th, which makes their 21-win season more understandable. Of course there’s also the 1991 Denver Nuggets.

More Please
Dean Oliver (Points Per Possessions): http://www.rawbw.com/~deano/helpscrn/rtgs.html
Dean Oliver (Four Factors): http://www.rawbw.com/~deano/articles/20040601_roboscout.htm
Kevin Pelton: http://www.nba.com/sonics/news/factors050127.html
Basketball-Reference: http://www.basketball-reference.com/about/factors.html

Player Stats

What You Need to Know
Without a doubt per minute stats are more important than per game stats. This is because per minute stats allow valid comparisons between players of varying minutes. Using per game stats in the NBA is like using hits/game in MLB. In 2007 Michael Young averaged 1.29 hits/game to David Ortiz’s 1.22, but Young’s batting average was only .315 to Ortiz’s .332. Young had more hits because he had more at bats (639 to 549), not because he was a better contact hitter. Similarly you might find that one basketball player has better per game stats, but if he had more minutes then the comparison is invalid. Only per minute stats will clarify which player is truly better in a category.

The common convention is to express per-minute stats per 40 minutes, because it’s easier to visualize 2.3 blk/40 min than 0.0575 blk/min. Measuring basketball stats per 40 minutes is similar to measuring earned runs per 9 IP in baseball (ERA). One thing to note: unlike ERA in baseball, basketball players’ per-minute stats stay roughly the same regardless of their playing time. So while baseball relievers have lower ERAs than starters, the same is not true in basketball. Additionally this doesn’t mean a player should play 40 minutes, just as using ERA doesn’t mean that a pitcher should pitch a full 9 innings. It’s just a fair way to compare players.
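
The arithmetic is trivial, but here’s a sketch anyway, with two invented stat lines showing how a reserve can beat a starter on a per-minute basis.

```python
def per_40(stat_total, minutes_played):
    """Scale a season (or game) total to a per-40-minute rate."""
    return 40.0 * stat_total / minutes_played

# Hypothetical rebounding totals: the reserve is better per minute
print(per_40(stat_total=450, minutes_played=2800))  # starter: ~6.4 reb/40
print(per_40(stat_total=300, minutes_played=1200))  # reserve: 10.0 reb/40
```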

In 50 Words or Less
Throw out a player’s per game stats, and look at per-minute stats instead. Per minute stats are usually measured per 40 minutes. Study after study after study shows a player’s per minute production stays roughly the same no matter how many minutes he plays. You can find them at basketball-reference for historical data, or my stat page for the current season.

Examples Why
Some examples of players who had good per minute numbers but poor per game numbers due to a lack of playing time: Ben Wallace, Jermaine O’Neal, Gerald Wallace, and Michael Redd. Throw in a point guard, and that’s a pretty good team.

More Please
Kevin Pelton’s Stat Primer: http://www.nba.com/sonics/news/stats101.html
The Basketball Notebook’s Primer: http://basketballnotebook.blogspot.com/2005/12/basketball-notebook-stats-primer.html

Shooting

Another stat that should be replaced is FG%. Why? Field goal percentage doesn’t account for the scoring bonus on a three point shot, which is a lower percentage shot. Sharpshooter Kyle Korver’s career FG% (as of 2007) is a lowly 41.3%. If FG% rates a good shooter like Korver so poorly, then it’s obviously not a good stat to use. So replace FG% with eFG% (effective field goal percentage), which compensates for the extra point in a three point shot. Korver’s eFG% is a more robust 53.6%.

But eFG% isn’t the only statistic used to measure a shooter. True Shooting Percentage (TS%) accounts not only for three pointers but for free throws as well. For instance, a player who hits a layup, gets fouled, and hits the free throw is more valuable than the guy who just sinks a jumper. To compare players with respect to their total scoring contribution, this is the stat to use.
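
Here are the two formulas in code form; the 0.44 free-throw coefficient is the standard approximation used in TS%, and the season line in the example is made up rather than Korver’s actual numbers.

```python
def efg_pct(fgm, fg3m, fga):
    """Effective FG%: credits the extra point on made threes."""
    return (fgm + 0.5 * fg3m) / fga

def ts_pct(pts, fga, fta):
    """True shooting %: folds in free throws; 0.44 approximates the share
    of free throw attempts that use up a possession."""
    return pts / (2.0 * (fga + 0.44 * fta))

# A hypothetical three-point specialist's season
print(round(efg_pct(fgm=200, fg3m=120, fga=480), 3))  # 0.542
print(round(ts_pct(pts=575, fga=480, fta=60), 3))     # 0.568
```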

In 50 Words or Less
Field goal percentage (FG%) should be replaced by eFG% or TS%. Effective field goal percentage (eFG%) compensates properly for three pointers, while true shooting percentage (TS%) compensates for three pointers and free throws.

Examples Why
Well, I used Kyle Korver above, but otherwise you can look at any player who takes a large number of three pointers or gets (and converts) a lot of free throws. Players like Kevin Martin, Jason Kapono, Manu Ginobili, and Shawn Marion come to mind as players who are misrepresented by FG%.

More please
Kevin Pelton’s Stat Primer: http://www.nba.com/sonics/news/stats101.html
The Basketball Notebook’s Primer: http://basketballnotebook.blogspot.com/2005/12/basketball-notebook-stats-primer.html

Overall Player Value

As I mentioned earlier, it’s not exactly clear how to calculate a player’s worth. However, there are three main stats that have attempted to give a single number to represent a player’s total contribution. The first and most prevalent is Player Efficiency Rating (PER). Created by John Hollinger, it attempts to add up the good things, subtract the bad things, and account for team pace and minutes played. It’s normalized to 15, which means the average player in the league scores a 15 PER. The league’s best players are around 30, while the worst are in the single digits. Following Hollinger is economist Dave Berri (and friends), who came up with Wins Produced and its cousin Win Score. Unlike Hollinger, who chose his equation himself, Berri and co. statistically derived the factors that go into Wins Produced.
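
To give a feel for what “normalized to 15” means, here’s an illustrative sketch of just that final rescaling step; this is not Hollinger’s full PER formula, only the idea that the league-average rating gets pinned at 15.

```python
def normalize_to_15(raw_rating, league_avg_raw):
    """Illustrative only: rescale a pace-adjusted per-minute rating so the
    league average comes out to 15. NOT Hollinger's full formula."""
    return raw_rating * (15.0 / league_avg_raw)

# A player 30% better than the league-average raw rating lands at ~19.5
print(normalize_to_15(raw_rating=1.3, league_avg_raw=1.0))
```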

But both stats have their weaknesses. According to the Wins Produced camp, PER tends to overrate players who score a lot of points but do so inefficiently (poor shooting numbers). Meanwhile, PER’s side says that Wins Produced overrates strong rebounders who score infrequently. Additionally, since they both rely on box score stats, neither captures actions that occur outside of the stat sheet. For instance, Bruce Bowen plays tough defense and forces Kobe Bryant to take a bad shot that Tim Duncan rebounds. The stat sheet will record Duncan’s rebound and Kobe’s missed shot, but Bowen doesn’t get any credit for his defense.

One stat that does capture Bowen’s effort is plus/minus. Currently kept by Roland Beech, +/- comes in a few different flavors. Among the most popular are offensive and defensive +/-, which measure how a team does with the player on the court. Roland Rating and net +/- also attempt to summarize a player’s overall value. However, plus/minus doesn’t just capture the individual’s effort; it captures the value of his teammates as well. When Bowen and Duncan prevent the Lakers from scoring, not only do they get credit for the effort, but everyone else on the court gets the credit as well.
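
A rough sketch of the basic on/off idea behind net +/- follows; this is not Roland Beech’s exact formula, and the minutes and margins in the example are invented.

```python
def net_plus_minus(margin_on, minutes_on, margin_off, minutes_off):
    """Team scoring margin per 48 minutes with the player on the floor,
    minus the margin per 48 with him off the floor."""
    on_per48 = 48.0 * margin_on / minutes_on
    off_per48 = 48.0 * margin_off / minutes_off
    return on_per48 - off_per48

# Hypothetical: team is +120 in the player's 2400 minutes, -60 in the 1560 without him
print(round(net_plus_minus(120, 2400, -60, 1560), 1))  # ~+4.2 per 48 minutes
```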

In 50 Words or Less
Trying to capture a player’s total worth with a single number isn’t highly reliable. But if you need to use one, you can try PER, Wins Produced, or +/-. Each has its strengths and weaknesses, and each is only good to begin a discussion, not end one.

Examples Why
The biggest hole in statistical analysis is defensive stats. Blocks, rebounds, and steals aren’t enough to tell the whole story of what happens on defense. Players who excel in this area of the court usually have a strong defensive +/-, like Bruce Bowen (-9.6). However, these numbers tend to fluctuate based on the strength of the team. A player who spends a lot of time on the court with strong defensive players will have his defensive +/- inflated.

More please
Kevin Pelton’s Stat Primer: http://www.nba.com/sonics/news/stats101.html
What is PER?: http://sports.espn.go.com/nba/columns/story?id=2850240
Dave Berri’s Site: http://dberri.wordpress.com/2006/05/21/simple-models-of-player-performance/
Roland Rating: http://www.82games.com/rolandratings0405.htm
Adjusted +/-: http://www.82games.com/ilardi1.htm
Online & Downloadable +/- stats: http://basketballvalue.com/index.php

One More Nail In the Anti-Per Minute Argument’s Coffin?

One of the core tenets of basketball statistical analysis is the use of per minute stats. When compared to per game stats, per minute stats are highly valuable in the evaluation of individuals, because they put players of varying playing time on the same level. Using per game stats, starters will always dwarf bench players due to the extended time they get to accumulate various stats. Per-minute stats, meanwhile, let us compare players independent of minutes, allowing for a more even approach to player evaluation.

Recently a debate has come up over the validity and usefulness of per minute stats. I’ve quoted the main parts below, but even abbreviated it’s a long read. If you have the time, I suggest reading it now so the rest of this article will make more sense. For those short on time, a quicker summary is here:

Hollinger & Kubatko: “Hey per minute stats are a great way to evaluate players! In fact we’ve done a few studies and it seems that a player’s per minute stats increase slightly when they get more minutes. At the worst we can conclude that they should stay relatively the same.”

FreeDarko: “Per minute stats won’t stay the same if a player gets more minutes, because there is a division between greater and lesser players. A player that only gets 10-25 minutes per game is playing against lesser caliber players. Hence when that player sees an increase in playing time, he’s playing against steeper competition, so his stats should decrease.”

Tom Ziller: “That’s not true. Here is every 10-25 minute player in the last 10 years that saw an increase in minutes. Most of them (70%) saw an increase in per-minute production. To discount any of this data being from young players getting better as they age, I looked at 8+ year vets, and saw that about the same ratio of players increased (69%).”

Brian M.: “Tom, the problem with all this data is a causality vs. correlation issue. It’s possible that these players saw more minutes first then improved. But it’s also possible that these players improved first which allowed their coach to play them more minutes.”

Brian’s case is a good one. To use an analogy, imagine I come across a person who calls himself Merlin Appleseed. He claims that just by touching apples he can magically make them taste better. He opens up a box of apples saying that he never touched any of them. He picks out 10, and imbues them with his magic. He asks me to taste each of them. I find all of them to be delicious. He says “here’s the same box I got my apples from. Now I want you to take 10 at random while blindfolded. You can compare them to my magic apples. I bet mine taste better.” I do just as he asks, and indeed my random set of apples are less tasty than his. So does Merlin Appleseed have magical power?

Maybe. Unfortunately this test wouldn’t be able to confirm or deny his magical power. Since Merlin gets to choose his apples, he might be selecting the best ones! To test Merlin’s abilities I would need something to gauge how good his apples are expected to taste. One way to do this would be to find comparable apples that have the same color, size, blemishes, etc. Then I can compare the taste of his apples to my apples. If Merlin has the magical powers he claims, then his apples will taste better than my apples.

Similarly with Tom’s study, Brian is saying that by selecting players who have seen an increase in minutes we might be choosing the best apples. This is because players who improve on a per minute basis could be given more playing time by their coaches. Therefore to show whether or not these players have improved, I need to find how good they’re expected to be. Then I can compare their actual performance to their expected performance. If FreeDarko’s theory is true, that role players should decrease their per minute production with more minutes, then they should perform worse than their expected values.

To separate the control group from the test group, I’ll use only players with an even numbered age for the control, and odd numbered ages for the test group. Since this study is intended for role players, as defined by Ziller, I limited my control group to player seasons where:
* The player age was an even number.
* The player appeared in 41 games or more.
* The season was 1981 or greater.
* The player averaged 10-25 mpg.

Now I can calculate the expected production of the players in my group, by looking at per minute production (PER) over playing time (mpg).

[Chart: Control Group, PER vs. mpg]

Just as expected, the graph tends to run from the bottom left (low minutes, low production) to the top right (high minutes, high production). That is, players who receive more minutes are more productive. From the 1,840 player-seasons in my data, I’m able to calculate the expected PER based on mpg (PER = .2158*mpg + 8.2941). So if a player averaged 10 mpg, you would expect his PER to be 10.45. This equation is represented by the red line on the graph.
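
For anyone who wants to reproduce the expected-PER line, here’s a sketch of the fit; the handful of (mpg, PER) pairs below are placeholders standing in for the 1,840 player-seasons, but the fitted equation quoted above is the one actually used for the expected values.

```python
import numpy as np

# Placeholder (mpg, PER) pairs standing in for the real control-group data
mpg = np.array([11.2, 13.5, 15.0, 17.8, 19.3, 21.1, 22.6, 24.4])
per = np.array([10.6, 11.3, 11.4, 12.2, 12.3, 13.1, 12.9, 13.7])

slope, intercept = np.polyfit(mpg, per, 1)  # ordinary least-squares trendline
print(slope, intercept)

# The article's actual fit: PER = 0.2158*mpg + 8.2941
expected_per_at_10 = 0.2158 * 10 + 8.2941
print(round(expected_per_at_10, 2))  # 10.45, matching the text
```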

Now that our control group is defined, I need to create the test group. Again this group was defined by Ziller as role players who saw an increase in minutes. I selected player seasons where:
* The player’s age was an odd number.
* The player appeared in 41 games or more.
* The season was 1981 or greater.
* The player averaged 10-25 mpg the year before.
* The player increased his mpg by 5+ from the year before.

Since I have the expected values based on mpg, all that is left is to compare the test group’s actual production to those expected values. In our test group 185 players did better than their expected PER, while 177 did worse. On average each player gained 0.17 PER. This is a tiny gain, not enough to show that players increase production with more minutes. However, it clearly shows that they didn’t decline and at least matched the predicted PER.

Another way to see how our prediction did is to calculate the regression (trendline) of this group, and compare it to the expected equation. The red line in the graph below shows the regression of PER/MPG for our control group.

[Chart: Test Group, PER vs. mpg]

* Control: PER = .2158*mpg + 8.2941
* Test: PER = .2185*mpg + 8.3917

The test group, which has both the higher slope and the higher y-intercept, will slightly outperform the control group, but not by much. The average player who saw 40 mpg would see a .20 increase in PER, which is negligible. In other words, the test group has neither exceeded nor fallen short of our expectations, but rather has met them.
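
As a quick check on that figure, plugging 40 mpg into both fitted lines shows the gap:

```python
def control(mpg):
    return 0.2158 * mpg + 8.2941  # expected PER from the control group

def test(mpg):
    return 0.2185 * mpg + 8.3917  # fitted line for the test group

print(round(test(40) - control(40), 2))  # ~0.21 PER, the negligible edge noted above
```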

In the end, what does this prove? Specifically, this study removes the correlation between the role player group and players who saw extra minutes due to improvement. It debunks the notion that there is some kind of division in per minute stats, where the per minute stats of high minute players are more representative of actual talent than those of players who play few minutes per game. But combined with the past work of Hollinger, Kubatko, and Ziller, among others, it makes an overall stronger statement. Players who receive 10 or more minutes per game are likely to keep the same per minute stats no matter what the increase in playing time is. Therefore per minute stats remain far superior to per game stats in terms of comparing and evaluating players.


EXTRAS:

  • “It’s a pretty simple concept, but one that has largely escaped most NBA front offices: The idea that what a player does on a per-minute basis is far more important than his per-game stats. The latter tend to be influenced more by playing time than by quality of play, yet remain the most common metric of player performance.” — John Hollinger
  • The great thing about this study is that I can perform it again, this time using the “odd” aged players as the control and the “even” aged players as the test group. This time the prediction equation was PER = .2039*mpg + 8.4439. And again our test players slightly outperformed the average. This time 192 did better than their expected PER, while only 161 did worse. On average each player gained 0.23 PER.
  • This article doesn’t mean that every player who has good per minute stats should see more playing time. It’s very clear that basketball stats don’t capture a player’s total ability. A player who does well on a per minute basis may have other flaws, such as poor defense, which prevent him from contributing more. This also isn’t an endorsement for any single per minute ranking system, like PER, WOW, etc. Each of these has flaws, in addition to being unable to account for attributes not captured in box scores.
  • Summary of the events that led to this article.

Back in 2005, I wrote an article outlining some of the pioneers in per minute research.

In the 2002 Pro Basketball Prospectus John Hollinger asked and answered the question “Do players do better with more minutes?” For every Washington player, Hollinger looked at each game and separated the stats on whether or not he played more than 15 minutes. He found that when players played more than 15 minutes, they performed significantly better than when they played less. To check his work, he used a control group of 10 random players, and each one of those improved significantly as well.

The knock on Hollinger’s study is the small sample size, containing less than 25 guys from only one season. Enter Justin Kubatko, the site administrator of the NBA’s best historical stat page www.basketball-reference.com. Earlier this week Justin decided to re-examine the theory using a bigger sample size. Taking players from 1978-2004, he identified 465 that played at least a half season and saw a 50% increase in minutes the year after. Three out of four players saw an increase in their numbers as they gained more minutes, although the average increase was small (+1.5 PER).

Two independent studies have shown that NBA players get better when they get more minutes. A conservative interpretation is that per-minute numbers are universal regardless of playing time. So if a player averages 18 points per 40 minutes, he’ll do about that regardless of how many minutes he plays. A more liberal summary would say that underused players will see an improvement in their per-minute numbers if given more court time. A player that only averages 20 minutes a game is likely to be a little bit better if given 35. So the straight dope is per minute stats are a fantastic way to evaluate NBA players.

Recently, this research was questioned by the writers of freedarko.

The problem with this line of reasoning is that it assumes the homogeneity of court time. It assumes that if a player scored 20 points in 20 minutes, he would also score 40 points in 40 minutes. That there will be systematic differences between these two situations is almost too obvious to point out. It’s the difference between sharing the ball with Jordan Farmar while being guarded by Kenny Thomas, and sharing the ball with Kobe Bryant while being guarded by Ron Artest.

Insofar as the problem here is one of rotation, small-scale adjustments in minutes played shouldn’t create major distortions (it isn’t unrealistic to think that if Tim Duncan played 5 extra minutes per game, his per-minute production, as influenced by the level of defense he’d face, would basically be the same). But when PER catapults bench players into the starting five (or vice-versa), be on the look-out for inflation. Call this the Silverbird-Shoals Hypothesis, or the THEOREM OF INTERTEMPORAL HETEROGENEITY (TOIH).

Enter Sactown Royalty’s Tom Ziller, to refute Free Darko’s theory.

Shoals and Silverbird are arguing that because low-minutes high-PER guys typically play against fellow bench players, their PER is higher than it would be if they played starter minutes. They aren’t arguing (as some surmised) that PER is useless, just that it is prone to inflation. The argument, from seemingly everyone on the ‘anti per-minute statistics’ side, is that if you increase a player’s minutes, his efficiency will suffer.

There’s a problem with this oft-repeated claim: It’s not true.

Thanks to the data-collection efforts of Ballhype’s own Jason Gurney, I’m going to try to ensure this claim never gets stated as fact ever again. Using seasons from 1997-98 to the present, we identified all players who played at least 45 games in two consecutive seasons and who saw their minutes per game increase by at least five minutes from the first season to the second. The players must have played between 10 and 25 minutes per game in the first season, to ensure we were not dealing with either folks who went from none-to-some playing time or superstar candidates who took over an offense and thus got a minutes boost. This is aimed at roleplayers whose role becomes more prominent — exactly the candidate FD’s Theorem of Intertemporal Heterogeneity implies will suffer from increased minutes.

Since I seem to express myself more clearly via Photoshop, here is the result of our mini-study.

No, increased minutes do not seem to lead to decreased efficiency. In fact, the data indicates increased minutes lead to… increased efficiency. More than 70% of the players in the study (there were 251 in total) saw their PER (which is, by definition, a per-minute summary statistic) increase with the increase in minutes. Players whose minutes per game increased by five saw an average change of +1.38 in their PER. The correlation between increased minutes and change in PER in this data set was +0.20.

One step further: Players who had at least five years of experience including their first season in this study and got the requisite 5-minute increase (106 such players) saw an average change of +1.26 in their PER. It’s not just young kids who happen to be improving and getting more minutes all at the same time — vets who get more minutes typically see their per-minute production rise. A full 67% of these players saw positive changes in PER with the increased minutes. (And this answers one of Carter’s concerns with existing studies.) Let’s bump this up to players who had at least eight years of experience going into their minutes increase; we had 52 such cases. The average change in PER: +1.31. Of these players, 69% saw their PER increase with more minutes.

Case closed, right? Well, not if Brian M. has something to say about it.

Imagine we wanted to test the relationship between duration of exercise and reports of fatigue. We have two experimental conditions, one group jogs for 10 minutes and the other for 30 minutes. We predict that the group that jogs 30 minutes will report more fatigue.

But we must assign people to the two groups randomly in order for the data to have any bearing on the hypothesis. If we systematically assign people who are in better shape to the 30 minute jogging condition, we may find that in fact, if anything, people report less fatigue with longer durations of exercise. But the study is flawed in a fundamental way and so the data don’t tell us much of anything. At most what the results of this poor experiment tell us is that the effect of exercise duration on reported fatigue is not so strong that it overrides the differences in health between the two groups. But that is a really limited conclusion, especially if we don’t even have means to quantify how much the two groups differed in health to begin with.

Trading David Lee for Kobe Bryant Straight-Up: Shrewd Sabermetrics or Laugh Test Flunkie?

In Basketball on Paper, Dean Oliver devoted an entire chapter to comparing the individual rating systems of several NBA analysts. He argued something that I, and most people who do informed analysis, subscribe to: any system of statistical analysis must not only be internally consistent, but must also pass the “laugh test.” A statistical model can be built elegantly and beautifully and pass many confidence intervals within its own logical parameters, but if its results are absurd, then there’s obviously a need to return to the proverbial drawing board. Oliver thought of the “laugh test” as a litmus test. It’s a very broad, absolutely basic determinant of whether a statistic is logical or not. If your rating system projects the best players with the best numbers, then it’s probably onto something. On the other hand, if your rating system argues that Jerome James is a better center than vintage Shaquille O’Neal, then you had better recheck your assumptions.

While no single computation can perfectly encompass the entire contribution of a basketball player, John Hollinger developed a system to sum up a player’s boxscore contribution and express it in one number. Player Efficiency Rating (PER) is a sophisticated equation that goes so far as to adjust for the yearly value of a possession and the pace a team plays at. In Hollinger’s analogy, PER serves as a way of considering players from different positions, allowing an “apples to oranges” comparison. But while PER is a handy little number, what it doesn’t do is convert statistical efficiency into actual wins. That’s where Dave Berri’s Wages of Wins (WoW) steps in. WoW takes the same boxscore statistics that PER uses and converts them into a measure of how many wins a player produces. This metric can evaluate a player’s total contribution over the course of a season and break it down per minute. Like PER, WoW serves as a way to summarize a player’s contribution in one number.

Now, let’s ask PER who were the most productive basketball players on the planet this past season. PER picks these as its starting five:

1. Dwyane Wade SG 29.2
2. Dirk Nowitzki PF 27.9
3. Yao Ming C 26.7
4. Tim Duncan C 26.4
5. Kobe Bryant SG 26.3

Nothing to laugh at here. In fact, it’s a pretty amazing team. Wade is the best player, slightly ahead of Dirk, who is just a bit ahead of Ming, Duncan, and Bryant, who are in a dead heat for third best. If you were starting a basketball team and were given first pick of any player in the NBA, you couldn’t go wrong by picking any of these five players. They’re the best of the best. Granted, PER isn’t intended to be the final word on basketball performance, but it is a good starting point for figuring out relative worth. Would you trade your 15 PER performer for a 29 PER man? Almost certainly. Of course you’d take into account team composition, need, age, defense, and contract terms, but all else being equal, you’d be doing your team a service by having the greater PER over the lesser. And if one PER is almost double the other, like say Dwyane Wade’s over Jamal Crawford’s, well, then there’s really no thinking involved. Of course you’d rather have Wade. It’s a no-brainer. In fact, by this measure, you’d rather have Wade than any single player on the Knicks’ current roster.

Now, WoW gets to pick its own top five. Note that in order to compare WoW to PER we’re using Wins Produced per 48 Minutes (WP/48), since these are both rate stats:

1. David Lee PF .403
2. Jason Kidd PG .403
3. Marcus Camby C .371
4. Shawn Marion F .370
5. Carlos Boozer PF .351

Look at that again. David Lee led the NBA in wins produced rate. Um…really. So according to this sophisticated, statistical model, the most productive professional basketball player on the planet is David Lee. The best. On. The. Planet. Let me say that being a die-hard Knicks fan, I will be the first to argue that Lee is an All-Star caliber forward. He’s cool, he’s great. He’s an out-of-the-box rebounding, ambidextrous-finishing, no-look passing, efficiency machine. He’s awesome! It’s just that, you know, he really doesn’t create much offense. He’s more of a great glue guy than a centerpiece. And that’s why he’s not exactly a superstar.

Now, I really love the guy. Don’t get me wrong. I wouldn’t trade our man for the world. Oh, wait. Yes. Yes, I would. I’d trade David Lee in a heartbeat. For Tim Duncan. Or Yao Ming. Or Dwyane Wade. Or Kobe Bryant. Or Dirk Nowitzki. Or LeBron James. Or Amare Stoudemire. Or…OK, you get the point. I’d trade him for at least a dozen players who aren’t just All-Stars, they’re legitimate championship-level franchise cornerstones. Yet, right there in plain black and white, Wages of Wins’ assumptions fail Oliver’s “laugh test.” WoW argues that Lee is the best player in the entire league, and that’s ridiculous.

WoW makes a very big deal about bucking conventional wisdom. And sure enough, statistical analysts are the ones who’re supposed to be bucking said conventional wisdom. At the Wages of Wins Journal, Berri argues that “perceptions of performance in basketball do not match the player’s actual impact on wins” because “less than 15% of wins in the NBA are explained by payroll.” However, payroll isn’t a good measuring stick of perception, due to the complexities of a closed system like NBA free agency. There are a host of reasons why a player may be overpaid, from the talent available to the desperation of the team involved. In other words, conventional wisdom thinks Rashard Lewis is overpaid at $126M, too.

So although conventional wisdom has a tendency to be wrong in some areas, figuring out sports superstars is not one of its weaknesses. There usually is a consensus on the league’s best players from both statistical analysis and conventional wisdom. The cream of the crop in the NFL are Peyton Manning, LaDainian Tomlinson, and Larry Johnson, whether you go by the numbers or your eyes. In MLB it would be Albert Pujols, Ryan Howard, Manny Ramirez, David Ortiz, Alex Rodriguez, and Johan Santana. At the top of the ladder of player evaluation, conventional wisdom is pretty much dead on.

According to WoW, David Lee (.403) is a far more productive player than Kobe Bryant (.242). Since teams with more productive players win more games, Lee must be better for your basketball team than Bryant. But why stop there? The Knicks could trade Renaldo Balkman (.272) straight up for Dwyane Wade (.255) and lose productivity. That’s right: WoW is arguing that if a Lee-for-Kobe and a Balkman-for-Wade trade went through, the Knicks would be a worse team for it. It’s arguing that Bryant and Wade, acquired at the cost of our two young, talented forwards, would hurt the Knicks’ productivity. You’ve got to be kidding me.

As the Knicks GM, would I pull the trigger on a Lee-for-Bryant deal? Is there even a debate? Who wouldn’t? Oh, right, WoW wouldn’t. WoW doesn’t even think it’s close. We can all disagree on which player is the very best (or the most productive), but WoW’s results are “laughable.” Dave Berri has criticized PER in the past, but before people can take WoW as seriously as PER as a tool for evaluating player performance, it’s obviously going to have to address what caused this terrible absurdity in its rating process.

Draft Analysis By The Numbers

With the 2007 NBA draft almost upon us, there are plenty of resources around the web for those craving more information regarding the draft. However, I’ve stumbled across three that I thought were particularly interesting. The one thing all of these resources have in common is that they offer a statistical look at projecting incoming NBA players. For some time baseball fans have had a good amount of knowledge on what makes a good professional. College pitchers generally fare better than high schoolers. Minor league pitchers with good BB:K and HR:K ratios are more likely to succeed than those without. In the NFL, footballoutsiders discovered that drafted college QBs who had the most starts and the highest completion percentage did better than the rest of the field.

The first is probably the least well known. HoopsAnalyst has run a four part series (hopefully to become a five part series) on which stats are most important for aspiring professionals. Ed Weiland has unearthed a few interesting gems. Scoring quantity for shooting guards is more important than scoring efficiency. Also important for shooting guards are the “athletic” stats (per minute rebounds, steals, and blocks). The reasoning is that players who aren’t physically gifted enough don’t do well at the next level (Shawn Respert, Trajan Langdon, Jarvis Hayes and Reece Gaines). Weiland lumps together college players and international ones. Other than Oden & Durant, Weiland sees a bright future for Horford, Noah, Rudy Fernandez, Wright, and Green.

Next is the WoW Journal, with guest writer Erich Doerr. In his approach, Doerr attempts to apply Berri’s Win Score method to the amateur players. Using this method, the sleepers of the draft appear to be Nick Fazekas, Stephane Lasme, and Rashad Jones-Jennings from the college ranks and Yi Jianlian, Marco Belinelli, Luka Bogdanovic, Jonas Maciulis, Kyrylo Fesenko, and Mirza Begic from the international ranks.

Last but not least, John Hollinger has published his method for digging up potential prospects. Hollinger concentrates on college players and adjusts for both strength of schedule and pace. Like Weiland, Hollinger finds such “athletic” stats as steals, blocks, and rebounds to coincide with future success. His system also adds age, three point shooting, height, and passing (PPR). Good news for (probably) Seattle fans: Durant looks to be the best prospect of this decade. Thaddeus Young, whom both Weiland and Doerr are lukewarm on, makes Hollinger’s top 5, along with Oden, Conley, and Wright.

While the three methods don’t always agree, there are a few players on whom there is a consensus. Oden and Durant are the obvious examples, but also Brandan Wright, Al Horford, Nick Fazekas, and Joakim Noah on the positive side, and Acie Law, Corey Brewer, and Nick Young on the negative side. More importantly, it’s great to see that there are a few different people looking into projecting future stars. I guess only time will tell if any of these systems bear fruit.

[NOTE: Apologies to Bret at Hoopinion, who also took a statistical look at this year’s draft class. At this moment there are 10 articles posted, with some good tidbits there.]