Similarity Scores, Part 1

Kobe Bryant is the next Jordan. Dwight Howard is the next Alonzo Mourning. Mardy Collins is the next Jason Kidd. Comparing two players allows us to communicate lots of information with a few words. If someone says that LeBron James is like Oscar Robertson, you would imagine LeBron being strong, versatile, agile, great, etc. Or perhaps that’s how you might picture the Big O, depending on how old you are.

Comparing two players is also useful when you’re evaluating players. Find a historical player similar to a youngster, and you have a good idea of how he might develop. However, identifying similar players can be difficult and subjective. Is LeBron the next Jordan, Magic, or Robertson? In order to take some of the guesswork out of the equation, I’ve created a similarity score using statistics. Since per-game and accumulated stats are dependent on playing time and don’t adequately reflect a player’s skill level, I’ve decided to go with standardized (z-score) per-minute stats. Originally I used just about every stat the NBA officially keeps track of, but the results didn’t pass the smell test. It didn’t make sense for personal fouls to be worth the same as points. Therefore I decided to use weighted stats, and broke them into three categories.
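As a rough illustration of the standardization step, here is a minimal sketch that converts season totals to per-36-minute rates and then to z-scores. The player names and totals are made up, and using the population standard deviation over the whole comparison pool is an assumption; the article doesn't say which population the z-scores are computed against.

```python
from statistics import mean, pstdev

def per_36(total, minutes):
    """Scale a season total to a per-36-minute rate."""
    return total / minutes * 36.0

def z_scores(values):
    """Standardize values as (x - mean) / standard deviation.

    Uses the population standard deviation (an assumption) and will
    raise ZeroDivisionError if every value is identical.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical (points, minutes) season totals for three players.
players = {"A": (900, 2700), "B": (1200, 2400), "C": (300, 600)}

# Player C played far fewer minutes than B but scores at the same rate.
rates = {name: per_36(pts, mins) for name, (pts, mins) in players.items()}
standardized = dict(zip(rates, z_scores(list(rates.values()))))
```

Players B and C end up with the same z-score despite very different point totals, which is the reason for using per-minute rather than accumulated stats.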

The first and most important category is scoring. No other historically recorded statistic is more integral to a player’s worth. Some players are expected to run the offense and have a high number of assists, while others are on the floor primarily to rebound, but few do both. However, just about everyone on the court is expected to score at some point or another. Even players who score infrequently or inefficiently should be more similar to those of the same ilk. Hence I made scoring worth approximately half a player’s comparison score.

Originally I had added many aspects of scoring, but I found that they tended to take away from the main focus: efficiency and volume. Oddly, I also saw better results when I limited scoring to just three stats: TS%, eFG%, and PTS/36. Since the first two are compilations of different aspects of scoring, I feel justified leaving out things like free throw percentage or three pointers attempted. And the results seemed to get better when I gave more priority to the percentages, and less to points. This is because efficiency varies more widely than volume. Lots of players can average 20 PTS/36, but few can do it at a 60% TS%. Currently TS% and eFG% are both worth twice as much as PTS/36.
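For readers unfamiliar with the two percentages, the sketch below uses the commonly published definitions of true shooting percentage and effective field goal percentage. The 0.44 free-throw coefficient is the conventional estimate, and the stat line is hypothetical.

```python
def true_shooting_pct(pts, fga, fta):
    """TS% = PTS / (2 * (FGA + 0.44 * FTA)).

    Folds two-pointers, three-pointers and free throws into one number.
    """
    return pts / (2 * (fga + 0.44 * fta))

def effective_fg_pct(fgm, fg3m, fga):
    """eFG% = (FGM + 0.5 * 3PM) / FGA.

    Credits a made three-pointer as 1.5 made field goals.
    """
    return (fgm + 0.5 * fg3m) / fga

# Hypothetical line: 7-of-15 from the field (2 threes), 4-of-5 at the line.
pts = 2 * 7 + 2 + 4  # 20 points
ts = true_shooting_pct(pts, fga=15, fta=5)     # roughly 0.581
efg = effective_fg_pct(fgm=7, fg3m=2, fga=15)  # roughly 0.533
```

Because both numbers already account for three-point shooting (and TS% for free throws), separate inputs such as free throw percentage or three-point attempts would largely repeat information they already carry.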

I split the rest of the stats into two sections which I call (for lack of better terms) “Small Man” and “Big Man”. “Small Man” is worth about a third and consists of three parts: AST/36, STL/36, TO/36. I found that assists tend to separate contrasting players better, so I weighted them equal to the other two combined. “Big Man” is worth about a fifth and is OREB/36, DREB/36, BLK/36 and PF/36. Rebounding combined (but not individually) is more valuable than blocks, and fouls are minuscule, but present.
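To make the weighting concrete, here is one way the three categories could be combined into a single score. This is a sketch under stated assumptions, not the actual formula. The category shares and the scoring and “Small Man” ratios come from the text. The exact “Big Man” numbers are invented to satisfy the stated ordering, and the use of a weighted sum of absolute z-score differences as the distance is a guess.

```python
# Approximate category shares from the text; normalized below to sum to 1.
CATEGORY_SHARE = {"scoring": 1 / 2, "small": 1 / 3, "big": 1 / 5}

# Relative weights inside each category.
WITHIN = {
    "scoring": {"TS%": 2, "eFG%": 2, "PTS": 1},  # percentages count double
    "small": {"AST": 2, "STL": 1, "TO": 1},      # AST equals STL + TO
    # Assumed values: OREB + DREB > BLK > either rebound stat, PF tiny.
    "big": {"OREB": 6, "DREB": 6, "BLK": 7, "PF": 1},
}

def build_weights():
    """Flatten category shares and within-category ratios into per-stat weights."""
    total_share = sum(CATEGORY_SHARE.values())
    weights = {}
    for category, stats in WITHIN.items():
        category_total = sum(stats.values())
        for stat, w in stats.items():
            weights[stat] = CATEGORY_SHARE[category] / total_share * w / category_total
    return weights

WEIGHTS = build_weights()

def similarity(z_a, z_b, weights=WEIGHTS):
    """Weighted sum of absolute z-score differences; 0.0 means identical."""
    return sum(w * abs(z_a[stat] - z_b[stat]) for stat, w in weights.items())

# Hypothetical z-score profiles for two players.
guard = {"TS%": 0.4, "eFG%": 0.3, "PTS": -0.8, "AST": 1.5, "STL": 0.1,
         "TO": 0.9, "OREB": -1.0, "DREB": -0.7, "BLK": -0.9, "PF": -0.2}
center = {"TS%": 0.6, "eFG%": 0.5, "PTS": 0.2, "AST": -1.1, "STL": -0.5,
          "TO": 0.1, "OREB": 1.4, "DREB": 1.2, "BLK": 1.6, "PF": 0.8}
score = similarity(guard, center)
```

Since the same `WEIGHTS` table is applied to every pair of players, changing one weight shifts every comparison at once rather than favoring any single player.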

In the end, I’ve come up with a system that, although it has subjective elements, should provide objectivity across the board. The similarity scores use the same equation for every player, so there isn’t any bias in that respect. In other words, I could try to make Jamal Crawford more similar to Michael Jordan, but that would likely make other players who are already closer to him get even closer. In the future I may tweak the weights, but essentially the process is the same.

Since I plan on adding these to the report cards, let’s start with the guy I missed: Chris Duhon. Here is his 2009 season compared to others at the age of 26.

z-Sum Name Pos Year Team G PER TS% eFG% PTS/36 REB/36 AST/36 STL/36 TO/36
0.000 Chris Duhon G 2009 NYK 79 12.2 .570 .515 10.9 3.0 7.0 0.9 2.7
0.044 Vinny Del Negro G 1993 SAS 73 13.9 .563 .514 12.8 3.8 6.9 1.0 2.2
0.052 Brad Davis G 1982 DAL 82 14.5 .569 .524 13.7 3.1 7.0 1.0 2.2
0.096 Steve Henson G 1995 POR 37 12.1 .613 .564 11.3 2.5 8.1 0.9 2.8
0.101 Vern Fleming G 1989 IND 76 15.8 .572 .517 15.3 4.4 7.0 1.1 2.7
0.105 Rex Walters G 1997 PHI 59 13.0 .571 .543 13.9 3.7 3.9 1.0 2.1
0.107 Jacque Vaughn G 2002 ATL 82 13.1 .547 .498 10.5 3.3 6.8 1.3 2.2
0.116 John Crotty G 1996 CLE 58 13.0 .590 .482 10.0 3.2 6.0 1.3 3.0
0.117 Luke Walton F 2007 LAL 60 14.7 .551 .517 12.4 5.5 4.7 1.1 2.1
0.120 Sherman Douglas G 1993 BOS 79 13.5 .518 .504 11.5 3.0 9.5 0.9 3.0
0.121 Phil Ford G 1983 TOT 77 10.4 .525 .480 11.7 2.3 6.5 1.2 3.0

The first thing to notice is the z-sum column (the first number in each row), which is the similarity score. The lower this number is, the more similar the players are. Duhon is most similar to Del Negro and Davis, with a drop-off to Henson and the others. So what does something like this tell us about Duhon? Looking over the list we see lots of mediocre players and no All-Stars. So the chance that Duhon will develop into something superior to his current form is slim. As for the comparables, in two of the next three years, Del Negro would have his most productive seasons. And much like Duhon, Davis languished as a reserve before catching on in his 26th year. He would become the starter for the Mavericks, and ride out a few bad seasons until the team turned things around in the mid-80s.
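The drop-off after the top two names can also be found mechanically. The sketch below takes the scores from the list above, sorts them, and looks for the largest jump between neighbors. The "largest gap" rule is only an illustrative heuristic, not part of the similarity method itself.

```python
# Similarity scores to Chris Duhon's 2009 season, copied from the list above.
comparables = [
    ("Vinny Del Negro", 0.044), ("Brad Davis", 0.052),
    ("Steve Henson", 0.096), ("Vern Fleming", 0.101),
    ("Rex Walters", 0.105), ("Jacque Vaughn", 0.107),
    ("John Crotty", 0.116), ("Luke Walton", 0.117),
    ("Sherman Douglas", 0.120), ("Phil Ford", 0.121),
]

# Lower score means more similar, so sort ascending.
ranked = sorted(comparables, key=lambda player: player[1])

# Gap between each player and the next one down the list.
gaps = [
    (ranked[i + 1][1] - ranked[i][1], ranked[i][0])
    for i in range(len(ranked) - 1)
]

# The biggest jump marks where the close matches end.
largest_gap, last_close_match = max(gaps)
close_matches = [name for name, _ in ranked[: gaps.index((largest_gap, last_close_match)) + 1]]
```

Here the largest jump falls between Davis and Henson, so `close_matches` holds just Del Negro and Davis.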

Stay tuned for Part 2…

Mike Kurylo

Mike Kurylo is the founder and editor of His book on the 2012 Knicks, "We’ll Always Have Linsanity," is on sale now. Follow him on twitter (@KnickerBlogger).

21 thoughts to “Similarity Scores, Part 1”

  1. Hey Mike,

    New to the blog but definitely enjoyed it so far. Can you run this analysis on some of the top players in the NBA – Kobe, LBJ, Wade, etc.? Also, would like to see the projections for David Lee and Nate – may give us a better perspective on their individual worth.

  2. Anybody think the Knicks should go after Mehmet Okur, who appears likely to opt out, rather than keeping Lee?

  3. “Anybody think the Knicks should go after Mehmet Okur”


    He’d be opting out of a $9 million paycheck next season, presumably because he can make more money. He’s 30 years old. He is not the player the Knicks need to build around.

    That said, why would he opt out? Even Boozer, considered the cream of the FA crop this off-season, isn’t sure whether opting out is such a good idea. No one has cap space, and those teams that do don’t want to spend the money. If he knows Detroit wants him, then go for it, but to opt out for the mid-level exception is probably not a great business move. In 2010 just about every team will have cap space to spend irresponsibly. He should wait until then.

  4. I’d be interested in seeing how similar DLee’s numbers are to other players like Paul Millsap, Linas Kleiza and Troy Murphy. Millsap, for example, is waiting to see what kind of deal Lee gets, and expects to get something similar. Millsap blocks more shots, but is foul prone. I’d take Lee any day over Millsap, but it will be interesting to see how similar their deals are at the end of the day.

    Are there any UFA shooters that the Knicks could target for a short term deal? I see that Morris Almond is a UFA. The Knicks were very interested in him a couple years ago, and a lot of the mocks had the Knicks selecting him. He was supposed to be a lights-out shooter, with many comparing his shot to Houston’s. I think he played in the D League for a bit. His NBA numbers have been less than impressive, however.

  5. How about comparing Yao to Sam Bowie (or to be generous, Bill Walton?) Just saw on MSNBC that Yao’s broken foot will keep him sidelined all of next season and possibly beyond. What a disaster for the Rockets.

  6. so how long will it take for some dopey sports journalist to title the Yao-Ming story: “Houston, we have a problem”?

  7. Great work, Mike. This is a great way to use the numbers – to try and get beyond the images of players, that may or may not be accurate.

    For example, this is not at all the group I would have associated with Duhon. Vinny Del Negro? Luke Walton?

    I guess I think of Duhon as, for better or worse, a pure point who made a place in the league by playing defense. This may underplay his defensive value — if I remember right, Del Negro was a gaping hole on D – but this makes a good, objective point that Duhon’s offensive role is more of a pure shooter, like the combo guards and even forwards here.

    It’s a great way to take a fresh look at people – looking forward to the rest of the week.

    p.s. Millsap is a pretty comparable player to Lee, and a much better defender. Not as much offensive game. I’d probably go with Lee, but not by much, and maybe not at all when you factor in the dollars…

  8. Anyone read the first installment of Bucher’s FA preview on For Lee, ranked at #18, he writes:

    “His game: Energy and defense without needing plays called for him. Will sacrifice his body on screens and charges. Undersized but athletic, hard-nosed and low-maintenance. Rebound and loose-ball fiend. Good hands and decent with putbacks and finishing around the rim off pick-and-roll. Not much of a threat beyond 15 feet or on post-ups. Willing help defender, but not a shot-blocker.

    Right system: Up-tempo is ideal because he’ll outrun most bigs in transition. Need at least three scorers, ideally four, so he has room and reason to chase down rebounds and putbacks. Can’t play off an offensive post threat because he doesn’t have the jumper to space the floor. Mobile enough to show on the guard in pick-and-roll defense and get back to a rolling big.”

    Defense? Only “decent” with putbacks and finishing around the rim on pick and rolls? Yeah, okay.

  9. Almost everything in that preview sounded like total nonsense.

    I did think it was interesting to see him report the Lakers’ interest in Nate Robinson. I wonder if a Nate for Ariza trade would make sense for anyone. Here’s guessing they’ll both be juggling mid-level offers..

  10. Trevor Ariza: The Return.

    Sometimes I wonder if the average sports writer actually watches the same games as everyone else.

  11. “Need at least three scorers, ideally four, so he has room and reason to chase down rebounds and putbacks.”

    He’s played very well on the Knicks over the years without too many (any?) good scorers. If the scorers are good then their shots would go in and he’d get fewer offensive rebounds and putbacks. So it stands to reason that Bucher thinks the ideal team for David Lee is a team with a bunch of inefficient scorers. Fast-paced with a bunch of inefficient scorers… sounds a lot like the early 2008/9 Knicks.

    Who are the 17 players ranked ahead of Lee, btw???????

  12. I am honestly flabbergasted at Bucher’s list.

    Iverson, while it’s moronic, at least it is to be expected from dumb sportswriters. Gooden, though – wow.

  13. Re: Gooden

    Odds are, if you’ve been traded away from Cleveland since LeBron came, you’re not good.

