Thursday, October 18, 2007

How bad was Cincinnati last year: Pythagoras weighs in

A while back, baseball researcher Bill James developed a simple formula to estimate winning percentage for a team (called pythagorean record or expectation). The formula is still around today (although there have been some minor improvements) and is right here:

winning percentage = (runs scored^2) / ((runs scored^2) + (runs allowed^2))

Quite simply, the relationship between runs scored and runs allowed is a very good indicator of what a team's record should be. Large deviations from that expected record are generally considered to involve a good deal of luck. For example, when a baseball team scores, say, 800 runs and allows 800 in a year, you'd figure they'd be around a .500 team, or 81-81 over 162 games. If they go 90-72, it's safe to say that some of their success was luck (mostly in the form of performing well in close games, which inherently involves more luck and randomness).
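To make the 800/800 example concrete, here is a minimal Python sketch of the formula. The function name `pythagorean_pct` is my own label for illustration, and the 162-game season is simply 90 + 72 from the example.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2):
    """Estimated winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# The example from the text: 800 runs scored, 800 runs allowed.
games = 90 + 72  # a 162-game season
expected_wins = pythagorean_pct(800, 800) * games

# "Luck" here is just actual wins minus pythagorean wins.
luck = 90 - expected_wins

print(expected_wins, luck)  # 81.0 9.0
```

So a 90-72 team with an even run differential comes out about 9 wins ahead of its pythagorean record.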

This concept can be applied to other sports as well. In college basketball, the only thing that changes is the exponent: it appears to work better at 10 instead of 2. Also, because of the imbalance in competition, most analysts calculate pythagorean record using conference games only. With only 16 conference games (in the Big East), that doesn't leave us with a very large sample to test. However, it can shed some light on which teams underperformed and which ones overperformed.
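Here is a rough sketch of what changing the exponent does. The point totals below are made up for illustration (a hypothetical team scoring 70 a game and allowing 65 over 16 conference games); they are not from the spreadsheet.

```python
def pythagorean_pct(points_for, points_against, exponent):
    """Pythagorean winning percentage with a configurable exponent."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)

# Hypothetical team: 70 points scored and 65 allowed per game, 16 games.
games = 16
points_for = 70 * games      # 1120
points_against = 65 * games  # 1040

pct_baseball_style = pythagorean_pct(points_for, points_against, exponent=2)
pct_basketball_style = pythagorean_pct(points_for, points_against, exponent=10)

print(round(pct_baseball_style, 3), round(pct_basketball_style, 3))
```

The same scoring margin translates into a much higher expected winning percentage with the larger exponent, which fits the idea that a five-point average edge means more in basketball than the raw ratio of points suggests.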

Here's a spreadsheet I made with BE pythagorean records from 06-07.

From left to right, we've got points for, points against, point differential, wins (just plain old in-conference wins), pythagorean winning percentage, pythagorean wins, regular wins minus pythag wins.
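For anyone who wants to rebuild those columns, here is a minimal sketch. The function name `team_row`, the dictionary keys, and the sample team are all hypothetical; the exponent of 10 and the 16-game schedule come from the discussion above.

```python
def team_row(points_for, points_against, wins, games=16, exponent=10):
    """Build one spreadsheet row: differential, pythag pct, pythag wins, luck."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    pythag_pct = pf / (pf + pa)
    pythag_wins = pythag_pct * games
    return {
        "points_for": points_for,
        "points_against": points_against,
        "point_diff": points_for - points_against,
        "wins": wins,
        "pythag_pct": pythag_pct,
        "pythag_wins": pythag_wins,
        # Positive means "lucky", negative means "unlucky".
        "wins_minus_pythag": wins - pythag_wins,
    }

# Made-up team: dead-even point differential, but 10 conference wins.
row = team_row(points_for=1100, points_against=1100, wins=10)
print(row["pythag_wins"], row["wins_minus_pythag"])  # 8.0 2.0
```

A team like this made-up one would show up as +2 in the last column.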

The three "luckiest" teams were:
St. John's: +2 wins
Marquette: +1
DePaul: +0.6

The three "unluckiest" teams were:
Cincinnati: -1.5
UConn: -1.1
Seton Hall: -1

You can see there isn't a large difference between pythagorean wins and regular wins overall. It does its job. Also, the variation within 16 games is certainly more limited than in baseball or the NBA, simply because there are so few games.

Nonetheless, it's good news that UC was unlucky, I suppose. They "should" have won 3.5 games instead of the measly 2 that they picked up. That's still of course not good, but it's a better starting point than 2 is for next year.

We can also see how the BE kind of split into four sections last year. You had the top four clearly defined with GU, Louisville, Pitt, and ND. Then the good but not great pack, ranging from Syracuse to Marquette. Then the bad but not horrible guys: UConn, St. John's, and Seton Hall (DePaul sat between those two groups, pretty much average). And then the trailers with UC, Rutgers, and South Florida.

With the large turnover in rosters, this is of course limited as a tool for future projections. However, it gives us a place to build on that is probably more reliable than basic wins and losses. Now Cincinnati has to improve by 4-5 games to get into the .500 range (8 wins out of 16) instead of 6. Of course, there are things that can be explored further, like "clutch shooting" or playing well late in games, that could essentially allow a team to beat pythagoras and use skill, rather than luck, to win more (or fewer) games than expected. My expectation is that these things don't really exist, or aren't real transferable skills. However, I haven't seen research in basketball on those topics. Then again, I've admittedly not followed basketball research closely over the last few years. We'll get into that stuff.
