Monday, October 22, 2007

The basis of this basketball analysis

As I mentioned in my first post, I am no expert in advanced basketball analysis. It appears, however, that much of the analysis is based off one important concept: estimating possessions.

Estimating possessions is important because it puts teams on equal footing. Let's take a made-up example, with numbers for illustration only. One offense gets about 67 possessions a game and scores 63 points a game. Another team gets 77 possessions a game and scores 72 points a game. If you're looking at the basic stats, the second offense (the one that scores 72 a game) looks clearly better. However, once you realize that it gets more possessions, i.e., more opportunities to score per game, you see that it really isn't the better offense. The first team scores .940 points per possession, while the second scores .935. In the end they're very similar, of course, but the first team is probably a little better offensively (or at least more efficient). It's the second team's faster-paced style that allows it to score more points per game.
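Here's a quick sketch of that comparison in Python, using the made-up numbers from the example above. The function name is just something I picked for illustration.

```python
def points_per_possession(points, possessions):
    """Points scored divided by possessions used."""
    return points / possessions


# Made-up per-game numbers from the example above
slow_team = points_per_possession(63, 67)
fast_team = points_per_possession(72, 77)

print(f"Slower team: {slow_team:.3f} points per possession")
print(f"Faster team: {fast_team:.3f} points per possession")
```

The faster team scores more per game, but the slower team comes out slightly ahead once you divide by possessions.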

On Ken Pomeroy's site, this is the definition he gives for estimating a team's possessions: field goal attempts - offensive rebounds + turnovers + .475 * free throw attempts. Doing this with Cincinnati in 2007, they had about 1,977 possessions in 30 games, or about 66 a game. That's right around the NCAA average of 67. (For some reason, he gets 65 possessions per game for Cincy, while I get 66. I believe it may have something to do with adjusting for overtime games, but I'm not sure. Anyway, it's not a huge deal, but I may look into it in the future.) That put Cincy at about 98 points per 100 possessions, which ranks them 236th out of 336 Division I teams. However, after Pomeroy adjusts that number (I'm guessing for competition, i.e., playing in the Big East), they end up ranking 138th in offensive efficiency. That's another story, though.
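The possession formula above is easy to turn into a small function. This is only a sketch: the function and argument names are mine, and the season totals plugged in below are hypothetical placeholders, not Cincinnati's actual box-score numbers. The only real figures are the 1,977 possessions and 30 games mentioned above.

```python
def estimate_possessions(fga, orb, tov, fta, ft_weight=0.475):
    """Possessions = FGA - offensive rebounds + turnovers + 0.475 * FTA."""
    return fga - orb + tov + ft_weight * fta


def points_per_100(points, possessions):
    """Offensive efficiency expressed per 100 possessions."""
    return 100 * points / possessions


# Hypothetical season totals, for illustration only
poss = estimate_possessions(fga=1700, orb=350, tov=400, fta=600)
print(f"Estimated possessions: {poss:.0f}")
print(f"Points per 100 possessions: {points_per_100(2000, poss):.1f}")

# Per-game pace from the figures quoted above: 1,977 possessions in 30 games
cincy_pace = 1977 / 30
print(f"Cincinnati possessions per game: {cincy_pace:.1f}")
```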

And, of course, it doesn't stop with points. You can look at blocks, rebounds, turnovers, etc. per possession and adjust for a team's style of play. Is Air Force really great at not turning the ball over year in and year out (I don't even know if they are -- I'm assuming their TO numbers are relatively low), or is it just because they get 10-15 fewer possessions per game than the average team? Once you look past the regular numbers and adjust for a team's context, you can get a better feel for how they play the game. A lot of the analysis on this blog will be based on per-possession stats.

Thursday, October 18, 2007

How bad was Cincinnati last year: Pythagoras weighs in

A while back, baseball researcher Bill James developed a simple formula to estimate winning percentage for a team (called pythagorean record or expectation). The formula is still around today (although there have been some minor improvements) and is right here:

(runs scored^2) / ((runs scored^2) + (runs allowed^2)) = winning percentage

Quite simply, then, the relationship between runs scored and runs allowed is a very good indicator of what a team's record should be. Large deviations from it are generally considered to involve a large degree of luck. For example, when a baseball team scores, say, 800 runs and allows 800 in a year, you'd figure they'd be around a .500 team. If they go 90-72, it's safe to say that some of their success was indeed luck (mostly in the form of performing well in close games, which inherently involve more luck and randomness).

This concept can be applied to other sports as well. In college basketball, the only thing that changes is the exponent -- it appears it needs to be 10 instead of 2. Also, because of the imbalance in competition, most analysts calculate pythagorean record with conference games only. With only 16 in-conference games (in the Big East), that doesn't leave us with a very large sample to test. However, it can shed some light on which teams underperformed and which ones overperformed.
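As a rough sketch, the formula with an adjustable exponent looks like this in Python. The 800/800 baseball case comes from the example above; the basketball point totals are hypothetical and only there to show how much more the exponent of 10 spreads things out.

```python
def pythagorean_pct(scored, allowed, exponent=2):
    """Expected winning percentage from points (or runs) for and against."""
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


# Baseball example from above: 800 runs scored, 800 allowed
baseball = pythagorean_pct(800, 800)
print(f"Baseball: {baseball:.3f}")

# Hypothetical conference totals over 16 games, exponent 10 for college hoops
hoops = pythagorean_pct(1100, 1050, exponent=10)
print(f"Basketball: {hoops:.3f}, or about {hoops * 16:.1f} wins in 16 games")
```

"Luck" in this sense is then just actual wins minus those pythagorean wins.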

Here's a spreadsheet I made with BE pythagorean records from 06-07.

From left to right, we've got points for, points against, point differential, wins (just plain old in-conference wins), pythagorean winning percentage, pythagorean wins, regular wins minus pythag wins.

The three "luckiest" teams were:
St. John's: +2 wins
Marquette: +1
DePaul: +.6

The three "unluckiest" teams were:
Cincinnati: -1.5
UConn: -1.1
Seton Hall: -1

You can see there isn't a large difference between pythagorean wins and regular wins, overall. It does its job. Also, the variation within 16 games is certainly more limited (than in baseball, or the NBA, and so on) because of the shortage of games.

Nonetheless, it's good news that UC was unlucky, I suppose. They "should" have won 3.5 games instead of the measly 2 that they picked up. That's still of course not good, but it's a better starting point than 2 is for next year.

We can also see how the BE kind of split into four sections last year. You had the top four clearly defined with GU, Louisville, Pitt, and ND. Then the good-but-not-great pack, ranging from Syracuse to Marquette. Then the bad-but-not-horrible guys: UConn, St. John's, Seton Hall. DePaul was pretty much average. And then the trailers: UC, Rutgers, and South Florida.

With the large turnover in rosters, this is of course limited as a tool for future projections. However, it gives us a place to build from that is probably more reliable than basic wins and losses. Now Cincinnati has to improve by 4-5 games to get into the .500 range, instead of 6. Of course, there are things that can be explored further, like "clutch shooting", playing well late in games, etc., that could essentially allow a team to beat Pythagoras and use skill, rather than luck, to win more (or fewer) games than expected. My expectation is that these things don't really exist, or aren't real transferable skills. However, I haven't seen research in basketball on those topics. Then again, I've admittedly not followed basketball research closely over the last few years. We'll get into that stuff.

Friday, October 12, 2007

Some simple adjustments to some simple stats

Forget about me saying that I wouldn't make any posts until the season starts ...

Anyway, the first problem about the stats printed on many websites is that they are based on per game averages. That's all fine and dandy, but how fair is it to compare one guy to another when one guy plays 20 minutes a game and the other plays 35? So, I threw the stats on the UC website into a spreadsheet and quickly calculated some averages per 40 minutes, instead of per game.
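The per-40 adjustment itself is just a rescaling. Here's a minimal sketch with two hypothetical players (the numbers are made up, not taken from the UC stats):

```python
def per_40(total, minutes):
    """Scale a counting stat to a 40-minute basis."""
    return 40 * total / minutes


# Hypothetical players: a reserve at 10 points in 20 minutes a game,
# and a starter at 14 points in 35 minutes a game
reserve = per_40(10, 20)
starter = per_40(14, 35)

print(f"Reserve: {reserve:.1f} points per 40")
print(f"Starter: {starter:.1f} points per 40")
```

The starter has the higher per-game average, but the reserve scores at a faster rate once minutes are equalized.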

Here's the spreadsheet from Google Docs

And here are the top 5 in a few categories (at least 200 total minutes)

Points per 40 minutes
Vaughn 17.6
Williamson 17.5
Sikes 12.9
McGowan 11.6
Gentry 11.5

Rebounds per 40
Williamson 9.5
McGowan 7.1
Sikes 6.8
Barwin 5.3
Warren 5.1

Steals minus turnovers per 40
Gentry -.11
Barwin -.67
Vaughn -.73
Warren -.78
Crowell -.96

That last "stat" I just made up. A positive number is of course better, as you want to have more steals than TOs if possible. I figured that if you fault someone for committing turnovers, you also have to credit them for creating TOs. A guard may have more opportunities for TOs, but he also has more opportunities for steals, so it makes things somewhat equal between guards and big men. Anyway, take it with a grain of salt ...

It's pretty surprising how good Gentry was in that category, though. He averaged 1.8 TOs and 1.7 steals per 40 minutes. He was the only Bearcat even close to an even 0. Also, check out his in-conference numbers (1.7 steals, 1.4 TOs) ... he actually improved against Big East teams. Actually, for some reason, most UC players improved in this category versus BE teams ... kinda strange. Gentry, though, is the only one with a positive number.
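For what it's worth, here's the arithmetic behind Gentry's numbers as a tiny sketch. The function name is mine. The rounded inputs give -0.1 overall rather than the -.11 in the list, presumably because the spreadsheet works from unrounded figures.

```python
def steals_minus_turnovers(steals_per_40, turnovers_per_40):
    """The made-up stat: steals per 40 minus turnovers per 40."""
    return steals_per_40 - turnovers_per_40


# Gentry's rounded per-40 numbers quoted above
overall = steals_minus_turnovers(1.7, 1.8)
in_conference = steals_minus_turnovers(1.7, 1.4)

print(f"Overall: {overall:+.1f}")
print(f"In conference: {in_conference:+.1f}")
```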

I calculated effective field goal % (eFG%), explained here. The formula is (fg + .5*3P)/fga. It simply adjusts for the fact that a 3 pointer is worth one more point than a regular field goal. Here's the top 5:

Williamson 50%
Sikes 50%
Crowell 49%
McGowan 47%
Vaughn 45%
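The eFG% formula above can be sketched like this. The two shooters are hypothetical, chosen to show that making 4 of 10 threes is worth the same as making 6 of 10 twos (12 points either way).

```python
def efg_pct(fg, three_p, fga):
    """Effective field goal percentage: (FG + 0.5 * 3P) / FGA."""
    return (fg + 0.5 * three_p) / fga


# Hypothetical shooters, 10 attempts each
three_point_shooter = efg_pct(fg=4, three_p=4, fga=10)  # 4 made threes
inside_scorer = efg_pct(fg=6, three_p=0, fga=10)        # 6 made twos

print(f"Three-point shooter: {three_point_shooter:.0%}")
print(f"Inside scorer: {inside_scorer:.0%}")
```

Regular field goal percentage would rate these two at 40% and 60%, which is exactly the gap eFG% is meant to close.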

Thursday, October 11, 2007


Well, I was just looking for some Bearcats blogs to catch up on the recruiting and the outlook for the 08 team. Turns out, I couldn't really find any active ones ... maybe I'm not looking in the right place, but I was pretty surprised, nonetheless.

As for me, I've been a fan of UC hoops for about a whopping 6-7 years. I can't really remember the exact moment I started following them, but I believe it was the night of a Xavier game (where they lost a close one) early in the 00-01 season. The next year I pretty much followed every game and was really into the team. My favorite player is/was Steve Logan and my favorite "moment" was the win against Marquette in 01-02, where Logan hit a 3 late and then dished off to Donald Little for the win moments later.

Anyway, there are a couple of reasons why I put "outsiders perspective" under the title. For one, I don't go to/didn't go to Cincinnati. I live miles and miles away and have only witnessed one game live (a win in Syracuse two years ago). I'm not a basketball expert at all and I don't really keep up with recruiting like I should (or really at all). So, if you come upon this blog looking for some Bearcats stuff, you may get it, but there's a very, very good chance you'll know a lot more about the team than me. Keep that in mind.

You might wonder why I'm making a blog then ... Well, like I said, there really isn't one out there that's active (outside of Katzowitz's, which is football too ... but seems pretty good). Also, I kind of like the "sabermetric" side of sports analysis (specifically baseball) and I figured I could learn a little about how that applies to basketball here on this blog. And, make no mistake, I'm a pretty big fan. I watch every game that's on and listen to quite a few on the internet/radio feed.

As for this blog, if you do happen to stumble upon it, don't expect too much until around the start of the season. If I get some time, I may take a look back at some things last year, but I'm not sure how many posts I'll make here in these first few weeks. I'll put a bunch of links up on the sidebar in the coming days, both for me to catch up on and for you to browse if you haven't checked them out.

Thanks for stopping by. Feel free to leave a comment, especially as I get this thing going. It's going to be a big bounce back year ... right!?