By The Numbers: Hockey Analytics... the Final Frontier. Explore strange new worlds, to seek out new algorithms, to boldly go where no one has gone before.

Is there an equivalent of a "Moneyball" for the NHL?

The core thesis of Moneyball was that baseball's conventional wisdom for evaluating players had created inefficiencies in the marketplace, which left some players undervalued. By exploiting those inefficiencies and digging into undervalued stats, a GM could build a competitive team while staying within a budget.

The same guy who wrote Moneyball, Michael Lewis, wrote a story about NBA player Shane Battier in the New York Times a while back that exposed the lack of good defensive stats in the NBA. Shane Battier is in many ways a basketball player who defies conventional wisdom, because he doesn't load up on the stats we think of when we think of a good basketball player. In short, there are also inefficiencies in the market for pro basketball players, particularly defensive ones, because there are no reliable stats to measure their performance. You can find it here: http://www.nytimes.com/2009/02/15/ma...Battier-t.html

So that got me thinking: there have got to be some inefficiencies in the market for hockey players that create undervalued hockey players, right? Anyone want to take a shot at this question?

When Mike Gillis took over the GM position of the Canucks, he said from the beginning that he was going to closely follow the "moneyball" strategy in his management. He never went into any real detail, but he's taken the team into a playoff position so we'll see how it pays off.

I'm sure there are inefficiencies, but the biggest problem in finding them is the lack of raw statistical material to work with, especially with regard to players outside the NHL. For those guys, you won't find much more than scoring and penalty stats, and possibly plus-minus. Imagine Billy Beane and his staff trying to find college players the scouts had missed if the only stats he had to work with were, say, games played, hits, runs, and errors.

And it's not really likely to change that much - the cost and effort involved in getting lower-level leagues to record things like playing time, blocked shots, and so on is prohibitive.

There's also the fundamental constant-motion nature of the sport vs. a more static one like baseball. Moneyball talks about the guys who are developing better fielding metrics by breaking down things like where the ball is hit and how hard. And that's useful because there's a limited number of other parameters to consider in evaluating how a fielder responds to that ball.

But in hockey, everything one player does is relative to the other 11 guys on the ice. It's tough to derive useful stats when every action a player takes is subjective based on the circumstances around him. Let's say you decide to start recording dump-in recoveries by offensive players - is it reasonable to value them all equally, when the degree of difficulty of the recovery is so dependent on where the puck was shot from and went to on the dump, and how the defense was positioned?

So the development of hockey statistics generally continues to revolve around the processing of the handful of primary stats we have. And while there's some interesting work being done with them, there's really nothing as profound and practically useful as the stuff people like Bill James or Voros McCracken have come up with using the array of less-subjective baseball stats available.

take it for what it's worth but I'm actually onto something that's "moneyball"-like, or rather sabermetrics-like

I don't treat it the same way Bill James does it, although I'm working on similar ideas

my model uses some nonlinear equations, some linear equations and some statistical theorems; my problem is right now I don't have access to a lot of statistics, I spoke about it to a stats professor and he said he'd see what he can do to help me out

btw, a lot can be made only with games played and points as the previous poster pointed out; and by a lot, I mean most of the work

My all-time favorite blogger/writer, Tom Benjamin, had a great discussion on this a couple of years ago. If you search through his blog for Poisson, you can find some other posts on the topic of statistical validity in a game that's generally governed by randomness; and even a discussion on the "clutch" factor. Here's an excerpt:

Quote:

The assumption is that goal scoring can be modelled by something called a Poisson distribution. That’s a fancypants way of saying that goals are random events even though they don’t meet the strict definition for randomness. Most fans have difficulty accepting that idea probably because it means that when we say that the good guys played well, we really mean that they were favoured by randomness.

This is not to say that ability does not count because it counts a great deal. What it means is that ability and try are more or less constant from shift to shift and game to game. The results achieved in any given shift or game, therefore, are determined by the external - essentially random - factors. I like to think of it as a coin flipping contest with the coin weighted in favour of the better player or team.

This assumption flies in the face of many hockey myths around things like clutch play and the apparent ability some players have to rise to the occasion. The fact is that somebody has to come through in the clutch and that somebody is randomly selected by the hockey gods. This idea makes us feel uncomfortable because it is disturbing to realize that so many things in life are beyond our control. It means hockey - indeed life - in the short run is about luck and probabilities. Skill only outs in the long run and even a season is a relatively short time period.

Make sure you read the reader comments as his mathematician friend responds to some reader queries and ideas.

(There's also a gaggle of Oilers bloggers who routinely dive into this stuff. MC79hockey, a huge fan of sabermetrics if I recall, is one, and Irreverent Oil Fans @ http://vhockey.blogspot.com/ is another.)

I hope this piques the interest of BOHB moderator Fourier....
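
For anyone who wants to play with the "weighted coin" idea from the excerpt, here is a rough Python sketch. It assumes each team's goals in a game are an independent Poisson draw with a fixed rate. The function names and the two scoring rates are made up for illustration and do not come from Tom Benjamin's post or from real team data.

```python
import math
import random


def poisson_sample(rate, rng):
    """Draw one Poisson-distributed goal count (Knuth's multiplication method)."""
    threshold = math.exp(-rate)
    goals = 0
    product = rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals


def simulate_results(rate_a, rate_b, games, seed=0):
    """Simulate games; return the shares won by A, won by B, and tied."""
    rng = random.Random(seed)
    wins_a = wins_b = ties = 0
    for _ in range(games):
        goals_a = poisson_sample(rate_a, rng)
        goals_b = poisson_sample(rate_b, rng)
        if goals_a > goals_b:
            wins_a += 1
        elif goals_b > goals_a:
            wins_b += 1
        else:
            ties += 1
    return wins_a / games, wins_b / games, ties / games


if __name__ == "__main__":
    # Hypothetical scoring rates in goals per game, chosen only for illustration.
    share_a, share_b, share_tied = simulate_results(3.2, 2.6, games=20000)
    print(f"better team wins {share_a:.1%}, loses {share_b:.1%}, ties {share_tied:.1%}")
```

Even with a clear edge in scoring rate, the weaker team still takes a good share of single games under this assumption, which is the point about skill only showing up in the long run.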

Thanks for that article. It's something that I've felt for a while now but it's nice to see it quantified in some way.

The question isn't "can we model hockey exactly as we model baseball?"

The actual question is "can a statistical model of NHL hockey improve our organization's decision making" and the answer to that question is "yes" in my opinion.

Think of how high the error rate is in the NHL draft. If you could decrease your organization's error rate by 15% in the draft, that would be a valuable thing in the long run.

One stat I would like to see added to hockey is a +/- combined with PIM: basically, how many goals have been scored with you in the box.

that seems like a useless stat for anyone but penalty killers. The player in the box has no effect on the penalty kill, the goalie's performance, or the PP's performance unless that player is a penalty killer.

i think people overvalue emotional impact. they drool over leaders and big hitters like dustin brown, and underrate guys like frolov who quietly are much better players and are going to benefit a team more.

and that's exactly what I'm doing. the model proposed by James is well done for baseball, but you can't just apply the same model to hockey since it would be very hard to do. However, with a few twists here and there and use of other stats tools you can get something decent. My draft success ranges from 80% to 100%, and that takes into account late picks, and the percentages hold going back to 1990. It also tells you to get exceptions like Martin St. Louis, Andrei Markov and Johan Franzen, so it's not only the early picks.

exactly, although dustin brown still figures well in my model, it's clear people overestimate certain aspects of the game like grit and whatnot in favor of a player who "does what he has to do" and you can clearly see that in drafts as well.

thanks for that!

like he says, it's hard for a hockey fan to accept some of the twists that maths and stats theory tell you to make. too often we're tempted to lay our observations in a Procrustean bed, where we try to fit the data to a preconceived idea. It's much harder than doing regular maths, where you simply don't care whether or not the sum of the interior angles of a triangle is 180 degrees; it could be 200 and you couldn't care less.

Sounds good in theory, but what traits are the best predictors of NHL success in 17-18 yr olds? You don't know exactly how much more a guy will grow, or how strong he'll get. Skating is probably one good metric as a comparison amongst peers. If you assume that all teams take the best player available (and usually that is the case) based on their career stats and physical attributes-- you get what we have today. A crapshoot after the first 6 or so picks. I guess what I'm saying is that if it were possible to improve, someone would have done it by now. It's the only time teams have access to players for free, and it remains the foundation for building teams. Maybe it's not so much the drafting, but the development. I think Detroit's success has been in finding players that fit a certain style or possess a specific set of skills, who then are developed in a very specific way.

Back to the baseball vs hockey modeling. I think you're bypassing the question. Is hockey a sport that can be modeled?

I hope this doesn't get lost, but hockey is much more a game of random events. There's simply too much going on the ice with too many variables (number of players, conditions, etc.). Baseball is somewhat static in comparison to hockey.

the development of 17-18 year old skaters tells a lot more than people believe ... for North Americans ... It doesn't tell the whole story for all skaters, but players who do much better or much worse than my model predicts are very rare.

I suspect the Red Wings have such a model, at least for Europe, because a lot of the time when my model tells you to get a certain player, the Red Wings have picked him, from Jonathan Ericsson to Franzen to whoever.

there's still scouting to be done after my model is applied for sure, since my model isn't complete in the first place. I'm sure scouting could help avoid some of the mistakes my model makes, since some players have obvious flaws (like Mike Danton) that would keep a team from drafting them but can't be added to the model yet.

also, my model can't predict high school players and players out of the USHL, much like Billy Beane with his high school players. It's not that they can't be good, but right now a coin flip is more accurate for them than my model.

Moneyball isn't necessarily about stats. It's about finding undervalued players created by inefficiencies in the marketplace. It just so happens that Billy Beane did it by evaluating players based on stats. However, since every team in MLB has adopted it, the strategy isn't as effective anymore, because the reason it worked so well in the first place was that only a few teams were using it.

I don't think that the NHL or NBA would adopt a moneyball-esque strategy based on stats because of the nature of the respective sports, but based on the Lewis article about Shane Battier, good players are slipping through the cracks in sports that you can't measure as accurately with statistics.

I kinda think the NFL has a moneyball method. Just look at the Indianapolis Colts. For a decade, they've been hovering near the NFL hard salary cap and every year they lose seemingly important pieces of their team yet every year they make the playoffs. If you look through their current roster, you'll see a lot of late round picks (save for Peyton Manning) and players that go against conventional football wisdom due to being undersized (Bob Sanders, Dwight Freeney). For the Colts, moneyball isn't so much about stats but finding undervalued players that are suited to playing their particular system.

Sounds good in theory, but what traits are the best predictors of NHL success in 17-18 yr olds?

At Puck Prospectus, Iain Fyffe found that there was a much stronger correlation between a forward's OHL PPG in his draft year and NHL success than his draft position and NHL success. That suggests that something as simple as PPG could potentially improve decision making. If you combined the information the numbers give you with the subjective knowledge you have of potentially fatal flaws in a player's game, you'd probably end up ahead.
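
If you want to check that kind of claim on your own numbers, a minimal sketch might look like the one below. It is not Fyffe's method. It just computes a plain Pearson correlation of each predictor against an outcome. The field names (`junior_ppg`, `draft_position`, `nhl_ppg`) are my own assumptions, NHL points per game is only a stand-in for "NHL success", and the sample records are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def compare_predictors(players):
    """Correlate junior PPG and draft position with the NHL outcome."""
    outcome = [p["nhl_ppg"] for p in players]
    return {
        "junior_ppg": pearson([p["junior_ppg"] for p in players], outcome),
        "draft_position": pearson([p["draft_position"] for p in players], outcome),
    }


if __name__ == "__main__":
    # Made-up records for illustration only; these are not real players.
    sample = [
        {"junior_ppg": 1.60, "draft_position": 12, "nhl_ppg": 0.80},
        {"junior_ppg": 0.90, "draft_position": 5, "nhl_ppg": 0.35},
        {"junior_ppg": 1.30, "draft_position": 40, "nhl_ppg": 0.55},
        {"junior_ppg": 0.70, "draft_position": 25, "nhl_ppg": 0.20},
        {"junior_ppg": 1.10, "draft_position": 60, "nhl_ppg": 0.45},
    ]
    for name, r in compare_predictors(sample).items():
        print(f"{name}: r = {r:+.2f}")
```

An earlier pick is a smaller number, so a useful draft position should show up as a negative correlation. Compare the absolute values to see which predictor is stronger.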

Fugu: Very simple--does the model draft better than most NHL teams? In science results talk and ____ walks. Either the model performs better or it doesn't.

Yes, there is a randomness in hockey. If I'm an NHL GM I'm looking at EVERYTHING that might give my organization a leg up over the competition.

If the model does better, then it represents an improvement and one more piece of information. An NHL team could use it as a tie breaker if players were judged to have similar talent.

The guys at BP point out that organizations can also gain efficiencies by doing things other than using stats. For example the Houston Astros are more willing to give short right handed pitchers a shot in the draft than your average MLB team and that has landed them some good MLB guys. The Atlanta Braves heavily focus just on players from the south--they focus on one area and see those players more often (i.e. larger sample for each player) and they draft pretty well.

In hockey, I'd point out that I think the Buffalo Sabres have an organizational efficiency advantage--they seem more willing to draft smallish forwards--and the Sabres have drafted more NHLers than the average NHL franchise. Now I haven't crunched any numbers on this, but I suspect that Buffalo is exploiting the size fetish most other NHL teams have when it comes to prospects.

no doubt in my mind the Sabres have some form of model that gives them an advantage on smaller forwards. A couple of years ago they said they cut back on scouting and would work more with stats and tapes and since that time they clearly go the small forward route like Ennis that gives them the luxury to let high paid forwards go and keep the low pay guys.

And look where that's got them?

Dominated with mentally and physically soft players and two missed playoff years in a row.

and that's exactly what I'm doing. the model proposed by James is well done for baseball, but you can't just apply the same model to hockey since it would be very hard to do. However, with a few twists here and there and use of other stats tools you can get something decent. My draft success ranges from 80% to 100%, and that takes into account late picks, and the percentages hold going back to 1990. It also tells you to get exceptions like Martin St. Louis, Andrei Markov and Johan Franzen, so it's not only the early picks.

What data did you use to create the model? If you used a sample of historical data to create a model that "predicts" the success of players within that sample, then great, you've fit some equations to a set of numbers. "Predicting" St. Louis, Markov, and Franzen is a lot easier when you've created the model on a sample that includes their numbers.

How well does the model predict results outside of the data sample that created it? Have you run any tests on that? That's the true test of the model's usefulness.
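
To make the in-sample vs. out-of-sample point concrete, here is a minimal sketch of a holdout test split by draft year. It has nothing to do with the actual model being discussed, which hasn't been shown. It fits a simple one-variable least-squares line on drafts before a cutoff and then scores it on the later drafts it never saw. The field names and the sample records are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(slope, intercept, xs, ys):
    """Average absolute gap between predicted and actual outcomes."""
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


def holdout_by_year(records, cutoff_year):
    """Fit on drafts before cutoff_year, then score on the later drafts."""
    train = [r for r in records if r["draft_year"] < cutoff_year]
    test = [r for r in records if r["draft_year"] >= cutoff_year]
    if not test:
        raise ValueError("no drafts left to test on")
    slope, intercept = fit_line(
        [r["junior_ppg"] for r in train], [r["nhl_ppg"] for r in train]
    )
    in_sample = mean_abs_error(
        slope, intercept,
        [r["junior_ppg"] for r in train], [r["nhl_ppg"] for r in train],
    )
    out_of_sample = mean_abs_error(
        slope, intercept,
        [r["junior_ppg"] for r in test], [r["nhl_ppg"] for r in test],
    )
    return in_sample, out_of_sample


if __name__ == "__main__":
    # Invented records for illustration only; these are not real draft results.
    history = [
        {"draft_year": 1998, "junior_ppg": 1.4, "nhl_ppg": 0.70},
        {"draft_year": 1999, "junior_ppg": 0.8, "nhl_ppg": 0.30},
        {"draft_year": 2000, "junior_ppg": 1.1, "nhl_ppg": 0.60},
        {"draft_year": 2003, "junior_ppg": 1.5, "nhl_ppg": 0.50},
        {"draft_year": 2004, "junior_ppg": 0.9, "nhl_ppg": 0.45},
    ]
    in_err, out_err = holdout_by_year(history, cutoff_year=2002)
    print(f"in-sample error {in_err:.3f}, out-of-sample error {out_err:.3f}")
```

If the out-of-sample error is much worse than the in-sample error, the model has mostly memorized the drafts it was built on.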

Quote:

Originally Posted by FSU Seminoles

no doubt in my mind the Sabres have some form of model that gives them an advantage on smaller forwards. A couple of years ago they said they cut back on scouting and would work more with stats and tapes and since that time they clearly go the small forward route like Ennis that gives them the luxury to let high paid forwards go and keep the low pay guys.

I'd be interested to know what their model is and how they use it in their decision-making process. If their model predicts offence alone, the smaller forwards will stand out as bargains, but a team of small forwards will generally be weaker defensively, as defensive ability tends to go along with size and reach.

I'd agree that size has been overrated by NHL teams in the past, but draft models that only take offensive output into consideration will overrate small players as a group at least to some degree, IMO.

At Puck Prospectus, Iain Fyffe found that there was a much stronger correlation between a forward's OHL PPG in his draft year and NHL success than his draft position and NHL success. That suggests that something as simple as PPG could potentially improve decision making. If you combined the information the numbers give you with the subjective knowledge you have of potentially fatal flaws in a player's game, you'd probably end up ahead.

That actually isn't too hard to believe. While PPG isn't the end all be all for a prospect (just ask Corey Locke) there is absolutely something to be said about being able to be a productive player offensively.

A factor that I've been looking at over the past few drafts is the percentage that a player leads his team by in points. Generally if they lead by over 15% on a low scoring team they're worth watching. A guy like Landon Ferraro in this draft is going to be underrated looking strictly at PPG, but he was the Red Deer leader by over 30% this year in terms of scoring and PPG so he's a player on my radar that's likely to have a productive NHL career. Unfortunately I have only been tracking this for a couple drafts now so I don't have much raw data to go on to say how accurate that is or not.

If I can get my programming skills up to par I may be able to make something to help project but until then I'm just going on gut intuition and statistical analysis. If I have time the next few weeks I'll likely add in the 05 draft and tweak my formulas a bit after talking with FSU Seminoles. He had some pretty sound ideas in regards to the draft on this that could definitely add to what I've been trying to do with my own formulas for both the draft and the NHL level.
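
As a starting point, the team-lead percentage could be sketched in Python roughly like this. The post doesn't spell out the exact formula, so this assumes the lead is measured against the second-highest scorer on the same team. The function names and the 15% default threshold wiring are mine, and the "low scoring team" condition is left out because no cutoff for it was given.

```python
def lead_percentage(team_points):
    """Return (leader, percent lead over the team's runner-up in points)."""
    if len(team_points) < 2:
        raise ValueError("need at least two players")
    ranked = sorted(team_points.items(), key=lambda item: item[1], reverse=True)
    (leader, top), (_, runner_up) = ranked[0], ranked[1]
    if runner_up <= 0:
        raise ValueError("runner-up must have at least one point")
    return leader, 100.0 * (top - runner_up) / runner_up


def worth_watching(team_points, threshold_pct=15.0):
    """Flag a team whose leading scorer is ahead by more than the threshold."""
    _, lead = lead_percentage(team_points)
    return lead > threshold_pct


if __name__ == "__main__":
    # Invented point totals for illustration only.
    team = {"Player A": 65, "Player B": 50, "Player C": 40}
    leader, lead = lead_percentage(team)
    print(f"{leader} leads his team by {lead:.0f}%")
```

The same function works with PPG instead of raw points if you pass per-game values in the dictionary.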

Could an argument be made that basketball would be easier to model than football? From a play-by-play standpoint, considering the position types involved in each play.

A discussion for another thread perhaps.

Modeling hockey may indeed be the most difficult by comparison; however, this is about using the output, the stats, to gain a value-to-skill advantage.