04-29-2012, 12:49 AM
  #57
Czech Your Math
Quote:
Originally Posted by plusandminus
It's not so easy to explain. Basically I focus on the opponent's GA.

Yet, as you acknowledge, schedule likely does matter and can sometimes alter scoring stats by say 5-8 % or so. It is common here to compare players. If a player scored 4 % more points than another. But there seem to be no attention being paid to things like schedule.
If I remember right, it was fairly common to see seasonal top-ten scoring lists being altered. In some case(s) I even think it affected the leading scorer (Art Ross winner) of the season.
I was referring to your comment that you had done some work on scoring from one season to the next. Was that work also primarily focused on the effects of schedule?

As I said before, I do remember at least some of your post(s) on the effect of schedule on team/individual scoring. I thought at the time that your work was worthwhile and your methodology seemed sound. Thank you for the additional explanation; it only reinforces my earlier opinion of your work.

This IMO is the type of effect that, once perfected, should be incorporated into NHL adjusted statistics as a standard step. You say the effects were as much as 5-8%, which is a significant amount. However, I'm guessing such large effects are mostly limited to teams that were extremely high/low scoring and/or to eras when the schedule was very unbalanced (mainly the '70s-'80s). An example of this would be the '80s Smythe Division.

How do you properly isolate the effects, given the following:

You say you (wisely) removed the games in which Edmonton's opponents played Edmonton, effectively adjusting the team goal data for the opponents. However, isn't the opponents' data still biased to some degree due to the unbalanced schedule? I.e. if you removed from the Kings' data those games in which they played against Edmonton, how did you account for the fact that, due to the unbalanced schedule, the Kings may still have played a remaining schedule of high or low scoring teams (I would guess the former, if either)? It seems like repeating the process would soon reach a limit where no further adjustment would substantially change the results. Did you look at this factor? If so, what effect did you find and how did you further adjust for it?
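To make the "repeat the process until it stops changing" idea concrete, here is a minimal sketch in Python. It is not plusandminus' method, just one way such an iteration could look. The function name, the input layout (two rows per game) and the simple multiplicative model are all assumptions made for illustration.

```python
def schedule_adjusted_ratings(games, max_rounds=100, tol=1e-9):
    """Estimate offence/defence multipliers relative to league average.

    games: list of (team, opponent, goals_for) rows, two rows per game.
    Assumes every team appears in both roles, scores at least once and
    allows at least one goal (otherwise a division by zero can occur).
    """
    teams = sorted({row[0] for row in games})
    league_avg = sum(g for _, _, g in games) / len(games)
    offence = {t: 1.0 for t in teams}
    defence = {t: 1.0 for t in teams}  # above 1.0 = allows more than average

    for _ in range(max_rounds):
        previous = dict(offence)
        # Offence: goals scored, discounted by how leaky each opponent is.
        for t in teams:
            rows = [(g, opp) for team, opp, g in games if team == t]
            offence[t] = sum(g / defence[opp] for g, opp in rows) / (len(rows) * league_avg)
        # Defence: goals allowed, discounted by how strong each scorer is.
        for t in teams:
            rows = [(g, team) for team, opp, g in games if opp == t]
            defence[t] = sum(g / offence[team] for g, team in rows) / (len(rows) * league_avg)
        # Stop once another round no longer moves the offence ratings.
        if max(abs(offence[t] - previous[t]) for t in teams) < tol:
            break
    return offence, defence
```

Each round re-rates every team against opponents that have themselves already been re-rated, which is the second-order effect asked about above. Convergence is not guaranteed for every schedule, so the round limit is a safeguard rather than a formality.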

Quote:
Originally Posted by plusandminus
Yes, criticism is usually better than silence (although blunt and discouraging one-liners can be an exception). The end result is usually what I'm after, so suggestions on how to improve things are welcome.
What's ITT?
Any publicity is good publicity, since it leads to the possibility of more people being exposed to the project. Also, perhaps more importantly, we can learn about flaws in each study and ways to improve it.

ITT = in this thread

Quote:
Originally Posted by plusandminus
(This is not reserached yet. -->) For example, during seasons with a lot of powerplays, scoring within teams may look differently than seasons with little powerplays. Seasons with much powerplay may lead to power play specialist scoring points on a higher percentage of goals, than otherwise. Scoring during even strength appear to be much more balanced between players on a team.
There is definitely a power play effect. This can be seen in many studies, including this one. Just as it seems proper that schedule should be a standard adjustment, adjusting team/individual data for even strength vs. special teams scoring seems like it should someday be standard as well. However, just as with the "assist per goal ratio", there is no standard for what the proper "even strength to special teams" ratio of scoring should be, since it varies over time.

One benefit of the type of study I presented is that it captures several effects without explicitly defining and measuring them directly:

- changes in roster size (and therefore average ice time)
- changes in power play opportunities (and therefore changes in scoring within a team)
- changes in general strength of era
- changes in the distribution of talent within the league, primarily the depth of strength of forward talent (if scoring % changes are uneven among different types of skaters)

Quote:
Originally Posted by plusandminus
I think I have posted table showing things like (made up):
Season    1st    2nd    3rd    ...   15th
1984-85   40.2   36.3   33.0   ...   12.5
1985-86   40.7   35.9   32.5   ...    9.6
where 1st is the average for the leading scorer on each team. 2nd is the average for 2nd best scorer on each team. And so on...
I've also posted the above but with factual, as well as adjusted, stats.
I have done similar things, for example looking at what the top 3 scorers on each team averaged and what different tiers of scorers averaged, both in comparison to league averages.

I found this interesting, but it seems to be much more dependent on other factors which are not easily removed, such as the quality and distribution of talent in the league.
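For anyone wanting to build this kind of table themselves, a rough sketch in Python follows. The function name and the input layout (a dict of team to player point totals) are assumptions for illustration only, not how either of us actually compiled the numbers.

```python
def average_by_team_rank(team_points, depth):
    """Average points of the 1st, 2nd, ... best scorer across all teams.

    team_points: dict mapping team name -> list of player point totals.
    Returns a list of length `depth`; entry k is the average for the
    (k+1)-th best scorer, or None if no team has that many players.
    """
    ranked = [sorted(points, reverse=True) for points in team_points.values()]
    averages = []
    for k in range(depth):
        # Only teams with at least k+1 listed players contribute to rank k+1.
        values = [team[k] for team in ranked if len(team) > k]
        averages.append(sum(values) / len(values) if values else None)
    return averages
```

Running it once per season, on raw and then on adjusted point totals, would give one table row per season of the kind shown in the quote.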

Quote:
Originally Posted by plusandminus
If I remember right, some thought the schedule adjusted stats still didn't do the 1980s players total justice (based on "eye-test").

Then guys like Canadiens1958 seem able to tell us about how coaching and roster sizes has changed over the years. To take an extreme example, let's compare today's NHL with the NHL where some players played 60 minutes per game (if I remember right).

By the way, adjusted points hasn't really been on my mind during the last months.
As I said, the "eye test" is a familiar but inherently subjective, and therefore flawed, way to evaluate such results. In the case of your study, how would one even be able to say that the results "look right"?

Roster sizes changed and should somehow be adjusted for (either directly or indirectly), but the distribution of ice time likely changes disproportionately when the roster sizes change.

I haven't really been working on this project for several months, which is one reason I wanted to present it before I was less clear on some of the aspects of the study.

Quote:
Originally Posted by plusandminus
I think what you have done is one piece of the puzzle, but to get it the "whole picture" needs to be integrated with other pieces.
I started studying the year-to-year changes, but found that I wanted to include more things in the equation. Age is one of those things.
I think it was during the best defenceman project that I did a fairly advanced study on strength of different seasons. I don't remember the details right now, but I think the strongest season for defencemen appeared to be around 1981. I think I didn't post it, or possibly posted it but deleted it. (It probably was yet another of those cases where people on one hand were constantly doing more or less arbitrary adjustments within their heads, but on the other hand didn't find a study trying to determine it to be of much value.)
Studying defensemen sounds like a difficult way to measure the strength of a season, which may make your results especially interesting. I have thought that looking at goalies would be another way to examine strength of season, but I would also guess that the small number of goalies (esp. in earlier eras) would yield a very small sample and less reliable data.

Quote:
Originally Posted by plusandminus
Thank you. I do enjoy studying stats and doing research to try to find out "how things really are (or may be)". Part of my problems may also be that I think that some things (like strength of eras, etc., etc.) ought to be "settled" and might require partly narrow studies to build upon.
You're welcome. I think we all want things to be "settled", but on issues as complex as strength of era, I don't expect things to be "settled" any time soon.

One reason I believe regression would work so well is that it can not only attempt to measure many variables simultaneously, but also produce estimated coefficients for those variables and indicate which variables have an insignificant effect (at least in comparison to the error which they add).
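As a rough illustration of the regression idea, here is a minimal ordinary least squares sketch in plain Python. It only estimates coefficients; the standard errors needed to judge which variables are insignificant are left out. The function name is hypothetical, and the predictors (for example roster size or power play opportunities) are whatever columns one chooses to supply.

```python
def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: list of predictor lists, one per observation.
    targets: list of observed values, same length as rows.
    Returns [intercept, coef_1, coef_2, ...].
    """
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    n = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("predictors are collinear or data is insufficient")
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]
```

With real data one would more likely use a statistics package, which also reports the standard error of each coefficient.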

Quote:
Originally Posted by plusandminus
I've been more interested in building upon your win % thinking that Overpass' thread on adjusted +/- developed into. I spent quite some time integrating SH and PP play into the study. I even "adjusted" for goaltending, which (goaltending) I think is among the most overlooked things when focusing on +/-. I was planning on posting a thread on it. I posted a small example, but got discouraging replies, got the impression that no matter how thorough and/or complete the study would be, it would just not affect the already made up minds on how things are.
Goaltending obviously influences +/- in a dramatic way. Does adjusted plus-minus factor in goaltending at all? I'm guessing it doesn't, which (if I'm correct about this) would be an instance of making an assumption out of practicality (much more work in an attempt to remove an effect which may be more random than significant to the results).

I haven't looked at even strength win% in some time either. I think I've posted the last thoughts and formulas I had on the matter. It definitely seemed to produce some good results; I'm just not sure of the limits of its accuracy. I think the eventual end results, when combined with special teams data, could produce something similar to HR's "point shares".

Quote:
Originally Posted by plusandminus
During the last 1-2 months, I've studied how team performance is affected when a player is out of the team (for example being injured). To me very interesting. I posted a chosen example showing that Pavol Demitra actually significantly made his team perform far better with him playing than when his out injured. Not during one team during one season, but season after season on 4-5 different teams. No interest whatsoever, apart from one comment more or less automatically dismissing the study.
(In the "best defencemen" project, there sometimes were mentioning of how a team performed when a player (don't remember if Eddie Shore or Sprague Cleghorn) played or not. I have done that for every player on every team since 1987-88 to 2010-11. In the project, this stat was considered meaningful, even if there was no comparison at all made to other players. When I do it, it's considered uninteresting or meaningless.)
To me, it's amazing to see Lidstrom place very highly, with his team being nearly average with him not on the team (and this not even including 2011-12, and even not counting games during end of regular season where Detroit rested players).
I would have pointed out that Gretzky didn't seem to make LAK better during the regular seasons, something that meets my own eye-ball test. But how ridiculed would I be if posting something like that?

Both of the above studies have a holistic approach, which I find is a good way to go. Compare team with a player with team without player. In my opinion more useful than studying +/- when on ice, compared to +/- when on the bench.

I have also started studying how different players actually affect each others scoring stats. For example, how did Mario Lemieux benefit by playing with Kevin Stevens, and vice versa. I can find out by filtering out games where both played, or just one of them.
I found the weighted differential in team win% with or without a player in the lineup to be a great metric, because it combined simplicity with direct measurement of what we all agree is the most important hockey value (winning). The limitation is that for players who don't miss many games, the amount of data without them is very small, so the results are very unreliable.

The same limitation would apply to teammates who rarely played apart, at least in certain situations (ES, PP, SH).
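A minimal sketch of a with/without win% differential, in Python. The weighting is an assumption on my part for illustration: each season's differential is weighted by the number of games the player missed, so seasons with almost no "without" data count for little. The function name and input layout are hypothetical.

```python
def weighted_with_without(seasons):
    """Weighted difference in team win% with vs. without a player.

    seasons: list of (wins_with, games_with, wins_without, games_without).
    Each season is weighted by games_without (an assumed weighting).
    Returns (differential, total_games_without), or (None, 0) if the
    player never missed a game in any season with games played.
    """
    weighted_sum = 0.0
    total_weight = 0
    for wins_with, games_with, wins_without, games_without in seasons:
        if games_with == 0 or games_without == 0:
            continue  # no comparison possible for this season
        diff = wins_with / games_with - wins_without / games_without
        weighted_sum += diff * games_without
        total_weight += games_without
    if total_weight == 0:
        return None, 0
    return weighted_sum / total_weight, total_weight
```

The second return value is there on purpose: a large differential built on a dozen missed games should be read very differently from one built on a hundred.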

Quote:
Originally Posted by plusandminus
See above (one piece of the puzzle, or rather several pieces).
I have to say I agree with some of the criticism you have received, but I suppose you basically do too. I basically agree with your replies to the replies you have gotten. You have started something good, that should be able to be improved and built upon.
Yes, I can see the potential for bias in certain areas, and I don't claim the results to be anything close to exact in magnitude. However, the general trend is clear to me, and the reasons for some of the broader and larger effects (whether over decades or from one season to the next) make sense to me.

I would break the period studied into 3 sub-periods:

'46 to '67: The constant number of teams is a positive. However, the inherently limited number of players included, combined with the generally shorter careers of players, results in larger uncertainty. The potential error is compounded when comparing across longer timespans, since, as Overpass points out, the multiplicative chain of seasonal factors is longer. The exact magnitude of the changes from the '50s to the post-WHA seasons is certainly up for debate, but there should be little doubt that it's become much, much tougher for top players.

The broad effect is clear to me. Talent quickly flows back into the league after WWII, and becomes compressed (quality depth) in the last few years before expansion.

Expansion to WHA Merger: At least from an analytical perspective, this is a decade or so of pure chaos. This makes it a dynamic, interesting and important time to examine. It is also one of the most difficult. The number of teams immediately doubles, with repeated expansions during the decade, while talent flows to the WHA. In addition, the result of the expansion is a glaring disparity from top to bottom. In an era of bullies and weaklings, along with a rapidly changing environment, it's a very challenging time to analyze properly.

WHA Merger to Present: The larger number of teams, more gradual expansions, and better availability of data make this the ideal period to examine. The main challenges are the large change in scoring from the '80s and early '90s to later years, and the large and (at least at first) somewhat disproportionate addition of talent from overseas.
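On the point about error compounding when seasonal factors are linked multiplicatively over long timespans, a small Python sketch may help. The error formula is a textbook approximation, not something taken from the study: it assumes each season-to-season factor carries the same small relative error and that those errors are independent.

```python
import math

def chained_factor(seasonal_factors):
    """Cumulative factor from linking consecutive season-to-season factors."""
    return math.prod(seasonal_factors)

def chained_relative_error(per_link_relative_error, n_links):
    """Approximate relative error of the chained factor.

    Assumes small, independent errors of equal size on every link, so the
    relative errors add in quadrature: error * sqrt(n_links).
    """
    return per_link_relative_error * math.sqrt(n_links)
```

Under those assumptions, a chain of 25 links with 2% error per link ends up with roughly 10% uncertainty, which is why comparisons across several decades deserve more caution than comparisons between neighbouring seasons. If the errors are correlated rather than independent, the compounding can be worse.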

Quote:
Originally Posted by plusandminus
Regarding adjusted points (or goals), I think one needs to understand and keep in mind how the most common methods work. We first normalize scoring to say 6 goals per game, to make different seasons comparable.
I can't find the words properly now, but I think it's valuable to understand what we're normally doing. We have a set number of "total goals" and what the common methods does is to tell how much different players stand out compared to some sort of league average. How much they stand out depends in things like:
* How many teams were there in the league? The more teams, the more spread out quality, and the more easy it may be for the top scorers to stand out compared to their average teammate.
* What was the strength of era? Again, the higher quality per team, the more hard to stand out.
I'm very tired now, and can't think very straight, but just wanted to point out that traditional adjusted scoring has a lot to do with percentages. It's "team GF divided by league average GF" multiplied by "player's pts divided by team's GF". Or just "player's pts" divided by "league average GF".
Yes, you are basically talking about what I would term "talent compression" and "talent dilution." It is a crucial part of adjusting the data properly and one of the primary factors intended to be measured in this study. In times when talent is diluted (after war, large expansion, defection to WHA) it is easier to stand out in relation to one's peers (whether in simple adjusted data or seasonal/period rankings). In times when talent is compressed (long periods without expansion, WHA merger, influx of talent from overseas) it is more difficult to stand out.
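The arithmetic described in the quote can be written out as a short Python sketch. The 6 goals per game baseline comes from the quote; the function names are hypothetical, and real adjusted-points systems typically add further steps (schedule length, roster size) that are not shown here.

```python
def adjusted_points(player_points, league_goals_per_game, baseline_goals_per_game=6.0):
    """Scale raw points to a league environment of `baseline_goals_per_game`."""
    return player_points * baseline_goals_per_game / league_goals_per_game

def share_of_league_average(player_points, team_gf, league_avg_team_gf):
    """(team GF / league average GF) * (player points / team GF).

    The team GF terms cancel, so this equals
    player_points / league_avg_team_gf.
    """
    return (team_gf / league_avg_team_gf) * (player_points / team_gf)
```

The cancellation in the second function is the point of the quoted remark: the simple method measures a player against the league average only, so it says nothing by itself about how diluted or compressed the talent behind that average was.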

Quote:
Originally Posted by plusandminus
I think you're among the better/best ones here.
Thanks. I have been happy with a lot of the work I have done and have advanced my own knowledge in the process. I hope it's been of some interest and use to others as well, and it seems to have been to at least a few.

However, I know that I lack the knowledge of advanced statistics, any programming skills, and the computational resources that some others have. I try to compensate for this with rigorous logic in my methodology, genuine interest in the subject, and a natural aptitude for math.

Quote:
Originally Posted by plusandminus
Thanks. I think people understand me. It's rather that I need to express myself in simple, perhaps childlike, school English, and I suspect that may affect the way I'm being perceived(?) here by some.
Perhaps you can't fully express yourself in English, but you are able to communicate clearly both through language and your analysis of data. It is often very difficult to present one's results in a form that is easy to understand even to others with an interest and aptitude in such things, let alone the "common fan." Hence the questions and misunderstandings that seem to often arise (but which are also usually helpful and enlightening in some manner).

From our limited interaction and the work of yours which I have reviewed, one of the last adjectives I would use to describe how I perceive you is "simplistic".


Last edited by Czech Your Math: 04-29-2012 at 03:42 AM.