01-19-2013, 01:17 AM
Registered User
Eskimo44
Originally Posted by jbeck5
Well, removing one game shouldn't be an issue...unless that game is an anomaly. In Gagner's case, it is. It's an extremely rare occurrence. To get a better idea from a sample, you take every data point within a range...and exclude the extremely rare anomalies that stand out as not in the norm.

Knowing the above, it almost makes sense to leave out the Phoenix it has no effect on his play since arriving in Ottawa.

It's just simple math practices to find out what usually happens. Outliers pull the average one way or another...which is not giving you proper data to evaluate. It's swaying it.

I'll give you the simplest example I can.

You have someone taking 10 exams...on 9 exams, he gets 10% on each...but on one he gets everything right plus bonuses and gets 125%.

His average would be 21.5%...but it would be smarter for you to assume his next test will be 10%...therefore you take out the outlier to make a better measurement for the future. It's actually a very common practice.
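
As a rough sketch of that arithmetic in Python (the variable names are made up for illustration, and the median is thrown in only as a common alternative to dropping a score by hand):

```python
from statistics import mean, median

# Exam scores from the example above: nine exams at 10% and one at 125%.
scores = [10] * 9 + [125]

# Average with the outlier included.
mean_all = mean(scores)

# Average after dropping the single highest score.
mean_trimmed = mean(sorted(scores)[:-1])

# Middle value of the full set; one extreme score barely moves it.
median_all = median(scores)

print(mean_all, mean_trimmed, median_all)
```

The full average lands at 21.5, while both the trimmed average and the median sit at 10, which is the number you would actually expect on the next exam.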

Taking away one player's 8-point game gives a better idea of how that player regularly produces.

I also understand your side of the story that you can always hand pick things to include or exclude to make your point look better.

I think the right way to go about it would be to bring up both sets of data, one with outliers included...and one with outliers not included...and people can draw their own conclusions...don't trash either set of data, because both sets of data are useful.
No, it's unfair. You take the larger sample size. It's pretty clear that Gagner's production has been consistent season to season; you don't just make up arbitrary conditions on how good a single game (1/75th of the sample) is allowed to be. At the end of the day, one outlier game in a full season is called statistical variance, and when we look at his season-to-season scoring totals, that variance is allowable because it does not make the season itself an outlier over the much, much larger sample. If seasons are what's being compared, the relevant outliers are not defined by an insignificant portion of the sample but by the whole sample, that is, by full seasons. Sam Gagner's season was not an outlier, so why should he be punished for a case of statistical variation that still regressed to the mean over the whole season? The game is extreme enough to be an outlier in terms of game-to-game performance, but it was still part of a larger sample that was not an outlier, and the larger sample is the one being compared.

Also, did you consider that there may have been a stretch of games that was an outlier for the worse? In his first 13 games coming back from injury he only scored 2 points in limited ice time. Is that not an outlier by your definition? Or is it not an outlier because he's had other scoring slumps? If we say that, then you agree that the larger sample size is more relevant, as you'd have to compare data over a larger sample to know that scoring slumps are standard. Comparing data over a larger sample size gives more accurate results because it allows for statistical variation. It's not fair to discount data only when doing so suits your argument, and if you do discount it, it should go both ways.

Also, this creates an example of the sorites paradox: where do you draw the line? When is one game's production so good that it's an outlier so extreme it ought to be dismissed? What is the highest level of production that isn't an outlier? How did you decide on that number? Was it decided by unbiased methods, and is it therefore fair?
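
For what it's worth, one way to take the hand-picking out of it is to fix the cutoff rule before looking at the numbers and apply it to both ends. Below is a minimal sketch using Tukey's 1.5×IQR fences, which is a conventional rule of thumb and not something anyone in this thread has actually applied. The helper name `tukey_outliers` is hypothetical, and the data is the exam example from the quote rather than real game logs.

```python
from statistics import quantiles

def tukey_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    # The same rule is applied to both tails, so unusually bad results
    # are flagged just like unusually good ones.
    return [v for v in values if v < low or v > high]

# Exam scores from the quoted example: nine at 10% and one at 125%.
scores = [10] * 9 + [125]
flagged = tukey_outliers(scores)
print(flagged)
```

With nine identical scores the interquartile range is zero, so any score other than 10 gets flagged. That makes this a degenerate case, but it shows the point: the line is drawn by the rule, not by whoever is making the argument.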

But really, this is nonsense. Sam Gagner scored 8 points and it was a tremendous accomplishment. It's not a negative thing, nor is it something he's incapable of, so why would you punish him for it? It makes no sense. He did it; give credit where credit is due.
