05-09-2012, 07:01 PM
  #15
Dalton
I think anyone using or assuming a Gaussian distribution in their adjustments should take a look at the results of this large study, which included NHL players.

I'm not an expert in this so I would encourage interested readers to read the study rather than simply accept my summary.

http://onlinelibrary.wiley.com/doi/1...1.01239.x/full

I saw some criticism, but nothing that went deeper than a casual defense of the bell curve.

In short, the idea seems to be that a small percentage of elite performers, including outliers, has a disproportionately large impact on the average result. The rest fall below the mathematical average of the group.
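For anyone who wants to see the "rest fall below the average" point with numbers, here is a minimal sketch. It uses simulated scores, not the study's data or real NHL stats; the Pareto shape of 1.5 and the bell curve centred on 50 are arbitrary values I picked for illustration.

```python
import random
import statistics

def share_below_mean(values):
    """Fraction of values strictly below the arithmetic mean."""
    mean = statistics.fmean(values)
    return sum(v < mean for v in values) / len(values)

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 10_000

# Simulated "performance" scores (assumptions, not real data).
heavy_tailed = [rng.paretovariate(1.5) for _ in range(n)]  # power-law tail
bell_shaped = [rng.gauss(50, 10) for _ in range(n)]        # normal curve

print(f"heavy-tailed: {share_below_mean(heavy_tailed):.0%} below the mean")
print(f"bell-shaped:  {share_below_mean(bell_shaped):.0%} below the mean")
```

On a bell curve roughly half the group sits below the mean. On the heavy-tailed sample a clear majority does, because a handful of very large values pull the mean upward.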

At the other end of the scale, negative performances have the same kind of impact, and no, they don't balance out.

They show that a bell curve doesn't match the data without massaging the data first. The data aligns much more closely with a Paretian, or power law, curve (http://www.vigorinnovation.com/from-...-the-long-tail).

I suppose one could say that the outliers at both ends are the data. Eliminating or normalizing them distorts the data.

So when examining data for a season, say goals scored or save %, one must look to the outliers for explanation, since they are most responsible for the data. Adjusting them to make the curve smoother or more bell-like is wrong. That just ignores and distorts the most meaningful data.

It appears to support the 80-20 rule: 20% of salespeople are responsible for 80% of the sales. Within that top group the same rule applies again. Furthermore, this appears to hold within a team, a season, or a career.
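If you have raw totals, checking the 80-20 idea is a short calculation. This is a rough sketch; the function name `top_share` and the goal totals are made up for illustration and are not taken from the study.

```python
def top_share(values, fraction=0.2):
    """Share of the total contributed by the top `fraction` of values."""
    ranked = sorted(values, reverse=True)
    k = max(1, round(len(ranked) * fraction))  # always count at least one
    return sum(ranked[:k]) / sum(ranked)

# Made-up season goal totals for ten hypothetical players.
goals = [48, 32, 4, 3, 3, 3, 2, 2, 2, 1]
print(f"top 20% of players: {top_share(goals):.0%} of the goals")
```

To check whether the rule repeats inside the top group, call `top_share` again on just the top slice of a larger data set.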

Similarly, a small percentage of the group is responsible for most of the negative stats.

So a season with a few really bad goalies would greatly impact the overall data, just as a season with a few really good goalies would.

Normalizing this data creates the false impression that the group as a whole was better or worse than it really was. It would also understate the performance of the good goalies or overstate the performance of the bad goalies.
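Here is a small sketch of how much a couple of bad goalies can move a league-wide save %. The six-goalie league and every shot and save total below are invented for the example.

```python
# Made-up (shots_faced, saves) totals for a hypothetical six-goalie league.
goalies = {
    "A": (1800, 1665),
    "B": (1700, 1564),
    "C": (1600, 1464),
    "D": (1500, 1365),
    "E": (900, 783),   # weak season
    "F": (700, 602),   # weak season
}

def league_save_pct(lines):
    """League save % = total saves / total shots faced."""
    lines = list(lines)
    shots = sum(s for s, _ in lines)
    saves = sum(v for _, v in lines)
    return saves / shots

everyone = league_save_pct(goalies.values())
without_worst = league_save_pct(
    line for name, line in goalies.items() if name not in {"E", "F"}
)
print(f"all goalies:       {everyone:.3f}")
print(f"without E and F:   {without_worst:.3f}")
```

The gap between the two numbers comes entirely from two goalies, which is the kind of information that gets lost if those seasons are dropped or smoothed away.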

I did see a link on using Excel to calculate power curves, which may interest those of you who've amassed the raw data.
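For those who would rather script it than use a spreadsheet, a power curve y = c * x^b can be fitted with an ordinary straight-line fit on log-log axes, which as far as I know is also how Excel's power trendline works. This is a minimal sketch with made-up goal totals; `fit_power_curve` is a name I invented, and a log-log fit is only a rough description of the shape, not a rigorous test that the data follows a power law.

```python
import math

def fit_power_curve(xs, ys):
    """Least-squares fit of y = c * x**b on log-log axes; returns (c, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    # Slope of the straight line through the logged points is the exponent b.
    b = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    c = math.exp(my - b * mx)  # intercept, transformed back from log space
    return c, b

# Hypothetical input: players ranked 1..10 by season goals.
ranks = list(range(1, 11))
goals_by_rank = [60, 41, 33, 28, 25, 22, 20, 19, 17, 16]

c, b = fit_power_curve(ranks, goals_by_rank)
print(f"goals ~ {c:.1f} * rank^{b:.2f}")
```

All x and y values must be positive, since the fit takes logarithms of both.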

The first two paragraphs:

"We revisit a long-held assumption in human resource management, organizational behavior, and industrial and organizational psychology that individual performance follows a Gaussian (normal) distribution. We conducted 5 studies involving 198 samples including 633,263 researchers, entertainers, politicians, and amateur and professional athletes. Results are remarkably consistent across industries, types of jobs, types of performance measures, and time frames and indicate that individual performance is not normally distributed—instead, it follows a Paretian (power law) distribution. Assuming normality of individual performance can lead to misspecified theories and misleading practices. Thus, our results have implications for all theories and applications that directly or indirectly address the performance of individual workers including performance measurement and management, utility analysis in preemployment testing and training and development, personnel selection, leadership, and the prediction of performance, among others.

Research and practice in organizational behavior and human resource management (OBHRM), industrial and organizational (I-O) psychology, and other fields including strategic management and entrepreneurship ultimately build upon, directly or indirectly, the output of the individual worker. In fact, a central goal of OBHRM is to understand and predict the performance of individual workers. There is a long-held assumption in OBHRM that individual performance clusters around a mean and then fans out into symmetrical tails. That is, individual performance is assumed to follow a normal distribution (Hull, 1928; Schmidt & Hunter, 1983; Tiffin, 1947). When performance data do not conform to the normal distribution, then the conclusion is that the error “must” lie within the sample not the population. Subsequent adjustments are made (e.g., dropping outliers) in order to make the sample “better reflect” the “true” underlying normal curve. Gaussian distributions are in stark contrast to Paretian or power law distributions, which are typified by unstable means, infinite variance, and a greater proportion of extreme events. Figure 1 shows a Paretian distribution overlaid with a normal curve.
"
