HFBoards

Did some statistical programming

07-14-2005, 03:38 AM
  #16
King'sPawn
Enjoy the chaos
 
Join Date: Jul 2003
Posts: 8,332
Seems like there's a flaw. Isn't the percentage of picking based upon the number of balls that are removed?

For example, let's say your team is Detroit. You have 1 ball out of 48. That's a 2.08% chance of getting the #1 pick. Now let's say Toronto gets the #1 pick. That means Detroit would have a 2.13% (1/47) chance of getting the second overall pick. However, if Buffalo gets the 1st overall pick, Detroit would have a 2.22% (1/45) chance.

Now let's say Toronto and Philadelphia get #1 and #2. That would mean the Red Wings have a 2.17% (1/46) chance of getting the third overall pick. However, if Buffalo and New York get the top 2 picks, the Red Wings would have a 2.38% (1/42) chance of getting third overall.

If all four three-ball teams get picked first, that would give the Red Wings a 2.78% (1/36) chance of picking 5th, as opposed to a 2.27% (1/44) chance of picking 5th if four one-ball teams go first.

It's good to illustrate a point, but it's not exact.
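
The arithmetic above is easy to check. A minimal sketch in Python, assuming the 48-ball hopper described in this thread; the function name `one_ball_odds` is made up for illustration:

```python
from fractions import Fraction

TOTAL_BALLS = 48  # hopper size quoted in the thread

def one_ball_odds(balls_removed):
    """Chance that a one-ball team wins the next pick once `balls_removed`
    balls (belonging to teams already drafted) are out of play."""
    return Fraction(1, TOTAL_BALLS - balls_removed)

# Balls taken out by the teams drafted so far, for the cases in the post.
for label, removed in [
    ("first pick", 0),
    ("after a one-ball team", 1),
    ("after a three-ball team", 3),
    ("after two three-ball teams", 6),
    ("after all four three-ball teams", 12),
]:
    odds = one_ball_odds(removed)
    print(f"{label}: {odds} = {float(odds):.2%}")
```

The odds for the same pick swing with how many balls the earlier winners took with them, which is the dependence being described.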

07-14-2005, 07:27 AM
  #17
Potted Plant
Registered User
 
Join Date: May 2003
Location: Tuscaloosa, AL
Posts: 858
I took that into account in my program.

The x array represents all of the balls in the hopper (ball 49 is just a placeholder). They're initialized to different values to represent different "kinds" of draft possibilities.

Teams with 3 draft balls have their X values set to 5, 3, and 4. Five means that when that ball is picked, you have to register that one and the next two as having been picked. Three means you register that one, the previous one, and the next one as having been picked. Four means you register that one and the previous two. (5 then represents the "first" ball, 3 represents the "second", and 4 represents the "third".) I realize the notation is confusing, and if I had truly meant for it to be shared with others who would modify it for their own purposes or something like that, I would have gone back and rewritten it to be easier to follow.

To be complete, teams with two balls have their X values initialized to 1 and 2. One means you reset that ball and the next one. Two means you reset that ball and the previous one. One-ball teams have theirs set to 0, which means to just look at that ball.

I register that a team has been picked by changing all of its X-values to 6.

The variable "var" is the ball that is picked. After picking a ball, I go through and check whether it, or one of its companions, has been picked before. If it has, I just pick a new one.
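
One way to picture the bookkeeping, as a sketch in Python and not the original VB: the code-to-offset table restates the meanings given above, while the ball layout (three-ball teams first, then two-ball, then one-ball) and the names `build_hopper` and `mark_team` are assumptions made for illustration. The ball-49 placeholder is left out.

```python
PICKED = 6  # value written to every ball of a team once it has a pick

# X value -> positions of the team's balls relative to the drawn ball
COMPANIONS = {
    0: (0,),          # one-ball team
    1: (0, 1),        # first ball of a two-ball team
    2: (-1, 0),       # second ball of a two-ball team
    5: (0, 1, 2),     # first ball of a three-ball team
    3: (-1, 0, 1),    # second ball of a three-ball team
    4: (-2, -1, 0),   # third ball of a three-ball team
}

def build_hopper(three=4, two=10, one=16):
    """Return the X array; index 0 is unused so balls are numbered from 1."""
    return [None] + [5, 3, 4] * three + [1, 2] * two + [0] * one

def mark_team(x, var):
    """Register ball `var` and its companions as picked.
    Returns False (meaning: draw again) if the team was picked before."""
    if x[var] == PICKED:
        return False
    for offset in COMPANIONS[x[var]]:
        x[var + offset] = PICKED
    return True

x = build_hopper()
print(mark_team(x, 2), x[1:4])   # ball 2 is the "second" ball of the first team
print(mark_team(x, 1))           # a companion of ball 2, so this is a redraw
```

Because every ball of a picked team is overwritten with 6, one lookup on the drawn ball is enough to tell whether it or a companion was picked earlier.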


Last edited by Potted Plant: 07-14-2005 at 07:44 AM.
07-14-2005, 07:43 AM
  #18
Potted Plant
Registered User
 
Join Date: May 2003
Location: Tuscaloosa, AL
Posts: 858
Oh, I think I understand your question now. You're pointing out that your chance of getting a certain pick depends on exactly who is picked in front of you. Well, taking that into account was actually the primary goal of my simulation. That's why I had the program simulate the draft 200,000 times. That way, you start averaging out all the variation that comes from the differences you talk about.

My program does not break it down by those variables. It averages them out. If you want to figure out what happens at pick #2 assuming pick #1 goes to a certain team, it's pretty easy. There are 48 balls. If a 3-ball team gets #1, there are then 45 balls. The remaining 3-ball teams have a 3/45 chance of getting pick #2. The 2-ball teams have a 2/45 chance, and the 1-ball teams have a 1/45 chance. If a two-ball or a one-ball team gets #1, just add 1 or 2, respectively, to the denominators above, and you get the answer.

The program starts from time zero (now) and predicts your team's individual chances of getting whatever pick. Once the first ball is picked, my statistics won't help you anymore.

07-14-2005, 10:32 AM
  #19
MaV
Registered User
 
Join Date: Jun 2002
Posts: 479
Those numbers aren't really accurate yet though, are they? I mean, shouldn't the three-ball holders simply have a 3/48 chance at the first pick? That's 6.250%, and your number is slightly less. The same goes for a three-ball holder at the second pick: 9/48*3/45 + 20/48*3/46 + 16/48*3/47 is 6.095%, and again your simulation gives a slightly smaller chance. So maybe it needs to be run even more times to get the numbers right. It's a very complicated system anyway.
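
The two figures can be reproduced with exact fractions. A small sketch, assuming the 4/10/16 split of three-, two- and one-ball teams implied by the 9/48, 20/48 and 16/48 terms; the variable names are illustrative only:

```python
from fractions import Fraction

TOTAL = 48
# Balls held by the *other* teams, keyed by how many balls each such team has.
# The tracked three-ball team's own 3 balls are excluded (12 - 3 = 9).
OTHER_BALLS = {3: 9, 2: 20, 1: 16}

first = Fraction(3, TOTAL)

# Sum over who won pick #1: P(that kind of team wins #1) * P(we win #2 after that).
second = sum(
    Fraction(balls, TOTAL) * Fraction(3, TOTAL - size)
    for size, balls in OTHER_BALLS.items()
)

print(f"first pick:  {float(first):.3%}")   # 6.250%
print(f"second pick: {float(second):.3%}")  # 6.095%
```

Each term already assumes some other team took the first pick, so the sum is the full chance of landing exactly second.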

07-14-2005, 10:44 AM
  #20
MojoJojo
Registered User
 
Join Date: Jan 2003
Location: Philadelphia
Posts: 9,351
I'm not sure you did this right, though I am a C programmer and can only generally follow what you did. Basically the problem is that you need to account for the number of balls removed from the pool with each draft pick. In a multiball system this is difficult, because the number changes depending on who got what pick before the pick you are calculating, i.e., for the second pick you need to add up the probabilities that the first pick was taken by a one-, two- or three-ball team. For the third pick you need to add up the probabilities that between 2 and 6 balls were removed in the first two picks, etc.

07-14-2005, 12:09 PM
  #21
King'sPawn
Enjoy the chaos
 
Join Date: Jul 2003
Posts: 8,332
Thanks for clarifying, HRR. I misunderstood the premise of your program.

So basically you ran the program x amount of times, and you averaged out the results? Meaning in any given simulation, there was probably a time when all of the 1 balls were drafted before anyone else... but there was another time when all of the 1 ball teams were drafted last? Your program averaged it out like that?

If that's what you did, that makes perfect sense. You were aiming more for a general likelihood of each spot as opposed to giving exact numbers.

07-14-2005, 12:14 PM
  #22
NYRangers
Registered User
 
Join Date: Aug 2004
Posts: 2,853
The 4 "bad" teams have a 25% chance of getting the top pick.
The 10 "mediocre" teams have a 41.7% chance of getting the top pick.
The 16 "good" teams have a 33.3% chance of getting the top pick.

Crosby is probably going to a somewhat good team.

07-14-2005, 12:46 PM
  #23
Potted Plant
Registered User
 
Join Date: May 2003
Location: Tuscaloosa, AL
Posts: 858
Quote:
Originally Posted by MaV
Those numbers aren't really accurate yet though, are they? I mean, shouldn't the three-ball holders simply have a 3/48 chance at the first pick? That's 6.250%, and your number is slightly less. The same goes for a three-ball holder at the second pick: 9/48*3/45 + 20/48*3/46 + 16/48*3/47 is 6.095%, and again your simulation gives a slightly smaller chance. So maybe it needs to be run even more times to get the numbers right. It's a very complicated system anyway.
No, it's not perfect. The problem is that I only had the program run the draft 200,000 times. It's probably accurate to within 1-2%. It's not designed to give rock-solid numbers. It's designed to converge to the right answer, and it would take an infinite number of runs to get it perfect. I tried it with 1,000,000 drafts, but the computer I was on wouldn't run it. I'd get overflow errors.

To get better numbers, you can open up an Excel spreadsheet, go to the VB editor and copy my program into it. Change the 200,000 number to 10 million and see if your machine will run it. It will output the absolute number of times a certain 3-ball team got each pick (I followed balls 1, 2, and 3 for this purpose), the number of times a certain 2-ball team got each pick (I followed balls 13 and 14 for this), and the number of times a certain 1-ball team got each pick (ball 30, I think). Just divide each result by 10,000,000 and multiply by 100 to get percentages. The results will be more accurate than what I got.
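
How far off a run of a given size is likely to be can be estimated with the usual binomial standard error, sqrt(p(1-p)/N). A rough sketch in Python, assuming each simulated draft is an independent trial and using the 3/48 first-pick chance mentioned earlier in the thread; `relative_error` is a hypothetical helper, not part of the VB program:

```python
import math

def relative_error(p, runs):
    """One standard error of a simulated frequency, as a fraction of p."""
    return math.sqrt(p * (1 - p) / runs) / p

p_first = 3 / 48  # three-ball team, first pick
for runs in (200_000, 10_000_000):
    share = relative_error(p_first, runs)
    print(f"{runs:>10,} drafts: about {share:.2%} of the value")
```

For this probability, 200,000 drafts come out a little under 1% of the value per standard error. The error shrinks with the square root of the run count, so 50 times as many drafts buys roughly a 7-fold improvement.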

07-14-2005, 12:52 PM
  #24
Potted Plant
Registered User
 
Join Date: May 2003
Location: Tuscaloosa, AL
Posts: 858
Quote:
Originally Posted by MojoJojo
I'm not sure you did this right, though I am a C programmer and can only generally follow what you did. Basically the problem is that you need to account for the number of balls removed from the pool with each draft pick. In a multiball system this is difficult, because the number changes depending on who got what pick before the pick you are calculating, i.e., for the second pick you need to add up the probabilities that the first pick was taken by a one-, two- or three-ball team. For the third pick you need to add up the probabilities that between 2 and 6 balls were removed in the first two picks, etc.
It is done. What I did accounts for that. I assigned all balls to their teams. When one team's ball was picked, I registered all their other balls as picked as well. The mechanics of the program simulate it like this:

Starting with pick #1
1. Pick a ball
2. Check to see that the ball hasn't been picked already.
3. If it hasn't, register that ball as picked.
4. Register each of the team's other balls as picked as well.
5. Put the ball back in the bin
6. Check to see if this is one of the three teams I'm following.
7. If so, register where that team is picking in the order.
8. Go to the next pick and start again.

The same ball could easily be picked twice, but those are just discarded. The pick is also discarded and tried again if another of the same team's balls is picked.
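
The steps above can be sketched in a few lines of Python. This is not the original VB program: it maps each ball to a team index instead of using the X codes, and the team split (4 three-ball, 10 two-ball, 16 one-ball), the tracked team indices and the names `run_draft` and `tally` are assumptions for illustration.

```python
import random

TEAM_BALLS = [3] * 4 + [2] * 10 + [1] * 16      # balls per team, 48 in total
BALL_OWNER = [team for team, n in enumerate(TEAM_BALLS) for _ in range(n)]

def run_draft(rng):
    """Run one mock draft and return the teams in the order they pick."""
    picked = set()
    order = []
    while len(order) < len(TEAM_BALLS):
        ball = rng.randrange(len(BALL_OWNER))   # every ball stays in the bin
        team = BALL_OWNER[ball]
        if team in picked:                      # this ball or a companion came up before
            continue                            # discard the draw and pick again
        picked.add(team)                        # registers all of the team's balls at once
        order.append(team)
    return order

def tally(runs, tracked=(0, 4, 14), seed=None):
    """Count how often each tracked team lands in each draft slot.
    Teams 0, 4 and 14 hold 3, 2 and 1 balls under the layout above."""
    rng = random.Random(seed)
    counts = {team: [0] * len(TEAM_BALLS) for team in tracked}
    for _ in range(runs):
        for slot, team in enumerate(run_draft(rng)):
            if team in counts:
                counts[team][slot] += 1
    return counts

counts = tally(20_000)
print([round(100 * c / 20_000, 2) for c in counts[0][:5]])  # % for picks 1-5, three-ball team
```

Dividing each count by the number of runs gives the estimated chance of each slot, the same division described earlier for the VB output.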

It would be very hard to figure out with mathematical certainty what the chances are. It's easier this way, running a mock draft a couple hundred thousand times and just counting up the results.

07-14-2005, 01:39 PM
  #25
MojoJojo
Registered User
 
Join Date: Jan 2003
Location: Philadelphia
Posts: 9,351
Quote:
Originally Posted by MaV
The same goes for a three-ball holder at the second pick: 9/48*3/45 + 20/48*3/46 + 16/48*3/47 is 6.095%, and again your simulation gives a slightly smaller chance.
You also need to multiply that total by the odds that the team did not get the first pick, which is 1 minus the odds they got it.

07-14-2005, 01:47 PM
  #26
MojoJojo
Registered User
 
MojoJojo's Avatar
 
Join Date: Jan 2003
Location: Philadelphia
Posts: 9,351
Quote:
Originally Posted by HighlyRegardedRookie
It would be very hard to figure out with mathematical certainty what the chances are. It's easier this way, running a mock draft a couple hundred thousand times and just counting up the results.
OK, I see how it works now. That's probably the easiest way, even if it's not an explicit solution.
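
For what it's worth, an explicit answer looks reachable without listing every draft order, because only the number of teams of each size already drafted matters, not which ones. A sketch under assumptions: 4 three-ball, 10 two-ball and 16 one-ball teams (the split quoted in this thread), and redraws that amount to choosing among the remaining teams in proportion to their balls. `pick_distribution` is a hypothetical name, not anything from the original program.

```python
from collections import defaultdict
from fractions import Fraction

TEAMS = {3: 4, 2: 10, 1: 16}  # balls per team -> number of teams (assumed split)

def pick_distribution(weight):
    """Exact chance of each draft slot for one team holding `weight` balls."""
    others = dict(TEAMS)
    others[weight] -= 1                       # everyone except the tracked team
    # State: how many other 3-, 2- and 1-ball teams are gone while the
    # tracked team is still waiting. Value: probability of being in that state.
    states = {(0, 0, 0): Fraction(1)}
    result = []
    for _ in range(sum(TEAMS.values())):
        hit = Fraction(0)
        nxt = defaultdict(Fraction)
        for gone, prob in states.items():
            left = [others[size] - g for size, g in zip((3, 2, 1), gone)]
            balls = weight + 3 * left[0] + 2 * left[1] + left[2]
            hit += prob * Fraction(weight, balls)          # tracked team wins this slot
            for i, size in enumerate((3, 2, 1)):
                if left[i]:                                # some other team of this size wins
                    step = list(gone)
                    step[i] += 1
                    nxt[tuple(step)] += prob * Fraction(size * left[i], balls)
        result.append(hit)
        states = nxt
    return result

odds = pick_distribution(3)
print([f"{float(p):.3%}" for p in odds[:3]])
```

There are at most a few hundred states per pick, so this runs instantly and gives fractions that a simulation can be compared against.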

07-14-2005, 01:56 PM
  #27
ceber
Registered User
 
Join Date: Apr 2003
Location: Wyoming, MN
Country: United States
Posts: 3,500
http://hfboards.com/showpost.php?p=3...&postcount=196

I think those were the results of a billion-run simulation. Should be pretty close to theoretical, from what I understand.

07-14-2005, 02:03 PM
  #28
MojoJojo
Registered User
 
MojoJojo's Avatar
 
Join Date: Jan 2003
Location: Philadelphia
Posts: 9,351
One thing that's interesting is that all teams have roughly the same odds of getting the 17th pick.

07-14-2005, 02:41 PM
  #29
MaV
Registered User
 
Join Date: Jun 2002
Posts: 479
Quote:
Originally Posted by MojoJojo
You also need to multiply that total by the odds that the team did not get the first pick, which is 1 minus the odds they got it.
No no, no need for that. I mean, 9/48 is the chance that some other three-ball holder gets the 1st pick, 20/48 that a two-ball holder does, and 16/48 that a one-ball holder does. In each of those cases the team in question cannot have got the first pick, so that is already accounted for.

But anyway, from math back to the original question. The Rangers have only a ~54% chance of getting a top-10 pick. So it's not guaranteed.

07-14-2005, 04:34 PM
  #30
Patman
Registered User
 
Join Date: Feb 2004
Posts: 321
Quote:
Originally Posted by ceber
http://hfboards.com/showpost.php?p=3...&postcount=196

I think those were the results of a billion-run simulation. Should be pretty close to theoretical, from what I understand.
Yeah, and if news sources want to use those numbers (or want to see the full simulation) I have no problem with it. I am fully confident that those numbers are accurate to the digits listed, but after that it's sketchy. The standard error is at most 0.00158%, so 3 standard deviations in either direction is about 0.009% as a ballpark for the deviation in each team's estimate when done team by team. Realize that my calculations are further averaged, since teams with the same weights are binned together, so they are even more accurate than the 0.009% figure... but not too much more accurate.
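
The 0.00158% figure matches the worst-case binomial standard error if the run count really was one billion, as ceber recalls. A quick check in Python, with that run count as the stated assumption and `worst_case_standard_error` as an illustrative name:

```python
import math

def worst_case_standard_error(runs):
    """Largest possible standard error of a simulated frequency.
    p * (1 - p) peaks at p = 0.5, so no pick probability can do worse."""
    return math.sqrt(0.25 / runs)

se = worst_case_standard_error(1_000_000_000)  # assumed: one billion drafts
print(f"standard error:    {se:.5%}")          # 0.00158%
print(f"3 standard errors: {3 * se:.4%}")      # 0.0047%
```

Three standard errors each way is about 0.0047%, a band roughly 0.009% wide. The actual pick probabilities sit well below 50%, so the real errors are smaller than this bound.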

If the media does want to use this I'd just ask them to contact me at patrick.joyce@huskymail.uconn.edu . I admit this board gets rather close to public domain but I'd love to see my name in print or on TV :p

All times are GMT -5.