Rating  Team (change)
1641    Sporting Kansas City (+9)
1620    Los Angeles Galaxy (+8)
1616    Houston Dynamo (0)
1615    Seattle Sounders FC (+8)
1574    Real Salt Lake (-9)
1564    New York Red Bulls (-5)
1540    San Jose Earthquakes (+5)
1527    Chicago Fire (0)
1502    Colorado Rapids (-8)
1483    FC Dallas (+10)
1464    DC United (+16)
1460    Columbus Crew (-12)
1454    Philadelphia Union (+12)
1443    Chivas USA (+15)
1436    Montreal Impact (-10)
1431    Portland Timbers (-8)
1424    Vancouver Whitecaps FC (0)
1374    New England Revolution (-16)
1332    Toronto FC (-15)

West Average: 1514.66 
East Average: 1486.8 
Very fittingly, the Chicago vs. Houston game, which had a 70-minute delay, was called at 66 minutes, and ended 1-1, had a 0-point effect on Chicago's and Houston's ratings. Not a lot of movement this week. Kansas City's predicted end-of-season point total is starting to rival the all-time high (70 pts, 1998 LA Galaxy). We'll have to see if they can maintain this form.
Team                     Pts   Pred
Sporting Kansas City      18   69.505
Houston Dynamo             7   61.626
Seattle Sounders FC       10   58.943
Real Salt Lake            15   57.973
Los Angeles Galaxy         6   55.616
New York Red Bulls        10   54.250
San Jose Earthquakes      13   53.680
Chicago Fire               5   49.039
Colorado Rapids            9   47.452
FC Dallas                 10   44.954
DC United                  8   44.578
Columbus Crew              6   42.676
Chivas USA                 9   40.984
Philadelphia Union         4   39.397
Vancouver Whitecaps FC     8   39.160
Montreal Impact            4   36.183
Portland Timbers           4   34.777
New England Revolution     6   34.362
Toronto FC                 0   24.710
How do you calculate the predicted point totals, given that the outcome value is based on a win=1 and a draw=1/2, rather than a win=3 points and a draw=1 point? For example, if my math is right, SKC's rating after winning at Vancouver is now 1655 and their expected outcome at Portland is 0.684. How do you convert that to an expected number of points? Should the methodology instead assign win=1 and draw=1/3?
Underlying motivation: If there were only wins and losses, then we could simply attribute a chance of 0.684 to a win and 0.316 to a loss (for example). However, we also have to take draws into consideration. You can't just have draws be a set probability, because a team with expected value 0.999 is going to have a lot less chance of a draw than a team with a 0.500 expected value.
So, with all that in mind, I created a handwavy "eyeballing it" formula that modeled those ideas the best I could. I also tried to get as close to the 2.75 points earned (by both clubs) per game average as possible (That is to say, through the first 16 seasons of MLS, 8993 points have been earned over 3267 games, or 8993/3267 = 2.75ish points per game). What I did was start out by assuming that two teams that go into a match with an exactly 50/50 chance of winning have an even chance of each outcome: 1/3 to win, 1/3 to draw, 1/3 to lose. I then created a curve that trended the 1/3 probability to 0 as the expected outcome went to 1.000 or 0.000 (so it's a symmetric curve). The formulas ended up looking like this:
Given the expected outcome of a game E[X], the chances of a draw/win/loss are:
Draw: 1/3 - ABS(E[X] - 0.5)^(LN(1/3)/LN(1/2))
Win: (1 - [Draw])*E[X]
Loss: (1 - [Draw])*(1 - E[X])
(or alternatively, Loss = 1 - [Draw] - [Win] since the probabilities add up to 1)
Where ABS is absolute value, LN is the natural logarithm, and [Draw] is the calculated probability of a draw on the "Draw:" line.
Obviously, this isn't a particularly scientific approach; it's just an estimation. But since no one has ever come up with a way to predict the outcome of a game accurately, I'm okay with it, and the end of season points seem to be coming out alright using this methodology.
So then, back to the SKC at Portland example: we calculate that the chance of a draw is 0.2651, the chance of a Portland win is 0.2324, and the chance of a SKC win is 0.5025. That means that Portland's expected number of points is 0.2324*3+0.2651 = 0.9622 and SKC's expected number of points is 0.5025*3+0.2651 = 1.7727.
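For anyone who wants to check these numbers, here is a minimal Python sketch of the draw/win/loss formulas as described above. The function names are my own, and the input E[X] = 0.684 is the rounded value from the question, so the last digit can differ slightly from the figures quoted.

```python
import math

# Exponent from the "Draw:" formula, LN(1/3)/LN(1/2), roughly 1.585.
DRAW_EXPONENT = math.log(1 / 3) / math.log(1 / 2)


def draw_probability(e):
    """Draw chance: 1/3 at e = 0.5, falling to 0 at e = 0 or e = 1."""
    return 1 / 3 - abs(e - 0.5) ** DRAW_EXPONENT


def outcome_probabilities(e):
    """Return (win, draw, loss) using the Win/Loss lines given above."""
    draw = draw_probability(e)
    win = (1 - draw) * e
    loss = (1 - draw) * (1 - e)
    return win, draw, loss


def expected_points(e):
    """Three points for a win, one for a draw."""
    win, draw, _ = outcome_probabilities(e)
    return 3 * win + draw


# SKC at Portland, from SKC's point of view and then from Portland's.
print(outcome_probabilities(0.684), expected_points(0.684))
print(outcome_probabilities(1 - 0.684), expected_points(1 - 0.684))
```

The two expected point totals always add up to 3 minus the draw probability, which is a quick sanity check on any single matchup.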
I figured that you were doing something like this but was interested in seeing the details. Thanks for sharing! I hope that you will accept some constructive criticism.
First, it seems to me that the baseline distribution of outcomes should be calibrated to the average number of points per game, or vice versa. If 1/3 of a league's games are draws, that translates to an average of (2/3)(3)+(1/3)(2)=2.67 points per game; and if a league averages 2.75 points per game, then 3-2.75=0.25=25% of its games are draws. So your methodology is inconsistent in this regard.
Furthermore, although 2.75 ppg apparently comes from complete historical data (I assume that you counted shootout wins as draws), over the three most recent full seasons (2009-2011), MLS has been awarding only 2.70 ppg, because 30% of its games have been draws. On the flip side, so far this year, just 8 of 55 MLS games (15%) have been draws, raising the average to 2.85 ppg.
Another issue with your approach is how you are calculating the probability of a win, P(W), once you establish the probability of a draw, P(D). Given that the value of a draw is 0.5 and the value of a win is 1.0 in the Elo system, it should be the case that the Elo expected outcome value E=0.5*P(D)+1.0*P(W). This means that P(W)=E-0.5*P(D), not P(W)=[1-P(D)]*E as you indicated above.
Finally, I am not a big fan of using natural logarithms in arbitrary equations, so I developed my own alternative formula for adjusting the draw probability using a simple parabola. If E is the Elo expected outcome value for Team A when it plays against Team B, and T is the assumed probability of a draw for equally matched teams, then for Team A:
P(D) = 4*T*E*(1-E)
P(W) = E-0.5*P(D) = E*[1-2*T*(1-E)]
P(L) = 1-P(D)-P(W) = (1-E)*(1-2*T*E)
Expected points E(P) = 3*P(W)+P(D) = E*[3-2*T*(1-E)]
For SKC at Portland (E=0.684) using my equations:
With T=1/3: P(D)=0.288, P(W)=0.540, P(L)=0.172, E(P)=1.907
With T=0.30: P(D)=0.259, P(W)=0.554, P(L)=0.186, E(P)=1.922
With T=0.25: P(D)=0.216, P(W)=0.576, P(L)=0.208, E(P)=1.943
With T=0.15: P(D)=0.130, P(W)=0.619, P(L)=0.251, E(P)=1.987
For SKC at Portland (E=0.684) using your P(D) equation and the corrected P(W) equation:
With T=1/3: P(D)=0.265, P(W)=0.551, P(L)=0.184, E(P)=1.919
With T=0.30: P(D)=0.247, P(W)=0.560, P(L)=0.193, E(P)=1.928
With T=0.25: P(D)=0.216, P(W)=0.576, P(L)=0.208, E(P)=1.943
With T=0.15: P(D)=0.140, P(W)=0.614, P(L)=0.246, E(P)=1.981
It is interesting to note that our two different P(D) equations give pretty similar results across the board (identical for T=0.25), and that the expected number of points is not terribly sensitive to the value of T. That being the case, and given the variations within the historical data, I am inclined to go with my P(D) equation and the "neutral" assumption of 1/3 wins, 1/3 draws, and 1/3 losses, i.e., an average of 2.67 points per game. It would be interesting to see what effect this has on your predicted points table.
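The parabolic alternative is easy to try out. Below is a minimal Python sketch of those four equations; the function and parameter names are mine, and E = 0.684 is the rounded SKC at Portland value used above.

```python
def parabolic_outcomes(e, t=1 / 3):
    """Return (P(W), P(D), P(L)) for Elo expected outcome e.

    t is the assumed draw probability for two equally matched teams.
    """
    p_draw = 4 * t * e * (1 - e)
    p_win = e - 0.5 * p_draw      # so that e = P(W) + 0.5 * P(D)
    p_loss = 1 - p_draw - p_win
    return p_win, p_draw, p_loss


def expected_points(e, t=1 / 3):
    """Three points for a win, one for a draw."""
    p_win, p_draw, _ = parabolic_outcomes(e, t)
    return 3 * p_win + p_draw


# Reproduce the SKC at Portland rows for several values of T.
for t in (1 / 3, 0.30, 0.25, 0.15):
    w, d, l = parabolic_outcomes(0.684, t)
    print(f"T={t:.2f}  P(D)={d:.3f}  P(W)={w:.3f}  P(L)={l:.3f}  "
          f"E(P)={expected_points(0.684, t):.3f}")
```

Because P(W) is defined as E - 0.5*P(D), the identity E = P(W) + 0.5*P(D) holds for every T, which is the consistency condition argued for above.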
Thoughts?
Very good post. Don't think I'm ignoring you, but I'm leaving town shortly and won't be back until Sunday. I'll give you a proper reply once I get back.
A couple more observations, since I am not getting very much work done today anyway.
Using my equations with T=1/3, the point at which a team is more likely to win than not (i.e., P(W)>=0.5) is when its Elo expected outcome E>=0.6514, corresponding to a difference in Elo ratings of 108.60 or greater after homefield advantage is incorporated. For T=0.25, which matches the historical data better and makes our P(D) equations identical, this point is E>=0.6180 (the golden ratio!) and a rating difference of 83.60 or greater after homefield advantage is incorporated.
Speaking of which, I just noticed that you came up with the adjustment factor of 90 for homefield advantage based on setting the Elo expected outcome equal to the percentage of points earned by home teams. This is not correct; instead, you should have calculated the average Elo outcome for home teams as (1*1612+0.5*808)/3267=0.6171. This translates to an adjustment factor of 82.89, rather than 90.67.
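Both candidate adjustment factors can be reproduced from the counts quoted above. This is only a sketch: the helper name is mine, and the total-points line assumes three points per decisive game and two per draw, which does reproduce the 8993 figure given earlier.

```python
import math


def rating_gap_for_expected(e):
    """Invert E = 1 / (1 + 10 ** (-gap / 400)) to get the rating gap."""
    return 400 * math.log10(e / (1 - e))


games, home_wins, draws = 3267, 1612, 808

# Average Elo outcome for home teams (win = 1, draw = 1/2).
avg_outcome = (1.0 * home_wins + 0.5 * draws) / games

# Share of all points earned by home teams (win = 3, draw = 1).
home_points = 3 * home_wins + draws
total_points = 3 * (games - draws) + 2 * draws
points_share = home_points / total_points

print(round(avg_outcome, 4), round(rating_gap_for_expected(avg_outcome), 2))
print(round(points_share, 4), round(rating_gap_for_expected(points_share), 2))
```

The first line corresponds to the outcome-based factor of about 83 and the second to the points-based factor of about 90.67.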
I am intrigued by the fact that homefield advantage appears to correspond almost exactly to the tipping point where a team is more likely to win than not. The fact that it occurs right at the golden ratio just adds to the aesthetic appeal of this discovery. It is simply too elegant to pass up, so now I would advocate using T=0.25 and a homefield adjustment factor of 84. Thus:
P(D) = E*(1-E)
P(W) = 0.5*E*(1+E)
P(L) = 1-0.5*E*(3-E)
E(P) = 0.5*E*(5+E)
Historically, for home teams in MLS, E(P)=5644/3267=1.728. For a rating difference of 90, E=0.6267, P(W)=0.5097, and E(P)=1.763; a bit too high. For a rating difference of 84, E=0.6186, P(W)=0.5006, and E(P)=1.738; pretty close! Any chance I can convince you to recalculate all of the ratings using 84 in place of 90?
I got busy last night and this afternoon and managed to generate my own ratings using a homefield advantage factor of 84. I started with the 2011 season, setting every team at 1500 at the beginning, but still got results at the end that were remarkably similar to yours. The only differences in the order of the teams were that Seattle/Houston and Colorado/Philadelphia traded places, while Dallas came out behind San Jose and Portland rather than ahead of them. The largest difference in the rating itself was 27 for Chivas.
One thing I noticed is that teams who made the playoffs but then lost at home were heavily penalized. Philadelphia, Dallas, and Colorado all saw their ratings lowered by more than 30, dropping them below teams that did not make the playoffs at all. I am not convinced that this is appropriate when setting the initial ratings for 2012, so I ran the numbers both ways. Here are the results through last Wednesday (04/18) with playoff games included:
1. SKC 1654 6. NYR 1554 11. CHV 1466 16. POR 1430
2. HOU 1621 7. SJE 1539 12. FCD 1465 17. VAN 1413
3. SEA 1610 8. CHI 1523 13. PHL 1464 18. NER 1385
4. LAG 1605 9. COL 1496 14. CLB 1453 19. TOR 1346
5. RSL 1560 10. DCU 1477 15. MON 1439
Here are the results ignoring playoff games:
1. SKC 1647 6. RSL 1554 11. FCD 1489 16. POR 1429
2. SEA 1624 7. SJE 1539 12. DCU 1479 17. VAN 1414
3. HOU 1576 8. CHI 1523 13. CLB 1466 18. NER 1383
4. LAG 1555 9. COL 1521 14. CHV 1463 19. TOR 1348
5. NYR 1555 10. PHL 1494 15. MON 1441
I also projected points using my equations above. Here are those results with playoff games included, which reflect an average of 2.80 ppg, consistent with the smaller percentage of draws so far in comparison with the historical data:
1. SKC 75.5 6. SJE 55.8 11. FCD 43.1 16. MON 35.4
2. HOU 65.7 7. NYR 55.1 12. CHV 43.0 17. POR 33.7
3. SEA 62.5 8. CHI 50.1 13. CLB 41.4 18. NER 33.2
4. RSL 58.9 9. COL 47.9 14. PHL 39.9 19. TOR 23.1
5. LAG 57.6 10. DCU 45.3 15. VAN 35.9
Ignoring playoff games, the average drops slightly to 2.79 ppg:
1. SKC 74.9 6. NYR 55.5 11. DCU 45.3 16. MON 35.9
2. SEA 64.2 7. LAG 51.3 12. PHL 43.6 17. POR 33.4
3. HOU 60.4 8. COL 51.3 13. CLB 43.0 18. NER 32.7
4. RSL 58.1 9. CHI 50.2 14. CHV 42.4 19. TOR 23.1
5. SJE 55.6 10. FCD 45.8 15. VAN 35.9
I have convinced myself that your approach of using weighting factors of 30 for regular-season games and the square root of the goal differential is reasonable. If Team A and Team B have equal ratings initially, and Team A beats Team B by a single goal three straight times, that would make the difference between their ratings equal to 84, i.e., Team A would then have a 50% probability of beating Team B in a fourth match. If Team A beats Team B by two goals twice in a row, that would result in a rating gap of 80, close enough to 84. There is no rigorous analysis here, just a sense that these relationships seem about right.
I look forward to your feedback when you get a chance to digest all of this. Thanks for turning me on to Elo ratings!
=============================================
Outcome of the Game:
=============================================
The outcome of the game in the original Elo formula (Win = 1, Draw = 1/2, Loss = 0) is NOT points based, nor should it be. It is simply a result indicator. Recall that an expected value of 0.500 means there is a 50/50 split on which team is going to win. An expected value of 0.333, however, means the odds are 2:1 against you, not that you are expected to draw. The outcome of the game MUST mirror this concept. Recall that the formula for creating a new rating is Rnew = Rold + K (O - E). Despite the fact that a win is worth thrice as many points as a draw, a draw is a result that means the teams were "equal" or 50/50. That coincides exactly with an expected outcome of 0.500 or 50/50. Remember that an expected value of 0.500 does NOT mean you are expected to get half the points. So the concept here is that an expected value of 0.500 means that both teams have an equal shot, a result of a draw means the teams got an equal result, and (O - E) = (0.500 - 0.500) = 0 means there will be no change to the ratings at all. This should be expected. If two teams that are "equal" get an equal result, it doesn't make sense to adjust ratings up or down; they are already equal. If you change a draw to 1/3, that doesn't happen anymore and it no longer conceptually makes any sense. Another flaw with using 1/3 is discovered in application. Example:
Home: Team A (Rating: 1600)
Away: Team B (Rating: 1400)
Here we figure Team A's expected value to be 0.84. If a draw is 1/3 instead of 1/2, we then calculate the amount of change for Team A as 30 * (0.333 - 0.84) = -15.21 and Team B as 30 * (0.333 - 0.16) = 5.19. THIS IS INCORRECT. The entire premise of an Elo Rating system is that it assumes a normal distribution with mean 1500. Changing each team's rating by unequal amounts undermines your fundamental assumption, since subtracting more from Team A than you are putting in for Team B will decrease your overall average. The very important concept here is that Team A and Team B MUST go up/down by the same amount. The only way to do that is to use an outcome of win = 1, draw = 1/2, loss = 0. Again, it is important not to look at the outcome factor as points based, but as an indicator that both teams got an equal result. Even if the league awarded 100 points for a win, 1 point for a draw, and 0 points for a loss, we would have to use draw = 1/2 here. This is the only part in my calculations where I use 1/2 for a draw, and it is done out of necessity.
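The zero-sum point can be checked with a few lines of Python. This is a sketch under the assumptions used in this thread (K = 30, a homefield adjustment of 90, and a drawn match); the function names are mine, and the unrounded expected value of about 0.8415 gives slightly different last digits than the 0.84 used above.

```python
def expected_home(home, away, hfa=90):
    """Elo expected outcome for the home team, with homefield adjustment."""
    return 1 / (1 + 10 ** ((away - home - hfa) / 400))


def draw_changes(home, away, draw_value, k=30, hfa=90):
    """Rating changes (home, away) when the match ends in a draw."""
    e_home = expected_home(home, away, hfa)
    e_away = 1 - e_home
    return k * (draw_value - e_home), k * (draw_value - e_away)


# Draw scored as 1/2: the two changes cancel out exactly.
print(draw_changes(1600, 1400, 0.5))

# Draw scored as 1/3: the changes no longer cancel, so the
# league-wide average rating drifts downward with every draw.
print(draw_changes(1600, 1400, 1 / 3))
```

With a draw value of 1/3 the two changes sum to 30 * (2/3 - 1) = -10 regardless of the ratings, which is the leak described above.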
=============================================
Usage of 2.75 PPG:
=============================================
You could look at the data X different ways and come up with X different estimates; eventually you have to land on a number. My belief is that more data is more credible, so I use as much data as I can. As you yourself indicated, the average from 2009-2011 was 2.70, and this year has seen 2.85 so far. My estimate falls in between those other estimates quite nicely. It's also worth mentioning that 2.70 is less than a 2% deviation from 2.75, which is really good. And again, the probability of a win/draw/loss is a rough estimate, so there is really no need to kill ourselves on the accuracy.
You mentioned that my methodology was flawed for predicting the end of season points because 2.75 assumes 25% of games are drawn and that 2.67 assumes 33% of games are drawn. Recall that the assumption is that if your expected value is 1.00, you have a 100% chance of winning, and a 0% chance of drawing or losing. So, for an expected value of 1.00, you are getting 3 PPG. If E[X] = 0.900, then that implies 2.901 PPG. E[X] = 0.800 implies 2.815 PPG. This continues all the way down to E[X] = 0.5 implying 2.67 PPG. So, the assumed PPG is not constant across all expected values. That means that the overall PPG average is somewhere between 2.67 (the lowest point) and 3.00 (the highest point). The reason I picked that funky logarithm constant was because it weighted the overall PPG average to about 2.75.
So the equation I selected satisfied three things: (1) the assumption that E[X] = 1.00 implies a 100% chance to win, and 0% chance to draw or lose, (similarly for E[X] = 0.00), (2) the assumption that E[X] = 0.500 gives an equal possibility of all three results, and (3) the overall PPG average is the most credibly estimated 2.75.
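The PPG figures quoted for individual expected values follow directly from the draw curve, since a decisive game hands out 3 points in total and a draw hands out 2. A minimal sketch, with names of my own choosing:

```python
import math

DRAW_EXPONENT = math.log(1 / 3) / math.log(1 / 2)  # roughly 1.585


def draw_probability(e):
    """Draw chance from the logarithm-based curve discussed above."""
    return 1 / 3 - abs(e - 0.5) ** DRAW_EXPONENT


def implied_ppg(e):
    """Points handed out per game: 3 if decisive, 2 if drawn."""
    return 3 - draw_probability(e)


for e in (1.0, 0.9, 0.8, 0.5):
    print(e, round(implied_ppg(e), 3))
```

The curve is symmetric around 0.5, so an expected value of 0.1 implies the same PPG as 0.9.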
=============================================
Homefield Advantage Factor:
=============================================
Under "Outcome of the Game" we already discussed how win = 1, draw = 1/2 and loss = 0 is only appropriate for the outcome result factor. Here, it's just as important to use points weighted results. The actual advantage of a home team over an away team is the ability to acquire more points than they do.
So then I get a home field advantage factor of 90.67, which I round to 90. World Football Elo Ratings (which is the most well known soccer rater online) uses a HFA of 100. Other ratings I've seen have as low as 75. There is quite a bit of variability here, and that's because HFA is different in different leagues and because it really doesn't have too much of an effect on the outcome. Example:
Home: Team A (1500)
Away: Team B (1500)
Team A wins.
If HFA = 75, then the amount of change is 12.
If HFA = 90, then the amount of change is 11.
If HFA = 100, then the amount of change is 11.
So again, we get into a place where there is no need to kill ourselves on accuracy.
You also mentioned that you were intrigued that the HFA corresponded exactly to the tipping point of our expected value. That is by design. Our equation is 1 / (1 + 10 ^ (([Team B's rating] - [Team A's rating] - [HFA])/400)), so when Team A's rating plus HFA is exactly equal to Team B's rating, we get 1 / (1 + 10 ^ 0) = 0.500.
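The HFA sensitivity example above (two 1500-rated teams, home win) can be reproduced with a short Python sketch. It assumes K = 30 and a one-goal margin, so no goal-difference weighting applies; the function names are mine.

```python
def expected_home(home, away, hfa):
    """Expected outcome for the home team (Team A) with a homefield bonus."""
    return 1 / (1 + 10 ** ((away - home - hfa) / 400))


def home_win_change(home, away, hfa, k=30):
    """Rating points gained by the home team after a home win (O = 1)."""
    return k * (1 - expected_home(home, away, hfa))


for hfa in (75, 90, 100):
    print(hfa, round(home_win_change(1500, 1500, hfa)))

# The exponent is zero when the home rating plus HFA equals the away
# rating, which gives an expected outcome of exactly 0.5.
print(expected_home(1500, 1590, 90))
```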
=============================================
The Fall of Playoff Teams:
=============================================
Yeah, I completely agree with you here. I think it's kind of odd that this system allows a team that loses in the playoffs to finish below teams that didn't make the playoffs. I've seen other systems where the playoff weight factors were much higher, and I purposefully discounted them some to prevent such wild swings. It still happens, though, which is just a product of how Elo ratings work. Of course, as the first 13 seasons of MLS can attest, the top 8 teams don't always make the playoffs, either. Also, a team can be on a really hot streak and just miss the playoffs (e.g. Chicago last year) or a team can be on a really cold streak and just make the playoffs (e.g. Columbus last year). All of that leads me to believe that this isn't so much a problem with the Elo rating system as it is just an oddity.
=============================================
Conclusion:
=============================================
I hope I answered your questions concisely and adequately. If you have any follow up just let me know. I have a ridiculously busy week lined up, though, so it may be about a week before I can get back to you. Thanks for the correspondence and interest!
Thanks for the detailed responses. I should be ridiculously busy myself this week, but I am the kind of person who has trouble thinking about anything else once something like this captures my interest!
Regarding "Outcome of the Game": I now fully understand why a draw has to be assigned an outcome value of 0.5, so no issues there.
Regarding "Usage of 2.75 PPG": I also now realize my error in trying to equate the draw percentage for equivalent teams with the draw percentage for the league as a whole. The latter is obviously an average over the entire range of rating differences over the course of the season, which means that two equally rated teams must have a higher draw probability than that. The "neutral" assumption of 1/3 for each outcome satisfies this, so I am back on board with it, but I still find your P(D) equation rather cumbersome. I prefer simply to adjust my parabolic alternative to P(D)=4/3*E[X]*(1-E[X]). For the 2011 data, this translates to an average of 2.708 ppg, which is right between the actual value for last season of 2.654 ppg and the historical average of 2.753 ppg. While more data is indeed usually more credible than less data, it is also often the case that recent data is more credible than old data. MLS has come a long way since 1996!
Regarding "Homefield Advantage Factor": I agree that points are what ultimately matter, not Elo outcome values, but I still think that your method for producing H=90 was incorrect. You set the historical share of points earned by MLS home teams (0.6276) equal to the Elo equation for expected outcome. These are two different (albeit related) parameters; in fact, this is comparable to the mistake of assigning a draw an outcome value of 1/3, rather than 1/2. Instead, as I indicated previously, you should have set the historical average outcome value for MLS home teams (0.6171) equal to the Elo equation for expected outcome, yielding H=83. As you noted, Elo ratings themselves are not very sensitive to the value of H, but it does have a noticeable effect on win/draw/loss probability estimates for individual matchups.
Speaking of which, the "tipping point" that I mentioned was when P(W)=0.5, not when E[X]=0.5. What intrigued me was the fact that using P(D)=E[X]*(1-E[X]) and P(W)=E[X]-0.5*P(D) made this happen at E[X]=0.618, which is the (aesthetically pleasing) golden ratio and just happens to be very close to the historical average outcome value for MLS home teams (0.6171). However, using my corrected P(D)=4/3*E[X]*(1-E[X]) shifts it to 0.651, which is not especially interesting; likewise for your P(D) equation, which shifts it to E[X]=0.644.
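The three tipping points mentioned here can be found numerically. The sketch below uses a plain bisection search for the value of E[X] where P(W) = E[X] - 0.5*P(D) crosses 0.5; the function names and the choice of bisection are mine, not part of either method.

```python
import math

LOG_EXPONENT = math.log(3) / math.log(2)  # same as LN(1/3)/LN(1/2)


def draw_log_curve(e):
    """Logarithm-based draw curve: 1/3 at e = 0.5, 0 at e = 0 or 1."""
    return 1 / 3 - abs(e - 0.5) ** LOG_EXPONENT


def draw_parabola(e, t):
    """Parabolic draw curve with peak value t at e = 0.5."""
    return 4 * t * e * (1 - e)


def win_probability(e, draw_fn):
    return e - 0.5 * draw_fn(e)


def tipping_point(draw_fn, tol=1e-9):
    """Bisect on [0.5, 1] for the e at which P(W) reaches 0.5."""
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if win_probability(mid, draw_fn) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


quarter = tipping_point(lambda e: draw_parabola(e, 0.25))
third = tipping_point(lambda e: draw_parabola(e, 1 / 3))
log_curve = tipping_point(draw_log_curve)
print(round(quarter, 3), round(third, 3), round(log_curve, 3))
```

Bisection is safe here because P(W) increases steadily with E[X] on that interval for all three draw curves.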
This brings me to an important concern that you did not address. Putting aside our different P(D) equations, you said that you are using P(W)=[1-P(D)]*E[X]. Again, this is incorrect; by definition, the expected value of X is obtained by summing the products of the outcome probabilities and their respective outcome values. In other words, E[X]=1*P(W)+0.5*P(D)+0*P(L)=P(W)+0.5*P(D), so P(W)=E[X]-0.5*P(D).
Regarding "The Fall of Playoff Teams": After running through various scenarios, I have convinced myself that MLS playoff games should not be used when setting the starting point for a subsequent season; they effectively constitute a separate tournament to crown a champion. Furthermore, I have come across a couple of references suggesting that K should normally be established on the basis of the amount of data used to generate the ratings, rather than a (real or perceived) "weight" for one type of game vs. another. When there are more outcomes to generate the ratings, K should be smaller; and when there are fewer outcomes, K should be larger. I am leaning toward the position that one full season is the right amount; again, the most recent information should be the most pertinent.
I will stop there for now. I have been exploring a different way of calibrating Elo ratings that would give them a more intuitive meaning. What I have in mind is a system that directly relates a team's strength to the league average, which is set at 100. If you are familiar with sabermetrics, this is similar in concept (but not application) to OPS+ for hitters and ERA+ for pitchers, so I plan to call it Elo+. More details to come.
LN(1/3)/LN(1/2): I messed up in my explanation to you last time. I said that this constant was picked to get the average PPG to 2.75; that was not true. I assumed that when E[X]=0, P(D)=0, and that when E[X]=1/2, P(D)=1/3. I wanted a parabola to connect those two points, so I made the line the most simple form of a parabola, x^k, where x is the difference from E[X]=0.5 (our midpoint where our curve is coming to its peak) and k is equal to whatever constant solves the equation. So then we have P(D) = 0 = 1/3 - ABS(E[X] - 1/2)^k = 1/3 - ABS(0 - 1/2)^k = 1/3 - (1/2)^k => 1/3 = (1/2)^k => k = LN(1/3)/LN(1/2) = LN(3)/LN(2) = 1.585... . I left it in the form that I did to remind myself of how I found it. It was kind of "showing my work". I then modeled the distribution of E[X]s from every game over MLS's history as a normal with mean 0.5 and standard deviation 0.2. When I weighted the PPG to this normal distribution I found the average PPG was 2.734, which was remarkably similar to the 2.75 PPG the league was showing. It fit like a glove, so I used it.
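The weighted-average step can be approximated numerically. The sketch below is an assumption-laden reconstruction rather than the original calculation: it uses a simple midpoint sum, and it restricts the normal curve (mean 0.5, standard deviation 0.2) to the interval from 0 to 1 and renormalizes, since expected values outside that range are impossible. It lands at roughly 2.73, close to the 2.734 quoted above.

```python
import math

K_EXP = math.log(3) / math.log(2)  # solves 1/3 = (1/2) ** k, about 1.585


def draw_probability(e):
    return 1 / 3 - abs(e - 0.5) ** K_EXP


def normal_pdf(x, mean=0.5, sd=0.2):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def weighted_ppg(steps=10_000):
    """Average of 3 - P(D) over e in [0, 1], weighted by the normal curve."""
    width = 1 / steps
    total_weight = 0.0
    total_ppg = 0.0
    for i in range(steps):
        e = (i + 0.5) * width          # midpoint of each slice
        weight = normal_pdf(e)
        total_weight += weight
        total_ppg += weight * (3 - draw_probability(e))
    return total_ppg / total_weight


print(round(K_EXP, 3), round(weighted_ppg(), 3))
```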
Homefield Advantage Factor: The "O" in K * (O - E) really has nothing to do with a point value of a win, loss, or draw. It is simply an indicator value of the result that coincides with the expected value. The homefield advantage is NOT based on an indicator. It is based on the difference between the expected points a home team is expected to win versus the expected points an away team is expected to win. Notice we are talking about the expected POINTS, here, not expected VALUE. A win is worth three times as much as a draw, and we need to account for that difference in our rating, because every time a home team wins it is three times as good a result as a tie. That's part of the "advantage" of winning versus drawing. The 0.6276 I am coming up with is NOT an expected value. It is simply the ratio of points won by the home side to the total points won by both sides. That, in and of itself, is the "advantage" of the home team: the ability to win more points than your opponent. (If you use W = 1 and D = 1/2, you are explicitly assuming that a win is only worth twice as much as a draw, which is incorrect.) Then we take this 0.6276 (which, again, is not an expected value) and set it equal to the formula for expected value. Why do we do this? Because we want to figure out the homefield advantage factor that makes the expected value equal to the ratio of points the home team is able to win when we assume that the home team and away team are exactly equal. This factor then simulates an advantage that gets us close to the ratio of points the home team is able to win.
P(W)=[1-P(D)]*E[X]: Ah, okay, I see what you are saying here. Yes, I completely agree with you. E[X]=1*P(W)+0.5*P(D)+0*P(L) => P(W)=E[X]-0.5*P(D), so this formula should be used instead of the one I was using. Good work! I'll be using this formula (and the similar formula for P(L)) in my calculations in the future.
The Fall of Playoff Teams: Hm, interesting. Playoffs are the biggest games of the year, so I'm not particularly fond of discounting them altogether. The winner has to win four games straight, after all. Since Elo is basically a measure of recent run of form, those two ideas go hand in hand. My perception (YMMV) is simply that while some playoff teams may fall below non-playoff teams, there really isn't any reason to think that the teams that made the playoffs were better than those that didn't. Again, Elo rating measures recent form quite a bit, and recent form doesn't necessarily determine who makes the playoffs and who doesn't.
The reason I don't like K being a credibility weighting is: What do you do when you have a new team (Montreal) join an established league (MLS)? The problem is, K must be the same for both teams or else you fall into the problem of one team going up/down more than the other goes down/up (like we saw for D = 1/3). So what do you do when Los Angeles plays Montreal?
Our two different P(D) equations do not have much impact on the final results; yours just diminishes a bit faster as the difference in ratings between opponents increases. This is why you end up with a bit higher PPG. It is probably a matter of personal preference as much as anything.
I am still not convinced about the homefield advantage factor. As soon as you use the expected value equation, you are dealing with outcomes, not points. The points calculation comes later, after you determine the expected outcome value. Say Team A plays a home game against Team B, and both teams have the same Elo rating. Team A's expected points should be 0.6267 times the total number of expected points, right?
If we use H=90, then Team A's expected outcome (not points) is E[X]=0.6267. Applying your equation gives P(D)=0.2955, which leads to P(W)=E[X]-0.5*P(D)=0.6267-0.5*0.2955=0.4790 and P(L)=1-0.2955-0.4790=0.2255. The expected points are thus 0.2955*1+0.4790*3=1.7325 for Team A and 0.2955*1+0.2255*3=0.9720 for Team B; i.e., Team A's expected points are 1.7325/(1.7325+0.9720)=0.6406 times the total number of expected points, about 2.2% too high.
If we use H=83 as I advocated, then Team A's expected outcome is E[X]=0.6172, which closely matches the historical data (0.6171). We can then go on to calculate P(D)=0.2999, P(W)=0.4673, P(L)=0.2328, E[P]a=1.7018, and E[P]b=0.9983; i.e., Team A's expected points are 1.7018/(1.7018+0.9983)=0.6303 times the total number of expected points, which is only 0.6% higher than the target value of 0.6267. Again, this is not going to make a huge difference over hundreds or (especially) thousands of outcomes, but it is a matter of methodological consistency.
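The H=90 versus H=83 comparison can be rerun with a short Python sketch. It assumes two equally rated teams, the logarithm-based P(D) curve, and the corrected P(W) = E[X] - 0.5*P(D); the function names are mine.

```python
import math

K_EXP = math.log(3) / math.log(2)  # exponent of the logarithm-based curve


def expected_home(hfa):
    """Expected outcome for the home side when both ratings are equal."""
    return 1 / (1 + 10 ** (-hfa / 400))


def draw_probability(e):
    return 1 / 3 - abs(e - 0.5) ** K_EXP


def home_points_share(hfa):
    """Home team's share of the expected points in a single match."""
    e = expected_home(hfa)
    p_draw = draw_probability(e)
    p_win = e - 0.5 * p_draw
    p_loss = 1 - p_draw - p_win
    home = 3 * p_win + p_draw
    away = 3 * p_loss + p_draw
    return home / (home + away)


for hfa in (90, 83):
    print(hfa, round(expected_home(hfa), 4), round(home_points_share(hfa), 4))
```

Swapping in a different draw curve only requires replacing draw_probability, so the same check works for the parabolic version.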
It will be interesting to see how the corrected P(W) equation affects your points predictions.
I am fine with including playoffs for year-end "power rankings," but not when carrying the numbers over to the next season. For an expansion team, one solution is to use a larger K for all of its games and apply it to both teams in each case so that the process remains zero-sum. I have not tested this to see what global effect it might have; for one thing, teams that have already played Montreal would be disproportionately affected by that one outcome.
I am drafting a potential BigSoccer post to introduce my Elo+ concept and ratings. I may run it by you here first to see what you think of it. Stay tuned!
Your P(D) is more methodical. Mine was just eyeballing it, so I like yours more.
On the homefield factor, you are not trying to match data types. You are setting a factor so that the expected value is the same as the average points won at home. I don't know if I can explain it any more than I have, but setting W = 1 and D = 1/2 is incorrect here; those values are completely meaningless outside of the values we use for O in our Elo rating formula. This method is used in every Elo rating that incorporates draws.
The problem with using a larger K on expansion teams is that you'd EXPECT established teams to be able to defeat an expansion team, yet you would be awarding them more in doing so. For me, there is really no incentive to try to finagle with factors to incorporate the concept; an Elo rater "autocorrects" itself over time anyway, and you already have the necessary information to look at Montreal's rating and say "that number is a bit raw, it probably is not extremely accurate yet". If you were using an extremely limited amount of data, this might be a pressing issue, but since the data we are working with already has 16+ years of experience, there is really no need to try to account for newness; it's already hashed itself out.
My P(D) is not really more methodical, just simpler. Both of our equations fit the boundary conditions, so like I said, it becomes mostly a matter of preference.
We may just have to agree to disagree on homefield advantage. E[X] is always the expected outcome value, not the expected number of points, and therefore should be set to the average outcome value at home to find H, not the average number of points won at home. The expected number of points can only be calculated from E[X] after P(D) is established separately.
I agree that it is ultimately better to use a consistent K for all games, including those involving expansion teams. Like you said, the system is self-correcting over time anyway.
I sent an email to the address that you provided at the end of your methodology paper. As soon as you reply to that, I will forward my ELO+ paper, which I believe is finally finished. I am really pleased with how it turned out; ELO+ ratings actually mean something tangible, which, at least for me, is not the case with conventional Elo ratings. I am eager to see what you think.
One more thing on homefield advantage: As my example above showed, using my value of H more accurately matches the targeted share of points than yours does!
The same is true, almost exactly, using my P(D) equation, rather than yours. For H=90, Team A's expected points are 1.724/(1.724+0.964)=0.6414 times the total number of expected points (2.3% too high). For H=83, Team A's expected points are 1.694/(1.694+0.991)=0.6309 times the total number of expected points (only 0.7% too high).
It bears repeating that this is ultimately not a big deal. If I have not convinced you by now, I probably never will; so I promise to stop trying!