Why ELO discourages antistacking

Raveen
Posts: 9104
Joined: Wed Mar 16, 2005 8:00 am
Location: Birmingham, UK

Post by Raveen »

That's rather clever Pook, good job :)

It will still lead to a mathematical problem though won't it? Assuming people do bail (or have to leave or get dropped or whatever) then it shifts Alleg ELO away from being a 0 sum system. Is there any mechanism for measuring the mean ELO/player (which should stabilise at 1500 I think) and adding points back into the system in the event that the system is losing points?
Spidey: Can't think of a reason I'd need to know anything
Flower
Posts: 252
Joined: Mon Jun 26, 2006 7:00 am
Location: K-Pax

Post by Flower »

Raveen wrote (Nov 13 2006, 02:06 PM): That's rather clever Pook, good job :)

It will still lead to a mathematical problem though won't it? Assuming people do bail (or have to leave or get dropped or whatever) then it shifts Alleg ELO away from being a 0 sum system. Is there any mechanism for measuring the mean ELO/player (which should stabilise at 1500 I think) and adding points back into the system in the event that the system is losing points?
'Elo is zero-sum' is a myth anyway.
Elo ratings only acquire meaning by being measured against other players.
As you will only be measured against active players, you form a kind of subpool within the whole pool of players.
Inactive players, who are no longer part of the 'active' subpool, thus take their points and ratings with them, typically detracting from the mean skill present in the active subpool.

One measure that tries to prevent this is tweaking the K-values, which stabilise ratings by awarding higher-rated players fewer points for a win or loss than lower-rated players would get.
This alone already breaks the zero-sum requirement.
Luckily Elo can and commonly does operate with sufficient accuracy in non-zero-sum environments. The FIDE system, for example, shows slight inflation, which would be why Emanuel Lasker*, though lower rated than Garry Kasparov, is still considered the better chess player by many.
Of course it would be best if inflation were kept to a minimum, but trying to reduce it too stringently might cause deflation, which is far worse than minor inflation.

If I am not mistaken, this is the equivalent of the K-factor tweaking in AllegElo:

Code: Select all

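-- Apparently one column assignment from a larger UPDATE ... SET list (hence the trailing comma).
-- CASE takes the first matching branch and BETWEEN is inclusive,
-- so exactly 1500 falls in the x2 tier and exactly 2200 in the x1 tier.
ELORanking = CASE
                 WHEN ELORanking < 800 THEN ELORanking + (p.Adjustment * (p.AdjustedModifier * 4))
                 WHEN ELORanking BETWEEN 800 AND 1500 THEN ELORanking + (p.Adjustment * (p.AdjustedModifier * 2))
                 WHEN ELORanking BETWEEN 1500 AND 2200 THEN ELORanking + (p.Adjustment * (p.AdjustedModifier * 1))
                 WHEN ELORanking > 2200 THEN ELORanking + (p.Adjustment * (p.AdjustedModifier * .5))
             END,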

* His death in 1941 (yay Wikipedia) would explain why Mr. Lasker is no longer counted among the pool of active players these days.
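
For anyone who would rather read the tiering as ordinary code, here is a rough Python sketch of the CASE expression quoted in this post. It is only an illustration: `tier_multiplier` and `apply_adjustment` are made-up names, and `adjustment` / `adjusted_modifier` merely stand in for `p.Adjustment` and `p.AdjustedModifier`, whose actual computation is not shown in this thread.

```python
def tier_multiplier(rating: float) -> float:
    """Multiplier applied on top of the modifier, mirroring the CASE branches.

    SQL CASE takes the first matching branch and BETWEEN is inclusive,
    so a rating of exactly 1500 lands in the x2 tier and 2200 in the x1 tier.
    """
    if rating < 800:
        return 4.0
    if rating <= 1500:
        return 2.0
    if rating <= 2200:
        return 1.0
    return 0.5


def apply_adjustment(rating: float, adjustment: float, adjusted_modifier: float) -> float:
    """New rating = old rating + adjustment * (modifier * tier multiplier)."""
    return rating + adjustment * (adjusted_modifier * tier_multiplier(rating))


# Same raw adjustment of +15 at different rating levels.
# The modifier value of 1.0 is an assumption made purely for illustration.
for r in (700, 1500, 2000, 2300):
    print(r, "->", apply_adjustment(r, 15, 1.0))
```

With the same raw adjustment, a player under 800 moves eight times as far as a player over 2200, so the points one player gains need not match the points another loses. That is the sense in which the tiers already break the zero-sum property.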
Last edited by Flower on Mon Nov 13, 2006 2:25 pm, edited 1 time in total.
@RT: "We've never been whores, we are misunderstood RTists."
Greator_SST
Posts: 277
Joined: Sun Jul 27, 2003 7:00 am

Post by Greator_SST »

...in my opinion, since ELO team totals were introduced, stacking has dropped way off. Games stacked over 80/20 used to be commonplace. All the time in fact. Every evening I logged on was a ridiculous stack. Now, games are far more balanced. Granted, some of the worst offenders and encouragers seem not to be around as much, but when was the last time you saw an 88/12 stack in the prime time evening hours? Used to be pretty common. Not any more.

And my last thought is, why do you people care about your rank? It's not like you high rankers are uber heroes to the rest of us. Take Lindy Hop for example. Lindy is like a vet 4 or 5. Why do you think that is? It's 'cause he stacks virtually every game. No one in this community sees a vet 4 anymore and thinks any different. You get up to that level, and there's only one reason: you're constantly joining the stacked side.

So get over your rank. Rank is directly related to stacking/nonstacking, and it's nice to see a system that makes it perfectly clear to all. I couldn't care less whether my rank goes up or down. If a game is fun, that's all that matters.
...yea
Grim_Reaper_4u
Posts: 356
Joined: Wed Jul 30, 2003 7:00 am
Location: Netherlands

Post by Grim_Reaper_4u »

Raveen wrote (Nov 13 2006, 02:27 PM): If there is any evidence that stacking is now worse than it was pre-ELO then I'll concede the point but I strongly suspect that this is not the case.
It is not worse, but it is still there. The biggest problem is all those newb Vet1 players (or high intermediate players after 2 months ;) ). They screw up the results, and the team ELO calculation is frequently just plain wrong compared to eyeballing the teams. Currently there is a 1000% skill difference between Vet1 players (or better yet, between all players from rank int6 to vet3). ELO can't see the difference between a great vet and a newb vet. Since it screws up the team ELO calculation, it opens the door for "interesting" tactics, which I see just about every time I log in to Alleg:

- 3 hardcore vets (usually from 1 squad) + 4 random newbs
versus

- 7 (usually squadless) intermediate or newb Vet1 players

Guess who will win when team 1 has game ownership and can tweak settings to its advantage? Team 1 might even have a lot less ELO and as such would be the winner of lotsa ELO.

The fact that I got a Vet3 rank and outrank people like Noir, Cunnuk, Champy, X-Avenger and Raindog indicates that something is fishy about ELO. Long time vets can tell 90% of the time what team will win when they join the lobby and check the factions/teams/settings. If those vets decide to help the underdog team most of the time then their ELO will be too low compared to their actual skill. Same goes the other way.

Before you get started on how it will autobalance itself eventually, you might want to state that it might take a thousand games per player before he even approaches his true ELO value. Currently the players on the leaderboard all have over 100 games per person and it still looks like @#(! ;) Are we willing to wait until everyone has played 1000+ games (and Weed and Aarm are finally expert9) until we get decent team ELO calcs?
Pook
Posts: 1758
Joined: Tue Aug 13, 2002 7:00 am
Location: Texas, USA

Post by Pook »

Flower wrote (Nov 13 2006, 07:51 AM): post deleted until further data gathered.
Oops, it seems I believed Psychosis' hypothesis about 'full loss'.
Need to double check this.
Have a nice day :) -- Flower

Ok here is more data:

Code: Select all

-- LOSERS WHO DROP GET DEDUCTED FULL POINTS
UPDATE @Accounts
   SET AdjustedGameTime = dbo.ASGSGetPlayerMaxGameTime(@GameID, Member_ID)
 WHERE Winner = 0
   AND AdjustedGameTime >= 300
This seems to indicate quite clearly that the game time is adjusted to max for those who lost.

Pook, can you confirm that this part of the code is obsolete or unrelated to the issue?
(or that I was too hasty and quoted an unrelated part of the code :D, due to mental deficiencies on my side)
Flower, the dbo.ASGSGetPlayerMaxGameTime function returns the number of seconds from the time the player JOINED the game until the time the game ended.
Pook

Post by Pook »

Raveen wrote (Nov 13 2006, 08:06 AM): That's rather clever Pook, good job :)

It will still lead to a mathematical problem though won't it? Assuming people do bail (or have to leave or get dropped or whatever) then it shifts Alleg ELO away from being a 0 sum system. Is there any mechanism for measuring the mean ELO/player (which should stabilise at 1500 I think) and adding points back into the system in the event that the system is losing points?
ELO in Allegiance isn't a zero-sum system, because you're never guaranteed to have an equal number of players on both teams.

Even if a game is exactly balanced according to the rankings (i.e. each side has a .50 expected outcome) and everyone started and finished the game, it won't be zero sum if one side has more players than the other. In the above example the winning side would gain 15 ELO, the losing side would lose 15 ELO... but the number of players that actually get that adjustment is different. Team 1 may have 10 people, that's 150 points total into the system, but Team 2 may have 9 people - that's 135 points out of the system. Net adjustment isn't 0, but +15.
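
The arithmetic in that example can be written out as a tiny Python sketch. `net_points` is a made-up helper, and it assumes every winner receives the same gain and every loser the same loss, which is the simplified situation described here and not necessarily what the real stored procedure does.

```python
def net_points(winners: int, losers: int, gain: float, loss: float) -> float:
    """Total rating points entering (+) or leaving (-) the system for one game,
    assuming every winner gets +gain and every loser gets -loss."""
    return winners * gain - losers * loss


# 10 winners at +15 each versus 9 losers at -15 each.
print(net_points(10, 9, 15, 15))   # 150 in, 135 out
# The same game with the smaller team winning removes points instead.
print(net_points(9, 10, 15, 15))
# Only equal team sizes make the game zero-sum under these assumptions.
print(net_points(10, 10, 15, 15))
```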

I've examined some alternatives, for example a "point pool" so that it's zero-sum between teams and then the team members split the point pool... but that ended up having many more issues, such as player churn causing the individual point payout to be lowered.
Flower

Post by Flower »

Pook wrote (Nov 13 2006, 02:48 PM): Flower, the dbo.ASGSGetPlayerMaxGameTime function returns the number of seconds from the time the player JOINED the game until the time the game ended.
Whee! So it was just my mental deficiency =]
*hugs Pook* Thanks for the clarification.
And I have to agree that it is indeed quite an interesting feature :)

That would mean that this feature contributes far less to deflation than I initially assumed :D
(of course if the net Elo skill of the active players still deflates it would pose a (minor) problem nevertheless, but it would be better to tweak the K-factors rather than to allow people to bail without consequence :D)
Last edited by Flower on Mon Nov 13, 2006 3:48 pm, edited 1 time in total.
Flower

Post by Flower »

Grim_Reaper_4u wrote (Nov 13 2006, 02:31 PM): The fact that I got a Vet3 rank and outrank people like Noir, Cunnuk, Champy, X-Avenger and Raindog indicates that something is fishy about ELO.
Not exactly; it could be that you are just a more active player and as such have converged further.
Or it could mean that either you or they did not join balanced games (balanced according to the server, not according to Ranksum).
My bet would be that it tells us not about fishy Elo but rather about the lack of data and thus lacking convergence (as in fact you wrote further down).
Grim_Reaper_4u wrote (Nov 13 2006, 02:31 PM): Before you get started on how it will autobalance itself eventually, you might want to state that it might take a thousand games per player before he even approaches his true ELO value. Currently the players on the leaderboard all have over 100 games per person and it still looks like @#(! ;) Are we willing to wait until everyone has played 1000+ games (and Weed and Aarm are finally expert9) until we get decent team ELO calcs?
a) It is highly unlikely that Weedman or Aarmstrong indeed deserve an Elo rating of 3000+.
Surely they are among our top players and should as such have some of the top Elo scores in the community, but given that the old AllegElo1 used kills to modify a player's rank, it is to be expected that the true spread in ability is smaller now that this severe inflation factor has been eliminated.
It would be good to keep in mind, though, that even if the best players were to converge at 2200 Elo, this would not mean that they are only 1.5 times better than a Vet1 with 1500 Elo (2200/1500 = 1.47).
This is due to the property of Elo that only rating differences, not quotients, allow a sound skill comparison (assuming an already converged system, of course).

b) 1000 games seems to be an estimate that is a bit on the high side.
(Yeah, I know that I kind of made a similar estimate in some other thread.)
But far fewer games are required to reach a rough convergence rather than an accurate one.
300 games should prove sufficient. (Of course, that only measures a player in relation to the ranks of the current population, so it is to no avail if one player rushes ahead and plays 300 or even 3000 games. It is more or less required that the average active player completes this number of games as well. Without the convergence of the rest of the playerbase, it is unlikely that he would surpass the 2000 point threshold even if his true Elo, once all players have converged, were 3000.)
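
To illustrate the point under a) that rating differences matter and quotients do not, here is a small sketch using the textbook Elo expected-score formula, E = 1 / (1 + 10^((R_b - R_a) / 400)). Whether AllegElo uses exactly this formula and the 400-point scale is an assumption, and `expected_score` is just an illustrative name.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the textbook Elo logistic curve.

    Only the difference rating_b - rating_a enters the formula.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# 2200 vs 1500 and 1700 vs 1000 have different quotients (1.47 vs 1.70)
# but the same 700-point difference, so the expectation is identical.
print(round(expected_score(2200, 1500), 3))
print(round(expected_score(1700, 1000), 3))
# Equal ratings give an even expectation.
print(expected_score(1500, 1500))
```

Under these assumptions a 700-point gap corresponds to an expected score of roughly 98%, which is a far bigger edge than "1.47 times better" would suggest.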
Last edited by Flower on Mon Nov 13, 2006 3:56 pm, edited 1 time in total.
Grim_Reaper_4u

Post by Grim_Reaper_4u »

The problem is, Flower, that currently all vets fall into the Vet1 to Vet6 category. Even if that assessment of skill was accurate (which it isn't), the numbers are too close together. We really need the players spread out over more ranks to accurately reflect the difference in skills between players.

I think we all agree that Shiz belongs to the top players in alleg. Currently he holds the nr2 spot as Vet5. This is after 268 games. That puts him +4 points over a Vet1 player like Mitiebean with 208 games played.

Tell me again that you think that 2 mitiebeans (no offense ;) ) equal 1 Shiz and a random int4 player? I don't think so. Shiz and Aarm and whoever you might think very good should have much higher ranks (like +12 or something) to even approach their true value. Hence it will take at least 1000 games before they reach such ELO.

Possible solution: reduce points required for rank up (so current vet5=expert6 or something) and build more ranks at the top or number them higher (say until Vet15)
Raveen

Post by Raveen »

Grim_Reaper_4u wrote (Nov 13 2006, 05:28 PM): The problem is, Flower, that currently all vets fall into the Vet1 to Vet6 category. Even if that assessment of skill was accurate (which it isn't), the numbers are too close together. We really need the players spread out over more ranks to accurately reflect the difference in skills between players.
As has already been pointed out, with not enough data the spread of players won't be at its greatest yet. Wait longer, more spread.
Grim_Reaper_4u wrote (Nov 13 2006, 05:28 PM): I think we all agree that Shiz belongs to the top players in alleg. Currently he holds the nr2 spot as Vet5. This is after 268 games. That puts him +4 points over a Vet1 player like Mitiebean with 208 games played.
No, it puts him 400-499 points over him.
Grim_Reaper_4u wrote (Nov 13 2006, 05:28 PM): Tell me again that you think that 2 mitiebeans (no offense ;) ) equal 1 Shiz and a random int4 player? I don't think so. Shiz and Aarm and whoever you might think very good should have much higher ranks (like +12 or something) to even approach their true value. Hence it will take at least 1000 games before they reach such ELO.

Possible solution: reduce points required for rank up (so current vet5=expert6 or something) and build more ranks at the top or number them higher (say until Vet15)
How will that help? The underlying mathematics of the ELO system has nothing to do with people's ranks.