The Massively Gigantic Thread On Elo Comments

Allegiance discussion not belonging in another forum.
Grim_Reaper_4u
Posts: 356
Joined: Wed Jul 30, 2003 7:00 am
Location: Netherlands

Post by Grim_Reaper_4u »

ELO SUCKS :ninja:

without correct implementation of the following:

- only allow evenly commed games to start (the ELO comm modifier does not work: a vet backseat comm tells the (1) comm exactly what to do and the team gets a huge ELO win ;) ). It is still exploitable but better than nothing.

- ELO only counts for games that start balanced with balance "enforcing" enabled (ending a game with even ELO doesn't mean the critical first 15 minutes were even ;) )

- create a formula to prevent players of equal ELO (but vastly different skills) from creating unbalanced teams, e.g.:
a) assign the top 20 players in Alleg a special (hidden) tag which allows equal distribution of scout ho's, int ho's, sf ho's etc. across the teams. The fact that you have a similar ELO doesn't mean you have the same skill or the same worth for a team: Snack has the same ELO as DNOdjay but in picks Snack will be picked way before DNO. Vandal has the same ELO as Weed and Frag, now who would you like to have on your team? ;)
b) other solutions: take kills/minute or kills/game into account, as well as miner kills/game or base kills/game, and use them to distribute talent across teams more evenly than ELO can.

- Disable ELO for any game that starts with fewer than 10 players on each side (do you really want all these newb DMs or games that start 2 vs 2 but grow into a 10 vs 10 to count?) :D Results from these games screw up ELO reliability and advance newbs and small-game players way too fast up the ELO ladder. Do you really think that Jimmy's 5 vs 5 Giga Tac, mine drone, carrier, con spam should count towards ELO? ;)
MrChaos
Posts: 8352
Joined: Tue Mar 21, 2006 8:00 am

Post by MrChaos »

Before beginning I want to say CAPS are to emphasize a point, not to yell. Thanks, Bad, for discussing.
badpazzword wrote:QUOTE (badpazzword @ Jul 14 2006, 02:05 AM) Trueskill, at least in Microsoft's incarnation, is pretty much useless to Allegiance.
1) The variations of the rank are massive at the beginning and grow smaller and smaller. This means your performance as a (0) pretty much determines the final outcome even though you've lost your number.
Well, that is EXACTLY what you want. Smaller and smaller variation over time. This isn't an accumulated-points-based rank, it's a SKILL ranking. Variation equals uncertainty in ranking.
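To make the shrinking-variation point concrete, here is a minimal sketch of the two-player, no-draw TrueSkill update as published by Microsoft Research. It is an illustration only: it assumes the default starting values (mu = 25, sigma = 25/3, beta = 25/6), it leaves out the draw margin and the small per-game uncertainty increase, and it does not cover team games, which need the full factor-graph calculation. The function name `update_1v1` is made up for this example and has nothing to do with how ASGS would implement it.

```python
import math

BETA = 25.0 / 6.0  # performance noise, the published default (assumption for this sketch)


def _pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def update_1v1(winner, loser, beta=BETA):
    """Return updated (mu, sigma) for winner and loser of one game, draws ignored."""
    (mu_w, sig_w), (mu_l, sig_l) = winner, loser
    c = math.sqrt(2.0 * beta ** 2 + sig_w ** 2 + sig_l ** 2)
    t = (mu_w - mu_l) / c
    v = _pdf(t) / _cdf(t)   # larger when the result was a surprise
    w = v * (v + t)         # fraction of uncertainty removed
    new_w = (mu_w + sig_w ** 2 / c * v,
             sig_w * math.sqrt(1.0 - sig_w ** 2 / c ** 2 * w))
    new_l = (mu_l - sig_l ** 2 / c * v,
             sig_l * math.sqrt(1.0 - sig_l ** 2 / c ** 2 * w))
    return new_w, new_l


if __name__ == "__main__":
    fresh = (25.0, 25.0 / 3.0)   # brand new player, high uncertainty
    settled = (25.0, 1.0)        # hypothetical long-time player, low uncertainty
    print("fresh vs fresh:    ", update_1v1(fresh, fresh))
    print("settled vs settled:", update_1v1(settled, settled))
```

Under these assumptions the same win moves a fresh player's mean by several points and a settled player's mean by a small fraction of a point, which is the behaviour both sides of this argument are describing.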

QUOTE 2) Trueskills, unlike ELO, is strongly affected by the size of the game. That is, the bigger the teams, the slower the change. Quoting the site, you need 91 16v16 games to have your skill determined.[/quote]
Well, Elo is too, but in a different way. Also, Elo assigns POINTS, 32 max per game. If you get max points every game, never losing a single one, it takes 27 games just to lose the NewbTag.

You got to get the @#(! stacked out of you too in order to get max points I believe. Elo REWARDS rookie stacking.
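For reference, a small sketch of where the 32-point cap comes from, assuming the textbook Elo formula with K = 32 applied to average team ratings. The ratings below are hypothetical and Allegiance's actual implementation (comm modifier, NewbTag thresholds) may well differ.

```python
def expected_score(rating, opponent_rating):
    """Textbook Elo expectation: a value between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def elo_gain_for_win(rating, opponent_rating, k=32):
    """Points gained for a win; k = 32 is the per-game cap mentioned above."""
    return k * (1.0 - expected_score(rating, opponent_rating))


if __name__ == "__main__":
    # Hypothetical average team ratings, chosen only for illustration.
    for own, other in [(1500, 1500), (1100, 1500), (1900, 1500)]:
        print(own, "beats", other, "->", round(elo_gain_for_win(own, other), 1))
```

With this formula a win in an even game is worth 16 points, and the gain only approaches 32 when the winner was heavily outrated, which is why max points require being on the wrong end of a stack.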

TrueSkill uses player mean and skill uncertainty to assign rank. The reason for so many games is to achieve the 99.9% confidence.

If you win 27 games in a row your skill based rank will absolutely give us a better number.

QUOTE 3) Trueskill assumes every player gives the same contribution to the outcome. Which is wrong as it's the comm who mostly decides it.[/quote]

True, it assumes equal skill. Comm first among equals, as you contend. Debate aside, a separate Comm skill rating could be done. I'm leery but it might work. A leadership call.

Otherwise it uses a probability method to decide balance and after the game assigns changes INDIVIDUALLY to each player.

QUOTE 4) Trueskill doesn't compare the strength of the teams. That means, it's stacker's heaven.[/quote]
Eeeeer, you say right above that it does look at team skill, which is true. TrueSkill looks at team skill.

QUOTE What about separating player's and commander's elo? A player and a commander use different skills and have different impact on the outcome. So it wouldn't be absurd.

Badp[/quote]

answered

MrChaos
Ssssh
jgbaxter
Posts: 2181
Joined: Mon Apr 25, 2005 7:00 am

Post by jgbaxter »

mrchaos: you're fighting for your preferred system and just spreading mud on what you don't agree with, so discussion becomes difficult. :P
n.b. I may not see a forum post replied to me or a pm sent to me for weeks and weeks...
badpazzword
Posts: 3627
Joined: Thu Jan 12, 2006 8:00 am

Post by badpazzword »

Thanks to you, MrChaos :D

Back to the point.
ELO rewards newbie stacking? Provided a game counts at all, the bigger the stack, the smaller the reward for beating the newbies. And if 10 newbies can outrun 10 vets, then they damn well deserve the reward, don't they?

2) TrueSkill is easily exploitable. I get a nick running, I lose the number and learn the basics. I take care to chat as little as possible so that I won't be recognisable later. I get another nick on another computer, so that ASGS won't catch me. Or, if the accounts get linked, I ask Pook to unlink them. I restart from (0). The result? The second nick will have a far better score than I'd have had if I had kept my previous nick, since the first, decisive games will have gone much better. Especially if I go whore in the newbie servers... Try doing that with ELO.

3) TrueSkill cannot be used to compute the number, unless you want to tie sigma to it. But then it is not the outcome of a game that helps your number, but only how many players there were.

4) TrueSkill is a sort of prejudicial system. Once it decides what your score is, it will be hard to budge it without breaking, say, half of the ToS with multiple unlinked nicks. This is because once it decides you suck, 20 great games will hardly change the decision, because the sigma by then will be so small.

5) Moreover, TrueSkill won't give a damn whether I piloted a bomb run that blew up all the enemy bases in a row or flew carelessly and cluelessly around feeding their KB. It only gets the final outcome (quoted).

6) What's worse, TrueSkill does not mind whether there was a stack or not. 10 newbies beat 10 vets? TrueSkill just ignores that. ELO does not.

Imho, it is mostly the stack that decides the outcome of a game, because even games are rare. If an algorithm doesn't take care of the key factor, well, it is pretty much useless.

Badp
MrChaos
Posts: 8352
Joined: Tue Mar 21, 2006 8:00 am

Post by MrChaos »

Baxter

hmmmmmm
Please don't start with accusations.
It might seem like it to you but I'm not slinging mud.

My favoritism plays no part in looking at the two. I LIKE a bitwise approach where losers give data bits to winners. Tickles my nerd is all.

Please read the four things so we can discuss TrueSkills vs Elo.

Bad

1. Yes, they deserve rewarding big time. Elo does a poor job: only 32 points. If these amazing newbs do it 25 more times, they are all still tagged as 8s. Not so with TrueSkill.

2. Please understand this one, it's important. IT'S SKILL BASED. I cannot address the details, but if it's an issue you found, Pook will need to discuss it.

3. No, brother, it's based on your win/loss record and your opponents' skill level rating.

4. It depends. If you've got a Vandal-level win/loss record, 20 games may move you only a little. Even then, if you're winning against the stack, your skill level ranking moves. Again, ASGS details are Pook's domain.

5. Elo doesn't catch it either. It's not perfect just better.

6. It so does give a $#@! and reacts STRONGLY to stacking. If you repeatedly stack, then even when winning your ranking can DROP.

Please read about the four things, click around MSR's site. Follow the math. It will make sense that it's better.

I got no attachment to it other than it works better for addressing auto-balance and stacking. According to my reading and farting around.

The implementation can be handled any number of ways *shrug* thank goodness it ain't my call on the details

MrChaos
Ssssh
Elephanthead
Posts: 211
Joined: Sat Jul 05, 2003 7:00 am

Post by Elephanthead »

How about this,

#1 team paydays are based upon ELO ranking: the team with the lower ELO gets the higher payday. If you made interceptors faster for the team with lower ELO you would see some autobalancing happening.

#2 If a noobie and a vet are requesting a noobie shielded team at the same time, if the team is down in ELO only the vet can be accepted, or if up only the noobie.

#3 you jerks will find a way to ruin games no matter what poor Pook does, so send him pictures of Lima and jars of honey to help him make it through another day.
Freeza
Posts: 304
Joined: Sat Mar 05, 2005 8:00 am
Location: Northeast, U.S.A.

Post by Freeza »

I don't get the point of ranks. Most people ignore any ranking system after a month anyway, so why not put development time to better use?

ELO is meaningless, so what should be done? Scrap it and just do age ranks. Just have a leaderboard for player kills, kill ratio, base kills, or whatever else people want. If people want to see how 'good' they are and puff out their chest, they can.

Balancing games by ELO won't ever work, nor will any other system. People judge people better than computers can judge people.

The main problem today, in alleg, isn't balance; it's people listening to commanders. Depending on the situation that can be the player's and/or the commander's fault. So if you want a ranking system, do it FOR COMMANDERS and not the masses. That will be more useful, even though I still think players can and will judge commanders better than a computer ever will.
To Punish and Enslave...
Ksero
Posts: 87
Joined: Sun Feb 15, 2004 8:00 am
Location: Sweden

Post by Ksero »

Elephanthead wrote:QUOTE (Elephanthead @ Jul 14 2006, 05:11 PM) #1 team paydays are based upon ELO ranking, team with the lower ELO gets higher payday. If you made interceptors faster for the team with lower ELO you would see some autobalancing happening.
It's an interesting idea... but it'll be a nightmare to balance... Right now we have balancing issues between large and small games. If this were implemented, it would be "Newbie Belter Tac is overpowered vs. Stacked Bios Exp in big games", roughly doubling the number of possible problems. And it would be a core file format change. And... ;)
"Better than Light Booster 1"
Ksero
Posts: 87
Joined: Sun Feb 15, 2004 8:00 am
Location: Sweden

Post by Ksero »

Freeza wrote:QUOTE (Freeza @ Jul 14 2006, 05:29 PM) Balancing games by ELO won't ever work, nor will any other system. People judge people better than computers can judge people.
Yes... if we could hire a team of judges that could watch over all the games played and assign grades to all players, that would probably be more accurate. But we can't. And though our community is still small by online standards, it's still sufficiently large to make remembering the individual skills of all or even most of the active players unfeasible. And then you have the hider nicks... even more names to remember.
Computers, on the other hand, are good at number crunching. All I want is some help in estimating how good other players are. ELO might be far from perfect, but it's still better than nothing.
Last edited by Ksero on Fri Jul 14, 2006 5:02 pm, edited 1 time in total.
"Better than Light Booster 1"
Terralthra
Posts: 1748
Joined: Fri Nov 18, 2005 8:00 am
Location: San Francisco, CA, USA

Post by Terralthra »

Grim_Reaper_4u wrote:QUOTE (Grim_Reaper_4u @ Jul 14 2006, 06:23 PM) - create a formula to prevent players of equal ELO (but vastly different skills) to create unbalanced teams, eg:
a) assign top 20 players in Alleg a special (hidden) tag which allows equal distribution of scout ho's, int ho's, sf ho's etc. across the teams. The fact that you have a similar ELO doesn't mean you have the same skill or the same worth for a team : Snack has the same ELO as DNOdjay but in picks Snack will be picked way before DNO. Vandal has the same ELO as Weed and Frag, now who would you like to have on your team? ;)
Errr, the whole point of Elo and autobalancing and such is that eventually, those that have stacked themselves to a higher Elo will drop in rating as their performance in an even game doesn't match their rating...
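A rough sketch of that self-correction, assuming textbook Elo with K = 32, a fixed opponent rating, and averaging out luck by applying the expected result each game rather than a random win or loss. All numbers are hypothetical; `average_drift` is a name invented for this example.

```python
def expected_score(rating, opponent_rating):
    """Textbook Elo expectation: a value between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def average_drift(start_rating, true_strength, opponent_rating, games, k=32):
    """Follow the average rating when results depend on true_strength, not on the rating."""
    rating = start_rating
    history = [rating]
    # How often the player really wins against this opposition.
    real_win_chance = expected_score(true_strength, opponent_rating)
    for _ in range(games):
        # What the inflated rating claims the player should score.
        predicted = expected_score(rating, opponent_rating)
        rating += k * (real_win_chance - predicted)
        history.append(rating)
    return history


if __name__ == "__main__":
    # A player stacked up to 1700 whose real level is 1500, now playing even games.
    history = average_drift(1700, 1500, 1500, games=100)
    for n in (0, 10, 25, 50, 100):
        print(n, round(history[n]))
```

In this simplified model the inflated rating slides back toward the player's real level over a few dozen even games. How quickly that happens in practice depends on how often games are actually even, which is the point under dispute in this thread.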