ranking system

Catch-all for all development not having a specific forum.
Raveen
Posts: 9104
Joined: Wed Mar 16, 2005 8:00 am
Location: Birmingham, UK

Post by Raveen »

It'll mostly be server code I guess although there may be some client changes needed too (extra buttons for however the autobalance works and so on).

It'll be part of R5 ideally so there'll be other changes being pushed at the same time.
Spidey: Can't think of a reason I'd need to know anything
sgt_baker
Posts: 1510
Joined: Wed Oct 20, 2004 7:00 am
Location: London, UK.

Post by sgt_baker »

Grim_Reaper_4u wrote: QUOTE (Grim_Reaper_4u @ Jan 14 2008, 12:34 PM) So you are seriously considering using trueskill in its team based form?
Yes.

QUOTE OK then, I thought you might adapt the free-for-all version where you would use the points which the players earned in a game to determine their rank (maybe after modifying the points for win/loss or something)[/quote]

Please see the numerous posts made with regards to the in-game scoring system. Since Trueskill has nothing to do with points-based scoring, adapting the system would, in fact, mean inventing a whole new system.

QUOTE Let me give you my opinion on using trueskill in that form for Alleg and possible implications/ways of cheating. I'm gonna make it point based so it's easier for you to comment on each point. I don't have a problem with trueskill for use in free-for-alls or even small balanced games, however :

1) the way i see trueskill works (correct me if i'm wrong because i don't own a X-Box and can't find how the match making works online) : Peeps only get ranked on win/loss (given the typical 2 team alleg environment) but here's the big catch if i read the documentation correctly (I could be wrong though):

a) Teams consist of equally ranked players so a rank 32 could never join a game with only rank 12's ? (yeah that will work for 200.000 players but creating games only for similarly ranked players in alleg will be a little harder ;) )[/quote]

That's just an arbitrarily chosen method for enforcing balance, and for obvious reasons (as you've correctly stated) is impractical in Allegiance. Trueskill is able to track player skills regardless of the skill distribution within a team, and this is one of the most important features that sets Trueskill apart from other team-based ranking systems.

QUOTE b) Does MS have multiplayer games with 25 per side that use trueskill? Can you calculate how long it will take to get a reasonably accurate rank with 25 per side games when a 8vs8 environment takes 91 games according to MS? (and no: if i'm correct about only similarly ranked players being allowed to play we cannot use old stats)[/quote]

Yes, it's called Allegiance ;) The average team size in Allegiance is roughly ten players (calculated from the 40,000-game ASGS database), so the figure is quite close to the one you posted.

QUOTE 2) could you tell me if X-box live trueskill games usually have 80% of the team join after game start and how you think you can adress early/late game stack changes?[/quote]

Yes. Fractional game-times are taken into account throughout the system. This applies when calculating a team's overall skill and when updating player ranks.
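To make the idea of fractional game-times concrete, here is a minimal sketch of how a team's overall skill could be weighted by time played. It is an assumption for illustration, not the AllegSkill code: the weight `a = minutes_played / game_minutes`, the function name `team_skill`, and the default `BETA` value are all hypothetical choices.

```python
import math

BETA = 25.0 / 6.0  # assumed performance-noise constant (common TrueSkill default)


def team_skill(players, game_minutes, beta=BETA):
    """Return (mean, std dev) of a team's skill with partial-play weights.

    players: list of (mu, sigma, minutes_played) tuples.
    Each player's contribution is scaled by the fraction of the game played.
    """
    mean = 0.0
    var = 0.0
    for mu, sigma, minutes in players:
        # Fraction of the game this player was present for, clamped to [0, 1].
        a = max(0.0, min(1.0, minutes / game_minutes))
        mean += a * mu
        var += a * a * (sigma * sigma + beta * beta)
    return mean, math.sqrt(var)
```

Under this weighting, a player who joins for the last quarter of a game adds a quarter of their mean skill to the team total, so a late stack shifts the predicted balance less than a stack present from the start.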

QUOTE 3) in what way will trueskill be different from HELO in that it rewards people for choosing the right team even if they are skill-less ?(connected to the fact that players of vastly different skill play on the same team and with 25 players per side a few slackers who know which team to choose can easily choose a winning team and join them without contributing and yet without hurting that team's chances of succes)[/quote]

The whole idea of having a ranking system is to enable us to organise balanced games. The very definition of a balanced game is one where the outcome is difficult to predict. If a player is able to always join the winning side by means of pre-join prediction, it would imply that the game isn't balanced.

QUOTE 4) in what way wil trueskill reward people that always anti-stack and thus lose more than they should? i used to rank 19th in alleg and was ranked much higher than a @#(!load of players that are much better than me just because they anti-stacked a hell of lot more. a Win/loss system will never fix this without truely balanced teams IMHO[/quote]

Play with MS's trueskill calculator. You will notice that rank updates are proportional to how surprising the game outcome was. If a highly skilled team beats a newb team, the highly ranked team receives a very small rank increase. The converse is also true. The reward for anti-stacking is that if the anti-stack team wins they all receive a very large rank increase. If they lose, the rank penalty will be very small.
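For anyone without the calculator to hand, the effect can be sketched with the commonly published two-team TrueSkill update (no draws, no partial play). This is a simplified illustration rather than the code we will ship; the function names and the `BETA` default are assumptions.

```python
import math

BETA = 25.0 / 6.0  # assumed performance-noise constant (common TrueSkill default)


def pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def update(winners, losers, beta=BETA):
    """Two-team update without draws.

    winners, losers: lists of (mu, sigma). Returns (new_winners, new_losers).
    """
    everyone = winners + losers
    c2 = sum(s * s for _, s in everyone) + len(everyone) * beta * beta
    c = math.sqrt(c2)
    # t is large and positive when the favourites won, negative for an upset.
    t = (sum(m for m, _ in winners) - sum(m for m, _ in losers)) / c
    v = pdf(t) / cdf(t)   # size of the mean shift: grows with surprise
    w = v * (v + t)       # how much the uncertainty shrinks

    def adjust(team, sign):
        out = []
        for mu, sigma in team:
            s2 = sigma * sigma
            out.append((mu + sign * (s2 / c) * v,
                        math.sqrt(s2 * (1.0 - (s2 / c2) * w))))
        return out

    return adjust(winners, 1.0), adjust(losers, -1.0)
```

Calling `update([(30.0, 3.0)] * 4, [(20.0, 3.0)] * 4)` (the stack wins) moves everyone's mean by a tiny amount, while swapping the arguments (the underdogs win) moves the means by a far larger amount.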

QUOTE 5) How will you counter the famous "<5 minute drop doesn't hurt my rank" hellenus trick?[/quote]

Fractional game-times.

QUOTE 6) how will you deal with the fact that only a small% of the players play the whole game and that Game match% sometimes change dramatically during Alleg games because we don't restrict the rank of peeps that enter a game? (related to 1a)[/quote]

Fractional game-times.

QUOTE 7) will you enforce some kind of autobalance?[/quote]

Autobalance will be optional.

QUOTE Once you realize you have problems i can brain storm with you about possible solutions but if you just gonna play the "statistician without a clue for what the real world looks like" then I'd rather not waste time on that, i have to deal with mathematicians/statisticians who never think past/about the restrictions of their models daily and i don't feel like doing that in my spare time too ;)[/quote]

I'll take that comment in good faith.

QUOTE For me trueskill appears to work best in situations where similarly ranked players play 1vs1 or free-for-alls in short games where virtually all players play the full length of the game. It will probably work well in environments with smallish teams (<10) where the team is made up of similarly ranked players. From what i've seen trueskill will not be very good in a Alleg environment because Alleg is a bit too complex for such a "simple" win/loss system :)[/quote]

And you're basing this assertion on having spent five minutes reading the trueskill website? So much for having spent the last year engaged in empirical analysis. ;)

QUOTE And guys : the fact that I spent time on reading up on the trueskill stuff and bothered to post here means I wanna help you build a good ranking system, try not to be offended if I don't agree with some of the choices you guys have made. Just see me as the devils advocate who critically reviews your work and prevents too much groupthink (BTW I hope not all of you have a background in stats/math because that's a recipe for failure, get a few peeps from other backgrounds too, you'd be surprised that they might bring in fresh ideas and much needed common sense ;) )[/quote]

I'm afraid you're about a year late on this one. :)
Granary Sergeant Baker - Special Bread Service (Wurf - 13th Oct 2011)
Grim_Reaper_4u
Posts: 356
Joined: Wed Jul 30, 2003 7:00 am
Location: Netherlands

Post by Grim_Reaper_4u »

QUOTE(Grim_Reaper_4u @ Jan 14 2008, 12:34 PM)
So you are seriously considering using trueskill in its team based form?


Yes.


QUOTE
OK then, I thought you might adapt the free-for-all version where you would use the points which the players earned in a game to determine their rank (maybe after modifying the points for win/loss or something)


Please see the numerous posts made with regards to the in-game scoring system. Since Trueskill has nothing to do with points-based scoring, adapting the system would, in fact, mean inventing a whole new system.

Not really. IIRC trueskill in an 8 player FFA will take the players who scored 1-8 into account based on kills/lap time/etc. Essentially, within alleg you would rank the players by their points and could then let trueskill calculate their new rank (maybe multiply the winning team's score x2 like in BF2). This way top players would get rewarded much better than in team based trueskill even if they lost. You would however have to come up with a system to accurately rate a player's performance in 1 game.


QUOTE
Let me give you my opinion on using trueskill in that form for Alleg and possible implications/ways of cheating. I'm gonna make it point based so it's easier for you to comment on each point. I don't have a problem with trueskill for use in free-for-alls or even small balanced games, however :

1) the way i see trueskill works (correct me if i'm wrong because i don't own a X-Box and can't find how the match making works online) : Peeps only get ranked on win/loss (given the typical 2 team alleg environment) but here's the big catch if i read the documentation correctly (I could be wrong though):

a) Teams consist of equally ranked players so a rank 32 could never join a game with only rank 12's ? (yeah that will work for 200.000 players but creating games only for similarly ranked players in alleg will be a little harder )


That's just an arbitrarily chosen method for enforcing balance, and for obvious reasons (as you've correctly stated) is impractical in Allegiance. Trueskill is able to track player skills regardless of the skill distribution within a team, and this is one of the most important features that sets Trueskill apart from other team-based ranking systems.

Can you explain how that works, because the whole system as they explain it on the web looks awfully easy to fool (esp. with autobalance turned off ;) ). You do remember that it is better to gain a few points with stacking than to risk losing a few points by joining the underdog, right? ;) Did you consider that the whole system might rely on the fact that players from both teams are supposed to be of similar rank?


QUOTE
b) Does MS have multiplayer games with 25 per side that use trueskill? Can you calculate how long it will take to get a reasonably accurate rank with 25 per side games when a 8vs8 environment takes 91 games according to MS? (and no: if i'm correct about only similarly ranked players being allowed to play we cannot use old stats)


Yes, it's called Allegiance. The average team size in Allegiance is roughly ten players (calculated from the 40,000-game ASGS database), so the figure is quite close to the one you posted.

As someone who has commed and played a lot of those small games, let me tell you that probably 70% or more of them were unbalanced. You know just as well as I do that having a good vet in the early stages of the game beats having a good vet in the later stages of the game (when you have already lost your opening cons and miners and are turtled in your home ;) ). Hell, most of those games start as a 3 vs 3 and don't grow until several minutes later. :lol:


QUOTE
2) could you tell me if X-box live trueskill games usually have 80% of the team join after game start and how you think you can adress early/late game stack changes?


Yes. Fractional game-times are taken into account throughout the system. This applies when calculating a team's overall skill and when updating player ranks.

sorry but nowhere does it say that trueskill takes this into account, do you have a link that describes how this works?

QUOTE
3) in what way will trueskill be different from HELO in that it rewards people for choosing the right team even if they are skill-less ?(connected to the fact that players of vastly different skill play on the same team and with 25 players per side a few slackers who know which team to choose can easily choose a winning team and join them without contributing and yet without hurting that team's chances of succes)


The whole idea of having a ranking system is to enable us to organise balanced games. The very definition of a balanced game is one where the outcome is difficult to predict. If a player is able to always join the winning side by means of pre-join prediction, it would imply that the game isn't balanced.

well Duh, in those 10 vs 10 off prime games i can probably predict 80% of the time which team is gonna win if i watch from noat, even if HELo says the teams are balanced. You should always assume that people will want to stack and will prefer unbalanced games over balanced ones, the whole point of a ranking system is to force people to play balanced games. Your assumption that peeps want to play balanced games is flawed. If you allow unbalanced games to count then you will only duplicate the same problems which HELo currently has

QUOTE
4) in what way wil trueskill reward people that always anti-stack and thus lose more than they should? i used to rank 19th in alleg and was ranked much higher than a @#(!load of players that are much better than me just because they anti-stacked a hell of lot more. a Win/loss system will never fix this without truely balanced teams IMHO


Play with MS's trueskill calculator. You will notice that rank updates are proportional to how surprising the game outcome was. If a highly skilled team beats a newb team, the highly ranked team receives a very small rank increase. The converse is also true. The reward for anti-stacking is that if the anti-stack team wins they all receive a very large rank increase. If they lose, the rank penalty will be very small.

IIRC you aren't telling the whole story: if a person always stacks and rarely loses, his SD will be so low that he will lose almost no rank/points for losing once in a while, because the system will consider it a fluke. Hence you are better off stacking and winning (and occasionally losing to some lucky team) than you are anti-stacking and hoping for a win. If you run the calculator you will see that once you are an established stacker (someone who won his first, say, 10 games), it is very hard to start dropping rank substantially by occasionally losing (in a 4 vs 4 scenario).


QUOTE
5) How will you counter the famous "<5 minute drop doesn't hurt my rank" hellenus trick?


Fractional game-times.

again how are they implemented since MS doesn't use them i think


QUOTE
6) how will you deal with the fact that only a small% of the players play the whole game and that Game match% sometimes change dramatically during Alleg games because we don't restrict the rank of peeps that enter a game? (related to 1a)


Fractional game-times.

If only it were that easy. This reduces the impact a late-joining stack has on rank increase, but it doesn't reflect the true value which some players have even if they join for only 20 minutes out of a 2 hour game. There are times I'd trade 3/4 of my team for someone who could drive a htt or drop a tp2, and having such a person join often turns the tide of the game regardless of what trueskill thinks ;)


QUOTE
7) will you enforce some kind of autobalance?


Autobalance will be optional.

Bad idea!! This will again encourage stacks where peeps prefer a small point gain from a near-certain win over a balanced but uncertain larger point gain. It is bad enough that us vets can distinguish which team is better even with equal HELO, because we know not all peeps with the same rank are created equal ;) but you would compound the likelihood of unequal games by allowing autobalance to be turned off. Ideally you should fix it so players of the same rank get equally distributed over the teams.


QUOTE
Once you realize you have problems i can brain storm with you about possible solutions but if you just gonna play the "statistician without a clue for what the real world looks like" then I'd rather not waste time on that, i have to deal with mathematicians/statisticians who never think past/about the restrictions of their models daily and i don't feel like doing that in my spare time too


I'll take that comment in good faith.


QUOTE
For me trueskill appears to work best in situations where similarly ranked players play 1vs1 or free-for-alls in short games where virtually all players play the full length of the game. It will probably work well in environments with smallish teams (<10) where the team is made up of similarly ranked players. From what i've seen trueskill will not be very good in a Alleg environment because Alleg is a bit too complex for such a "simple" win/loss system


And you're basing this assertion on having spent five minutes reading the trueskill website? So much for having spent the last year engaged in empirical analysis.

You might want to consider that applying new rules to old data has its inherent problems. As soon as peeps know the new rules their behaviour might change, and thus the new data might be drastically different from the old ;)

QUOTE
And guys : the fact that I spent time on reading up on the trueskill stuff and bothered to post here means I wanna help you build a good ranking system, try not to be offended if I don't agree with some of the choices you guys have made. Just see me as the devils advocate who critically reviews your work and prevents too much groupthink (BTW I hope not all of you have a background in stats/math because that's a recipe for failure, get a few peeps from other backgrounds too, you'd be surprised that they might bring in fresh ideas and much needed common sense )


I'm afraid you're about a year late on this one.

And finally: according to MS it takes 2 Teams/4 Players per Team 10 consecutive game wins before they reach a new rank once their rank has settled with enough games (with ranks 1-50). Now wouldn't that mean you would need a @#(!load of consecutive wins to go up 1 rank in a 10vs10 environment (imagine how long it would take for those players who only play primetime 25vs25)? So if we are lucky and games are balanced and everyone loses 50% of his games (which is essentially what balance means, no?) then you will rank up like eh .... once a year after the initial rush to a mid level rank? :lol: ;)
sgt_baker
Posts: 1510
Joined: Wed Oct 20, 2004 7:00 am
Location: London, UK.

Post by sgt_baker »

QUOTE Not really. IIRC trueskill in an 8 player FFA will take the players who scored 1-8 into account based on kills/lap time/etc. Essentially, within alleg you would rank the players by their points and could then let trueskill calculate their new rank (maybe multiply the winning team's score x2 like in BF2). This way top players would get rewarded much better than in team based trueskill even if they lost. You would however have to come up with a system to accurately rate a player's performance in 1 game.[/quote]

You see my point wrt. having to invent a whole new system. Scoring for in-game actions is notoriously difficult to get right. Being able to create such a system might mitigate the need for trueskill in the first place.

QUOTE Can you explain how that works because the whole system like they explain it on the web looks awfully easy to fool (esp with autobalance turned off ) You do remember that it is better to gain a few points with stacking than to risk losing a few points by joining the underdog right? did you consider that the whole system might rely on the fact that players from both teams are supposed to be of similar rank ?[/quote]

Trueskill categorically *does not* rely on the teams being composed of similarly ranked players. I've provided you with a plethora of information regarding the functioning of trueskill. I've got better things to do than spoon-feed this to every person who demands it (you're not the first). The stacking issue isn't a problem with the ranking system, it's a problem with player behaviour. Stacking will happen regardless of the specific ranking system used. Trueskill goes a long way towards mitigating the effects of stacking, but cannot magically remove the problem without enforcing auto balance.

QUOTE As someone who has commed and played a lot of those small games, let me tell you that probably 70% or more of them were unbalanced. You know just as well as I do that having a good vet in the early stages of the game beats having a good vet in the later stages of the game (when you have already lost your opening cons and miners and are turtled in your home). Hell, most of those games start as a 3 vs 3 and don't grow until several minutes later.[/quote]

Your point is? Regardless, the average game size in alleg is 10 vs 10.

QUOTE sorry but nowhere does it say that trueskill takes this into account, do you have a link that describes how this works?[/quote]

It's in the technical report. 'a' in table 1, row 3.

QUOTE well Duh, in those 10 vs 10 off prime games i can probably predict 80% of the time which team is gonna win if i watch from noat, even if HELo says the teams are balanced. You should always assume that people will want to stack and will prefer unbalanced games over balanced ones, the whole point of a ranking system is to force people to play balanced games. Your assumption that peeps want to play balanced games is flawed. If you allow unbalanced games to count then you will only duplicate the same problems which HELo currently has[/quote]

I'm not making that assumption. I am, however, assuming that the community will have a fit if we enforce mandatory auto balance. I could be wrong, so I'll put up a poll in general.

QUOTE You might want to consider that applying new rules to old data has it's inherent problems. as soon as peeps know the new rules their behaviour might change and thus the new data might be drastically different from the old one[/quote]

Yes, I'm aware of this. It would be foolish to ignore gameplay patterns altogether and just base one's decisions on arbitrary assumptions :)

QUOTE and finally : according to MS it takes 2 Teams/4 Players per Team 10 consecutive game wins before they reach a new rank once their rank has settled with enough games (with ranks 1-50) now wouldn't that mean you would need a @#(!load of consecutive wins to go up 1 rank in a 10vs10 environment (imagine how long it would take for those players who only play primetime 25vs25) ? So if we are lucky and games are balanced and everyone loses 50% of his games (which is essentially what balance means no?) then you will rank up like eh .... once a year after the initial rush to a mid level rank?[/quote]

I think you're confusing a skill rating with a reward rating. The whole point of Trueskill is to rapidly converge on a player's true skill rating, then remain relatively stable. This is the exact reason we propose keeping track of points-based stats in AllegSkill MkII. Whilst the points-based system is pretty hopeless for accurately measuring skill and enforcing balance, it provides a system of reward for the player. I recognise the value of such reward in a competitive environment.
Gandalf2
Posts: 3943
Joined: Wed Oct 13, 2004 7:00 am
Location: W. Midlands, UK

Post by Gandalf2 »

I confess to not having read everything. But from playing with the calculator, it seems the amount of stackage required to make a single anti-stacker's uncertainty go up is large.

E.g., say 3 vets and a noob are playing 4 voobs (pretty poor ones). I join the voobs for 15 minutes, we lose. I would expect my rank to go down slightly, but my uncertainty to go up, as there is nothing to learn from such a game. Am I right here? What would be the "tipping point" for this? Will that question be answered in your long, long-awaited publication? (two weeks, isn't it? ;) )
spideycw - 'This is because Grav is a huge whining bitch. But we all knew that already' Dec 19 2010, 07:36 PM
sgt_baker
Posts: 1510
Joined: Wed Oct 20, 2004 7:00 am
Location: London, UK.

Post by sgt_baker »

Gandalf2 wrote:QUOTE (Gandalf2 @ Jan 15 2008, 12:19 AM) (two weeks, isn't it? ;) )
Yeah, sorry about this. A load of RL stuff... same old story. :)
MrChaos
Posts: 8352
Joined: Tue Mar 21, 2006 8:00 am

Post by MrChaos »

Grim_Reaper_4u wrote:QUOTE (Grim_Reaper_4u @ Jan 14 2008, 02:47 AM) relax chaos, i'm not here to get on your tits (unless they look a lot better than I imagine they do) ;) actually 50% of my job involves data-mining and doing statistical analyses of data so I'm not a complete newb at this ;) (the dozens of academic papers i did the analyses for have been published in A journals worldwide) If you tell us exactly what your system looks like and what it uses for its rank then I'll be more than happy to shoot holes in your theory (backed up with solid reasons why it is flawed of course ;) ).

Since currently i'm only guessing what your system looks like it's quite useless to comment on it besides the aforementioned generalizations (which i actually do stand by and can defend)

So take my comments as friendly advice and do with it what you like. :)
Shennanigans

Try reading the academic paper then, Grim, and not the "TrueSkill is cool" bit for the everyday peeps. Baker can help you when you get stuck on the hard bits ;) :lol:
Ssssh
Vlymoxyd
Posts: 985
Joined: Fri Jul 04, 2003 7:00 am
Location: Québec, Canada

Post by Vlymoxyd »

I had some stats courses both in college and university, but I'm still nowhere like an expert.

I know my opinion is probably not worth much, but here are some thoughts:

- I have no problem believing that a system could find decent ranks for players even if games were not balanced, but I have serious doubts that it would not wrongly reward people who stack on purpose.
There's just a huge difference between someone who always chooses to be on an unbalanced and better/worse team and a player who randomly ends up playing in unbalanced games.
People say that using wins/losses ends up picking up a player's ability to probe, kill enemies, bomb or even command, but imo, it would also pick up the ability to stack (which is a "skill" that will make a player win more games).

Imo, the results would be that stackers would have a higher rank than they should and a stacker with little skill could be forced to play with the lesser team.

With that said, I have the impression that an advantage of the system (if my assumption is right) would be that it would even out the right to stack among players.
One of the disadvantages would be that a stack would be encouraged to lose. Imagine a big stack against a bunch of complete newbies. If the stack loses, every stacker would be hit with a massive point loss, which would allow them all to stack more in the future. Some players have already used this tactic in the past. Obviously, it's a disadvantage that is true for any system.

I'm also concerned about the number of games required before ranks become meaningful. Imo, the system would work if players didn't get to choose their teammates and they all played 1 billion games, but I don't know when the ranks will start being correct.

As for points, I played on AZ and I can tell you how bad an idea they would be for rank.
Imo, they made some sense back in the day because high ranks meant more games, more useful actions (points) and more wins (more points for wins), the most important being the number of games played. I think that after 8 years, the number of games played doesn't really matter anymore and thus points would be useless.

Finally, I'd like to say that we don't need a perfect system, as long as we can trust AB to create decent games most of the time, I think the system will have done its job.
"Désolé pour les skieurs, moi je veux voir mes fleurs!"
-German teacher

http://www.steelfury.org/
Grim_Reaper_4u
Posts: 356
Joined: Wed Jul 30, 2003 7:00 am
Location: Netherlands

Post by Grim_Reaper_4u »

Vlymoxyd wrote:QUOTE (Vlymoxyd @ Jan 19 2008, 07:45 AM) Finally, I'd like to say that we don't need a perfect system, as long as we can trust AB to create decent games most of the time, I think the system will have done its job.
Yup, but that would mean we always use AB (or at least always use it for ranked games). If we don't, then getting a high rank by stacking is very easy and trueskill might as well be called TrueStackSkill ;) E.g. with AB off (and likely even with it on) I could statistically gain vet status as a newb without ever leaving base, just by joining a team Aarm comms, because he wins like 90% of the games he comms and my presence has 0% effect on the outcome until I reach a very high rank. Baker always says this isn't a problem with the ranking system but with player behaviour, but this is a bit silly. If you don't take into account the behaviour of players when building a ranking system then you might as well stop building the system. Even MS recognized the problems of stacking and Trueskill and solved this by: a) random teams b) only players of similar rank get assigned to the same team. They did this primarily to keep stackers in check, so maybe there is a lesson to be learned there ;) Hint Hint :D

On a side note : name 1 game by MS (or any other publisher) that uses trueskill and has :
1) commanders that can affect gameplay severely
2) potentially huge problems for 1 team when commanders pick the wrong tech/settings/map
3) vastly different techs between the teams

I don't play X-Box or game on Live so I'm not sure but I think there are none right? ;)
sgt_baker
Posts: 1510
Joined: Wed Oct 20, 2004 7:00 am
Location: London, UK.

Post by sgt_baker »

Vlymoxyd wrote:QUOTE (Vlymoxyd @ Jan 19 2008, 06:45 AM) - I have no problem believing that a system could find decent ranks for players even if games were not balanced, but I have serious doubts that it would not wrongly reward people who stack on purpose.
There's just a huge difference between someone who always chooses to be on an unbalanced and better/worse team and a player who randomly ends up playing in unbalanced games.
People say that using wins/losses ends up picking up a player's ability to probe, kill enemies, bomb or even command, but imo, it would also pick up the ability to stack (which is a "skill" that will make a player win more games).
You're quite right. If the system is presented with a sequence of stacked games when player sigmas are generally high (such as the situation where a system first goes live) it will indeed start to rate stacking as a positive game-winning attribute. We have spent a significant amount of time mitigating this problem. Currently there are two options for implementing the system without stack-munged ranks: a complete stats reset and mandatory autobalance until the top 300ish players' ranks have lower-than-newb sigmas, or using the pre-calculated ranks based on our system for detecting and removing stacked games. Guess which I'm in favour of? :)

QUOTE Imo, the results would be that stackers would have a higher rank than they should and a stacker with little skill could be forced to play with the lesser team.

With that said, I have the impression that an advantage of the system (if my assumption is right) would be that it would even out the right to stack among players.
One of the disadvantages would be that a stack would be encouraged to lose. Imagine a big stack against a bunch of complete newbies. If the stack loses, every stacker would be hit with a massive point loss, which would allow them all to stack more in the future. Some players have already used this tactic in the past. Obviously, it's a disadvantage that is true for any system.

I'm also concerned about the number of games required before ranks become meaningful. Imo, the system would work if players didn't get to choose their teammates and they all played 1 billion games, but I don't know when the ranks will start being correct.[/quote]

Numerous players have more than the ~100 games (on average) required for accurate ranks. Grim's supposition that the ASGS DB is full of junk data is simply not true, and I intend to use this data to speed our progression towards accurate ranks for everyone.

QUOTE As for points, I played on AZ and I can tell you how bad an idea they would be for rank.
Imo, they made some sense back in the day because high ranks meant more games, more useful actions (points) and more wins (more points for wins), the most important being the number of games played. I think that after 8 years, the number of games played doesn't really matter anymore and thus points would be useless.

Finally, I'd like to say that we don't need a perfect system, as long as we can trust AB to create decent games most of the time, I think the system will have done its job.[/quote]

There is no such thing as a perfect system IMO. The closest to perfect that we have atm are systems that try to rate a player, but also admit to their own inaccuracy. Trueskill, Glicko and Glicko 2 all do this, but Trueskill is the only system with explicit handling of teams, hence our choosing to implement it.
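As a small illustration of what "admitting inaccuracy" can look like in practice, a rating can be stored as a mean and an uncertainty, with a cautious single-number rank derived from both. The sketch below is an assumption for illustration only: the function name is hypothetical, and `k=3.0` follows the convention commonly described for TrueSkill's displayed rank rather than anything decided for Allegiance.

```python
def conservative_rating(mu, sigma, k=3.0):
    """Cautious single-number rank: the mean minus k standard deviations.

    mu    -- current estimate of the player's skill
    sigma -- how uncertain the system still is about that estimate
    k     -- assumed caution factor (3.0 is the commonly cited convention)
    """
    return mu - k * sigma
```

A brand-new player with a wide sigma gets a low displayed rank even if their mean is average, and the displayed rank climbs towards the mean as more games shrink the uncertainty.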

:)

B