AllegSkill: Difference between revisions

From FreeAllegiance Wiki
No edit summary
m (→‎How it works: both are modified by sigma)
 
(46 intermediate revisions by 7 users not shown)
{{Stub}}
{{AllegSkill}}


AllegSkill is a system for rating the skill of Allegiance players based on their overall performance in-game.  AllegSkill is based on the [http://research.microsoft.com/en-us/projects/trueskill/default.aspx Trueskill] system developed by [http://research.microsoft.com/default.aspx Microsoft Research] (who also developed Allegiance) with some notable additions.  The term 'AllegSkill' is intended to refer to the entire system, which includes additional statistics, and Microsoft Research should not be held responsible for differences when and where they occur.  This wiki will deal principally with the technical aspects of the system.  For a layman's explanation of the basics, please see the [http://research.microsoft.com/en-us/projects/trueskill/details.aspx Trueskill] site.


==Technical details==
<!---- [[Image:MuSigmaDifferenceGraph.png]] ---->
==How it works==
You have two numbers keeping track of your rank: [[Mu]], '''&mu;''', and [[Sigma]], '''&sigma;'''. &mu; is your rating (the average skill you have exhibited across all the games you have played), and &sigma; is the <i>uncertainty</i> about your rating. After you play a game your &mu; goes up if you win and down if you lose. Your &sigma; always goes down.


What follows is the simplest incarnation of the Trueskill update algorithm, as used for commander ratings.  We've provided as much information as is sensible, and we only assume that the reader is familiar with (or able to look up) the [http://en.wikipedia.org/wiki/Error_function error function] (<math>\text{erf}</math>).
The amount by which your &mu; and &sigma; change depends on &sigma; (the more certain the system is about your rank, the less each game will affect it) and on how 'surprising' the game outcome was (a newbie beating a veteran is quite surprising and will have a greater effect on the ranks than a veteran beating a newbie).


AllegSkill recognises that whether a team wins or loses is highly dependent on the skill of both the commanders and their teams, and the algorithms used reflect this. Consequently there are separate skill ratings for commanders and pilots.


This scenario pits a newbie commander (<math>\,\!\mu_A = 25;  \sigma_A = 8.333...</math>) against a slightly more experienced commander (<math>\,\!\mu_B = 32;  \sigma_B = 5</math>).


=== Getting ready ===
===Mu & Sigma===


Now let's put some letters and functions on the field:
As mentioned previously, mu and sigma represent the average skill of a player and the uncertainty around that skill respectively.  Uncertainty is a fundamental part of the AllegSkill system, and deserves greater explanation.  We have chosen to plot a graph of three players with different skill ratings:


[[Image:MuSigmaDifferenceGraph.png|right]]
#A <font color="#ff0000">newbie player</font>.
#*Average rating (25) and high uncertainty (8.333).
#A <font color="#0000FF">skilled player that has played only a little</font>
#*Higher rating (36) and moderate uncertainty (4)
#An <font color="#00AA00">average player that has played lots</font>
#*Average rating (25) and low uncertainty (2)


:<math>\beta = \frac{25}{6} </math> is the standard deviation of a player's performance around their skill (so <math>\beta^{2}</math> is the performance variance)
The horizontal axis represents their rating, and the vertical axis is the probability. A simple way of interpreting these graphs is to think that the player's "true rank" lies <i>somewhere</i> between the two end points of the curve and that the higher the curve is, the more likely the true rating is at that point.


So, to interpret our three players' curves:


:<math>\gamma = \frac{25}{300}</math> is the dynamics variable, which prevents sigma from ever reaching zero
#The red line touches the mu-axis at 0 and 50.  This means that the AllegSkill system believes that the player could have any skill between these two values. 
#*All new players start out with this skill rating.
#The blue line touches 22 and 50, so this player could be anywhere from just below average to the best expert in the game.
#The green curve touches the mu-axis at 18 and 32, but the player's rating is most likely to be close to 25.


===Stable uncertainty===
Note that players stabilise with a &sigma; of approximately 1. This would result in a skill distribution even 'tighter' than that represented by the <font color="#00AA00">green</font> player.


:<math>\varepsilon \simeq 0.08</math> is derived empirically from the percentage of games which result in a draw, currently ~1.01%.
By itself, &sigma; would eventually decrease to zero. However, '''&gamma;''', the dynamics variable, prevents sigma from ever reaching zero, which in turn determines how quickly mu can increase or decrease once sigma has stabilised. If we discover that sigma-stabilised ratings are moving too slowly to reflect genuine changes in skill, we will increase gamma. It is because of gamma that some players' sigmas are less than 1.


===Your rank===


We will use the [http://en.wikipedia.org/wiki/Normal_distribution normal distribution], and more precisely its [http://en.wikipedia.org/wiki/Probability_density_function probability density function] (with a mean of zero and variance of one):
The rank that is displayed in-game is known as your '''Conservative rank'''. It is roughly where your curve touches the axis on the left: the system is more than 99% sure that your "true rank" isn't any <i>less</i> than your conservative rank.


The formula is
<div align="center"><math>\text{Conservative rank} = \mu - 3 \times \sigma</math></div>


:<math>\text{PDF}(x):=\frac{1}{\sqrt{2\pi}} \cdot e^{-\frac{x^2}{2}}</math>,
This means that newbies, who start with (&mu;, &sigma;) = (25, 8.33), will have a conservative rank of zero. As they play more games the uncertainty about their rank goes down, and so their conservative rank more closely resembles their rating, which will hopefully be close to their actual skill by the time they've played that many games!


====Leaderboard/Ingame ranks====
The ranks shown on the leaderboard and in game are multiplied by 0.6, so we have ranks up to 30, instead of ranks up to 50.  This is a temporary measure to ease the process of migrating from Helo's 0-30 system to AllegSkill's 0-50 system.  Once the various non-ASGS components of Allegiance (game servers etc) have been updated to fully support AllegSkill (with additional features such as a new autobalance system), we will switch over to 0-50 ranks across the board.  It is also worth noting that currently the 'rank' column on the leaderboard is also multiplied by 0.6, so it should accurately reflect the rank displayed in-game.


We will also use its [http://en.wikipedia.org/wiki/Cumulative_distribution_function cumulative distribution function]:
The formula is
 
<div align="center"><math>\text{Leaderboard rank} = ( \mu - 3 \times \sigma ) \times 0.6</math></div>
 
:<math>\text{CDF}(y)=\frac{1}{2}+\frac{1}{2}\cdot\text{erf}\left( \frac{1}{\sqrt{2}} y\right)</math>.
 
 
=== The update formulas ===
 
The Trueskill update functions follow:
 
:<math>V_{win}(t,\varepsilon) := \frac{ \text{PDF}(t-\varepsilon)}{\text{CDF}(t-\varepsilon) }</math>; this formula is used to update a player's <math>\,\!\mu</math>.
 
 
:<math>W_{win}(t,\varepsilon) := V_{win}(t,\varepsilon) \cdot \left( V_{win}(t,\varepsilon )+t-\varepsilon  \right)</math>; this formula is used to update a player's <math>\,\!\sigma</math>.
 
 
 
:<math>V_{draw}(t,\varepsilon ):=\frac{\text{PDF}(-\varepsilon -t)-\text{PDF}(\varepsilon -t)}{\text{CDF}(\varepsilon -t)-\text{CDF}(-\varepsilon -t)}</math>
 
 
 
:<math>W_{draw}(t,\varepsilon ):=V_{draw}^{2}(t,\varepsilon )+\frac{(\varepsilon -t)\cdot \text{PDF}(\varepsilon -t)+(\varepsilon +t)\cdot \text{PDF}(\varepsilon +t)}{\text{CDF}(\varepsilon -t)-\text{CDF}(-\varepsilon -t)}</math>
 
 
The V and W functions are the core of the Trueskill system, and vary depending on whether the game resulted in a win or a draw.  In both instances, positive values for t represent an unsurprising outcome:  The winner was more skilled than the loser.  Positive values result in the functions returning small values, which in turn result in small <math>\mu</math> and <math>\sigma</math> updates. The converse is also true:  Negative values for t represent a surprising outcome, and result in large updates.
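
The behaviour described above can be checked numerically. The following is a minimal Python sketch of the win variants only, built directly from the PDF, CDF, V and W definitions given earlier; the function names are invented for this illustration and it is not the code AllegSkill itself runs. The draw margin is set to zero in the printed table purely to keep the numbers easy to compare.

```python
import math


def pdf(x):
    """Standard normal probability density function (mean 0, variance 1)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def cdf(y):
    """Standard normal cumulative distribution function, written via erf."""
    return 0.5 + 0.5 * math.erf(y / math.sqrt(2.0))


def v_win(t, eps):
    """V_win: factor used to update mu after a win."""
    return pdf(t - eps) / cdf(t - eps)


def w_win(t, eps):
    """W_win: factor used to update sigma after a win."""
    v = v_win(t, eps)
    return v * (v + t - eps)


if __name__ == "__main__":
    eps = 0.0  # assumption for this table only: ignore the draw margin
    for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"t={t:+.1f}  V_win={v_win(t, eps):.4f}  W_win={w_win(t, eps):.4f}")
```

Both columns shrink as t grows, which is the "unsurprising outcome, small update" behaviour described in the paragraph above.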
 
 
We also need to introduce a variable that expresses the general uncertainty of the system:
 
 
:<math>c=\sqrt{2\beta ^{2}+\sigma _{w}^{2}+\sigma _{l}^{2}}</math>
 
 
This value is used throughout the calculation.
 
Here is how the formulas are actually used:
 
{| class="wikitable" border="1" cellspacing="0"  cellpadding="5" align="center"
!
! align="center" | Mu
! align="center" | Sigma
|-
! Winner
| align="center" | <math>\mu '_{w}=\mu _{w}+\frac{\sigma _{w}^{2}}{c}\cdot V_{win}\left( \frac{\mu _{w}-\mu _{l}}{c},\frac{\varepsilon }{c} \right)</math>
| align="center" | <math>\sigma '_{w}=\sqrt{\sigma _{w}^{2}\left( 1-\frac{\sigma _{w}^{2}}{c^{2}}\cdot W_{win}\left( \frac{\mu _{w}-\mu _{l}}{c},\frac{\varepsilon }{c} \right) \right)+\gamma ^{2}}</math>
|-
! Loser
| align="center" | <math>\mu '_{l}=\mu _{l}-\frac{\sigma _{l}^{2}}{c}\cdot V_{win}\left( \frac{\mu _{w}-\mu _{l}}{c},\frac{\varepsilon }{c} \right)</math>
| align="center" | <math>\sigma '_{l}=\sqrt{\sigma _{l}^{2}\left( 1-\frac{\sigma _{l}^{2}}{c^{2}}\cdot W_{win}\left( \frac{\mu _{w}-\mu _{l}}{c},\frac{\varepsilon }{c} \right) \right)+\gamma ^{2}}</math>
|}
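
The table can be turned into a short function. This is a hedged Python sketch rather than AllegSkill's actual implementation: the name <code>update_after_win</code> and the constant names are invented for this example, and &epsilon; is taken as the rounded value 0.08 quoted earlier.

```python
import math

BETA = 25.0 / 6.0      # beta, as given in "Getting ready"
GAMMA = 25.0 / 300.0   # gamma, the dynamics variable
EPSILON = 0.08         # draw margin; rounded value, so results are approximate


def pdf(x):
    """Standard normal probability density function."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def cdf(y):
    """Standard normal cumulative distribution function."""
    return 0.5 + 0.5 * math.erf(y / math.sqrt(2.0))


def v_win(t, eps):
    return pdf(t - eps) / cdf(t - eps)


def w_win(t, eps):
    v = v_win(t, eps)
    return v * (v + t - eps)


def update_after_win(mu_w, sigma_w, mu_l, sigma_l):
    """Apply the winner/loser rows of the table to one decisive game.

    Returns ((new_mu_w, new_sigma_w), (new_mu_l, new_sigma_l)).
    """
    c = math.sqrt(2.0 * BETA ** 2 + sigma_w ** 2 + sigma_l ** 2)
    t = (mu_w - mu_l) / c
    eps = EPSILON / c
    v = v_win(t, eps)
    w = w_win(t, eps)

    new_mu_w = mu_w + sigma_w ** 2 / c * v
    new_mu_l = mu_l - sigma_l ** 2 / c * v
    new_sigma_w = math.sqrt(sigma_w ** 2 * (1.0 - sigma_w ** 2 / c ** 2 * w) + GAMMA ** 2)
    new_sigma_l = math.sqrt(sigma_l ** 2 * (1.0 - sigma_l ** 2 / c ** 2 * w) + GAMMA ** 2)
    return (new_mu_w, new_sigma_w), (new_mu_l, new_sigma_l)


if __name__ == "__main__":
    # Experienced commander B (32, 5) beats newbie commander A (25, 8.333...).
    winner, loser = update_after_win(32.0, 5.0, 25.0, 25.0 / 3.0)
    print("B after win :", winner)
    print("A after loss:", loser)
```

Swapping the argument order, <code>update_after_win(25.0, 25.0 / 3.0, 32.0, 5.0)</code>, models the upset in which the newbie wins. Because &epsilon; is rounded here, the output should only be expected to agree with a full-precision calculation to a couple of decimal places.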
 
=== AllegSkill in action ===
 
Let's now get our favourite computer algebra system and do the calculations. Let's assume the experienced commander, B, won.
 
 
:<math>c = \frac{5}{6}\sqrt{186}</math>
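
For readers following by hand, this value comes from substituting the scenario's numbers into the definition of c, taking <math>\sigma_A = 8.333... = \tfrac{25}{3}</math>:

:<math>c^{2}=2\left( \frac{25}{6} \right)^{2}+\left( \frac{25}{3} \right)^{2}+5^{2}=\frac{1250+2500+900}{36}=\frac{4650}{36},\qquad c=\frac{\sqrt{4650}}{6}=\frac{5}{6}\sqrt{186}\approx 11.365</math>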
 
 
:<math>\mu '_{B}=32+\frac{0.09195636321\sqrt{186}\sqrt{2}}{\sqrt{\pi }}</math>


:<math>\,\!\mu '_{B}=33.00064106</math>


:<math>\sigma '_{B}=\sqrt{\frac{3601}{144}-\frac{2.758690898\sqrt{2}\left( \frac{0.5701294519\sqrt{2}}{\sqrt{\pi }}+0.04463650105\sqrt{186} \right)}{\sqrt{\pi }}}</math>


:<math>\,\!\sigma '_{B}=4.760851650</math>


:<math>\mu '_{A}=25-\frac{0.2554343423\sqrt{186}\sqrt{2}}{\sqrt{\pi }}</math>


:<math>\,\!\mu '_{A}=22.22044149</math>


:<math>\sigma '_{A}=\sqrt{\frac{10001}{144}-\frac{21.28619519\sqrt{2}\left( \frac{0.5701294519\sqrt{2}}{\sqrt{\pi }}+0.04463650105\sqrt{186} \right)}{\sqrt{\pi }}}</math>


:<math>\,\!\sigma '_{A}=7.168423552</math>
 
 
Now let's run the same scenario in reverse, with commander A winning.
 
 
:<math>c = \frac{5}{6}\sqrt{186}</math>
 
 
:<math>\mu '_{A}=25+\frac{0.6919672626\sqrt{186}\sqrt{2}}{\sqrt{\pi }}</math>


:<math>\,\!\mu '_{A}=32.52977643</math>


:<math>\sigma '_{A}=\sqrt{\frac{10001}{144}-\frac{57.66393855\sqrt{2}\left( \frac{1.544470930\sqrt{2}}{\sqrt{\pi }}+0.04568607959\sqrt{186} \right)}{\sqrt{\pi }}}</math>


:<math>\,\!\sigma '_{A}=6.435916375</math>


:<math>\mu '_{B}=32-\frac{0.2491082145\sqrt{186}\sqrt{2}}{\sqrt{\pi }}</math>


:<math>\,\!\mu '_{B}=29.28928048</math>


:<math>\sigma '_{B}=\sqrt{\frac{3601}{144}-\frac{7.473246435\sqrt{2}\left( \frac{1.544470930\sqrt{2}}{\sqrt{\pi }}+0.04568607959\sqrt{186} \right)}{\sqrt{\pi }}}</math>


:<math>\,\!\sigma '_{B}=4.623224911</math>
 


==Related Articles==
*[[Why is HELO broken and can it be repaired?]]




{{AllegSkill}}

Latest revision as of 01:05, 19 June 2009

AllegSkill
About: AllegSkill · FAQ · Interim FAQ · Gaining ranks · Whore rating · more...
Technical Details: Commander's ranking · Player's ranking · Stack rating · AllegBalance


AllegSkill is a system for rating the skill of Allegiance players based on their overall performance in-game. AllegSkill is based on the Trueskill system developed by Microsoft Research (who also developed Allegiance) with some notable additions. The term 'AllegSkill' is intended to refer to the entire system, which includes additional statistics, and Microsoft Research should not be held responsible for differences when and where they occur. This wiki will deal principally with the technical aspects of the system. For a layman's explanation of the basics, please see the Trueskill site.

How it works

You have two numbers keeping track of your rank: Mu, μ, and Sigma, σ. μ is your rating (the average skill you have exhibited across all the games you have played), and σ is the uncertainty about your rating. After you play a game your μ goes up if you win and down if you lose. Your σ always goes down.

The amount by which your μ and σ change depends on σ (the more certain the system is about your rank, the less each game will affect it) and on how 'surprising' the game outcome was (a newbie beating a veteran is quite surprising and will have a greater effect on the ranks than a veteran beating a newbie).

AllegSkill recognises that whether a team wins or loses is highly dependent on the skill of both the commanders and their teams, and the algorithms used reflect this. Consequently there are separate skill ratings for commanders and pilots.


Mu & Sigma

As mentioned previously, mu and sigma represent the average skill of a player and the uncertainty around that skill respectively. Uncertainty is a fundamental part of the AllegSkill system, and deserves greater explanation. We have chosen to plot a graph of three players with different skill ratings:

MuSigmaDifferenceGraph.png
  1. A newbie player.
    • Average rating (25) and high uncertainty (8.333).
  2. A skilled player that has played only a little
    • Higher rating (36) and moderate uncertainty (4)
  3. An average player that has played lots
    • Average rating (25) and low uncertainty (2)

The horizontal axis represents their rating, and the vertical axis is the probability. A simple way of interpreting these graphs is to think that the player's "true rank" lies somewhere between the two end points of the curve and that the higher the curve is, the more likely the true rating is at that point.

So, to interpret our three players' curves:

  1. The red line touches the mu-axis at 0 and 50. This means that the AllegSkill system believes that the player could have any skill between these two values.
    • All new players start out with this skill rating.
  2. The blue line touches 22 and 50, so this player could be anywhere from just below average to the best expert in the game.
  3. The green curve touches the mu-axis at 18 and 32, but the player's rating is most likely to be close to 25.

Stable uncertainty

Note that players stabilise with a σ of approximately 1. This would result in a skill distribution even 'tighter' than that represented by the green player.

By itself, σ would eventually decrease to zero. However, γ, the dynamics variable, prevents sigma from ever reaching zero, which in turn determines how quickly mu can increase or decrease once sigma has stabilised. If we discover that sigma-stabilised ratings are moving too slowly to reflect genuine changes in skill, we will increase gamma. It is because of gamma that some players' sigmas are less than 1.
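
The effect of γ can be illustrated with a small simulation. The Python sketch below is an idealised illustration, not a model of real play: it assumes the two-commander σ update from "The update formulas", assumes every game is between two evenly matched players who share the same σ (so t = 0), and uses the rounded ε of 0.08. The function name <code>sigma_after_games</code> is invented for this example.

```python
import math

BETA = 25.0 / 6.0   # beta, as given in "Getting ready"
EPSILON = 0.08      # draw margin, rounded


def pdf(x):
    """Standard normal probability density function."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def cdf(y):
    """Standard normal cumulative distribution function."""
    return 0.5 + 0.5 * math.erf(y / math.sqrt(2.0))


def w_win(t, eps):
    """W_win from the update formulas."""
    v = pdf(t - eps) / cdf(t - eps)
    return v * (v + t - eps)


def sigma_after_games(games, gamma, sigma=25.0 / 3.0):
    """Repeat the sigma update for `games` evenly matched games (t = 0)."""
    for _ in range(games):
        # Assumption: the opponent always has the same sigma as the player.
        c = math.sqrt(2.0 * BETA ** 2 + 2.0 * sigma ** 2)
        w = w_win(0.0, EPSILON / c)
        sigma = math.sqrt(sigma ** 2 * (1.0 - sigma ** 2 / c ** 2 * w) + gamma ** 2)
    return sigma


if __name__ == "__main__":
    for games in (10, 100, 1000, 2000):
        with_gamma = sigma_after_games(games, 25.0 / 300.0)
        without_gamma = sigma_after_games(games, 0.0)
        print(f"{games:5d} games: sigma with gamma={with_gamma:.3f}, without gamma={without_gamma:.3f}")
```

Under these assumptions σ keeps shrinking towards zero when γ is zero, but levels off at a positive value when γ = 25/300.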

Your rank

The rank that is displayed in-game is known as your Conservative rank. It is roughly where your curve touches the axis on the left: the system is more than 99% sure that your "true rank" isn't any less than your conservative rank.

The formula is

<math>\text{Conservative rank} = \mu - 3 \times \sigma</math>

This means that newbies, who start with (μ, σ) = (25, 8.33), will have a conservative rank of zero. As they play more games the uncertainty about their rank goes down, and so their conservative rank more closely resembles their rating, which will hopefully be close to their actual skill by the time they've played that many games!

Leaderboard/Ingame ranks

The ranks shown on the leaderboard and in game are multiplied by 0.6, so we have ranks up to 30, instead of ranks up to 50. This is a temporary measure to ease the process of migrating from Helo's 0-30 system to AllegSkill's 0-50 system. Once the various non-ASGS components of Allegiance (game servers etc) have been updated to fully support AllegSkill (with additional features such as a new autobalance system), we will switch over to 0-50 ranks across the board. It is also worth noting that currently the 'rank' column on the leaderboard is also multiplied by 0.6, so it should accurately reflect the rank displayed in-game.

The formula is

<math>\text{Leaderboard rank} = ( \mu - 3 \times \sigma ) \times 0.6</math>
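
As a quick illustration, both rank formulas can be applied to the three example players from the Mu & Sigma section. This is a minimal Python sketch; the function names and the player labels are invented for this example.

```python
def conservative_rank(mu, sigma):
    """Conservative rank: mu minus three sigma."""
    return mu - 3.0 * sigma


def leaderboard_rank(mu, sigma):
    """Leaderboard / in-game rank: conservative rank scaled by 0.6."""
    return conservative_rank(mu, sigma) * 0.6


# (mu, sigma) pairs for the three example players plotted in the graph.
EXAMPLE_PLAYERS = {
    "newbie": (25.0, 25.0 / 3.0),
    "skilled, played only a little": (36.0, 4.0),
    "average, played lots": (25.0, 2.0),
}

if __name__ == "__main__":
    for label, (mu, sigma) in EXAMPLE_PLAYERS.items():
        print(f"{label}: conservative={conservative_rank(mu, sigma):.1f}, "
              f"leaderboard={leaderboard_rank(mu, sigma):.1f}")
```

Note how the average player with many games ends up with a far higher displayed rank than the newbie, even though both have the same μ of 25.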

Related Articles

