AllegSkill

From FreeAllegiance Wiki
Revision as of 23:33, 13 October 2008

Stub This article is incomplete

This is an article about a topic that should be covered in more detail by the FreeAllegiance Wiki, but is lacking in content.

You can help by improving it!


AllegSkill is a system for rating the skill of Allegiance players based on their overall performance in-game. AllegSkill is based on the TrueSkill system developed by Microsoft Research (who also developed Allegiance), with some notable additions. The term 'AllegSkill' is intended to refer to the entire system, which includes additional statistics, and Microsoft Research should not be held responsible for differences when and where they occur.

How it works

You have two numbers keeping track of your rank: Mu, μ, and Sigma, σ. μ is your rating, and σ is the uncertainty about your rating. After you play a game, your μ goes up if you win and down if you lose, and your σ goes down.

The amount they go up/down is modified by σ - the more certain the system is about your rank, the less each game will affect it - and by how 'surprising' the game outcome was - a newbie beating a veteran is quite surprising and will have a greater effect on the ranks than if the vet beat the newb.

AllegSkill realises that whether a team wins or loses is highly dependent on the skill of both the commanders and their team, and the algorithms used represent this. Consequently there are separate skill ratings for commanders and pilots.

Technical details

What follows is the simplest incarnation of the TrueSkill update algorithm, as used for commander ratings. We've provided as much information as is sensible, and we only assume that the reader is familiar with (or able to look up) the error function (<math>\text{erf}</math>).

The update formulae

After the game is played the commanders' ranks are updated as follows:

{| class="wikitable"
! !! Mu, μ !! Sigma, σ
|-
! Winner
| <math>\mu'_w=\mu_w+\frac{\sigma_w^{2}}{c}\cdot V_\text{win}\left( \frac{\mu_w-\mu_l}{c},\frac{\varepsilon }{c} \right)</math>
| <math>\sigma'_w=\sqrt{\sigma_w^{2}\left( 1-\frac{\sigma_w^{2}}{c^{2}}\cdot W_\text{win}\left( \frac{\mu_w-\mu_l}{c},\frac{\varepsilon }{c} \right) \right)+\gamma ^{2}}</math>
|-
! Loser
| <math>\mu'_l=\mu_l-\frac{\sigma_l^{2}}{c}\cdot V_\text{win}\left( \frac{\mu_w-\mu_l}{c},\frac{\varepsilon }{c} \right)</math>
| <math>\sigma'_l=\sqrt{\sigma_l^{2}\left( 1-\frac{\sigma_l^{2}}{c^{2}}\cdot W_\text{win}\left( \frac{\mu_w-\mu_l}{c},\frac{\varepsilon }{c} \right) \right)+\gamma ^{2}}</math>
|-
! Draws
| Placeholder
| Placeholder
|}

Where β, γ and ε are constants:

<math>\beta = \frac{25}{6} </math> is the standard variance around performance
<math>\gamma = \frac{25}{300}</math> is the dynamics variable, which prevents sigma from ever reaching zero
<math>\varepsilon \simeq 0.08</math> is derived empirically from the percentage of games which result in a draw, currently ~1.01%.


c is a variable that expresses the general uncertainty of the system:

<math>c=\sqrt{2\beta ^{2}+\sigma_w^{2}+\sigma_l^{2}}</math>


and <math>V_\text{win}</math> and <math>W_\text{win}</math> are TrueSkill functions based on:

  1. The normal distribution (with a mean of zero and variance of one), and more precisely its probability density function.
    <math>\text{PDF}(x):=\tfrac{1}{\sqrt{2\pi}} \text{e}^{-\frac{x^2}{2}}</math>
  2. The cumulative distribution function of the normal distribution:
    <math>\text{CDF}(y):=\tfrac{1}{2}\left(1 + \text{erf}\left( \tfrac{1}{\sqrt{2}} y\right)\right)</math>.

No idea what they are? Don't worry, they are just scary maths that stats dudes use to try and represent a large group of unknowns based on a small number of samples. In our case the unknowns represent every tiny detail of your in-game actions, and the samples are the game outcomes.
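If you would rather poke at these two functions than read them, here is a minimal Python sketch of the PDF and CDF exactly as defined above. The function names `pdf` and `cdf` are our own choice for illustration and are not taken from any AllegSkill code.

```python
import math


def pdf(x: float) -> float:
    """Density of the standard normal distribution (mean 0, variance 1)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def cdf(y: float) -> float:
    """Cumulative distribution of the standard normal, built on erf."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


if __name__ == "__main__":
    # The bell curve peaks at x = 0, and half of the probability
    # mass lies below the mean.
    print(pdf(0.0), cdf(0.0))
    # The CDF is symmetric: cdf(x) + cdf(-x) is 1 for any x.
    print(cdf(1.0) + cdf(-1.0))
```

Only the standard library's `math.erf` is needed, which matches the one prerequisite this article asks of the reader.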

TrueSkill functions

Developed by Microsoft, the TrueSkill update functions are:

<math>V_\text{win}(t,\varepsilon) := \frac{ \text{PDF}(t-\varepsilon)}{\text{CDF}(t-\varepsilon) }</math>


<math>W_\text{win}(t,\varepsilon) := V_\text{win}(t,\varepsilon) \cdot \left( V_\text{win}(t,\varepsilon )+t-\varepsilon \right)</math>


There are also two special versions of <math>V</math> and <math>W</math> when draws take place.

<math>V_\text{draw}(t,\varepsilon ):=\frac{\text{PDF}(-\varepsilon -t)-\text{PDF}(\varepsilon -t)}{\text{CDF}(\varepsilon -t)-\text{CDF}(-\varepsilon -t)}</math>


<math>W_\text{draw}(t,\varepsilon ):=V_\text{draw}^{2}(t,\varepsilon )+\frac{(\varepsilon -t)\cdot \text{PDF}(\varepsilon -t)+(\varepsilon +t)\cdot \text{PDF}(\varepsilon +t)}{\text{CDF}(\varepsilon -t)-\text{CDF}(-\varepsilon -t)}</math>
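For the curious, the draw versions can be sketched in Python as follows. This follows the form given in the TrueSkill paper, where <math>W_\text{draw}</math> is <math>V_\text{draw}^{2}</math> plus the correction fraction. The names `v_draw` and `w_draw` are hypothetical, and the sketch only evaluates the functions: it does not show how AllegSkill applies them to draws, since the draw update formulae are still placeholders.

```python
import math


def pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def cdf(y: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def v_draw(t: float, eps: float) -> float:
    """Mean-update factor for a draw (hypothetical helper name)."""
    denom = cdf(eps - t) - cdf(-eps - t)
    return (pdf(-eps - t) - pdf(eps - t)) / denom


def w_draw(t: float, eps: float) -> float:
    """Variance-update factor for a draw: V squared plus a correction term."""
    denom = cdf(eps - t) - cdf(-eps - t)
    v = v_draw(t, eps)
    correction = ((eps - t) * pdf(eps - t) + (eps + t) * pdf(eps + t)) / denom
    return v * v + correction


if __name__ == "__main__":
    eps = 0.007  # an illustrative scaled draw margin, assumed for this demo
    for t in (-0.6, 0.0, 0.6):
        print(t, v_draw(t, eps), w_draw(t, eps))
```

With evenly matched commanders (t = 0) the mean factor vanishes by symmetry, so a draw between equals moves neither rating.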


So, what are the t and ε used in these formulae? If you know your maths, you will realise they are just the functions' arguments, and can be anything that you choose to 'pass into' the function. In our case we pass in

<math>t = \frac{\mu_w-\mu_l}{c}</math>       and, in place of <math>\varepsilon</math>, the scaled value       <math>\frac{\varepsilon }{c}</math>


The <math>V</math> and <math>W</math> functions are the core of the TrueSkill system, and vary depending on whether the game resulted in a win or a draw. In both instances, positive values for <math>t</math> represent an unsurprising outcome: the winner was more skilled than the loser. Positive values result in the functions returning small values, which in turn result in small <math>\mu</math> and <math>\sigma</math> updates. The converse is also true: negative values for <math>t</math> represent a surprising outcome, and result in large updates.
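To see the 'surprise' effect in numbers, here is a small Python sketch of the two win functions evaluated at a few values of <math>t</math>. The names `v_win` and `w_win` are hypothetical, and the scaled draw margin of 0.007 is just an illustrative value assumed for the demo.

```python
import math


def pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def cdf(y: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def v_win(t: float, eps: float) -> float:
    """Mean-update factor for a win (hypothetical helper name)."""
    return pdf(t - eps) / cdf(t - eps)


def w_win(t: float, eps: float) -> float:
    """Variance-update factor for a win."""
    v = v_win(t, eps)
    return v * (v + t - eps)


if __name__ == "__main__":
    eps = 0.007  # illustrative scaled draw margin, assumed for this demo
    # Negative t: the lower-rated side won (surprising).
    # Positive t: the higher-rated side won (expected).
    for t in (-2.0, -0.6, 0.0, 0.6, 2.0):
        print(f"t={t:+.1f}  V={v_win(t, eps):.3f}  W={w_win(t, eps):.3f}")
```

Both factors shrink as <math>t</math> grows, which is the behaviour described above: the more expected the result, the smaller the update.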


AllegSkill example

This scenario pits a newbie commander (<math>\,\!\mu_A = 25; \sigma_A = 8.333...</math>, i.e. normal rank, high uncertainty) against a slightly more experienced commander (<math>\,\!\mu_B = 32; \sigma_B = 5</math>, i.e. high rank, medium uncertainty).

Number crunching

Let's now get our favourite computer algebra system and do the calculations. Let's assume the experienced commander, B, won.


<math>c = \tfrac{5}{6}\sqrt{186} \simeq 11.4</math>


<math>\mu '_{B}=32+\tfrac{0.09195636321\sqrt{186}\sqrt{2}}{\sqrt{\pi }} = \cdots \simeq 33.0</math>


<math>\sigma '_{B}=\sqrt{\tfrac{3601}{144}-\tfrac{2.758690898\sqrt{2}\left( \tfrac{0.5701294519\sqrt{2}}{\sqrt{\pi }}+0.04463650105\sqrt{186} \right)}{\sqrt{\pi }}} = \cdots \simeq 4.8</math>


<math>\mu '_{A}=25-\tfrac{0.2554343423\sqrt{186}\sqrt{2}}{\sqrt{\pi }} = \cdots \simeq 22.2</math>


<math>\sigma '_{A}=\sqrt{\tfrac{10001}{144}-\tfrac{21.28619519\sqrt{2}\left( \tfrac{0.5701294519\sqrt{2}}{\sqrt{\pi }}+0.04463650105\sqrt{186} \right)}{\sqrt{\pi }}} = \cdots \simeq 7.2</math>


Now let's run the same scenario in reverse, with commander A winning.


<math>c = \tfrac{5}{6}\sqrt{186} \simeq 11.4</math>


<math>\mu '_{A}=25+\tfrac{0.6919672626\sqrt{186}\sqrt{2}}{\sqrt{\pi }} = \cdots \simeq 32.5</math>


<math>\sigma '_{A}=\sqrt{\tfrac{10001}{144}-\tfrac{57.66393855\sqrt{2}\left( \tfrac{1.544470930\sqrt{2}}{\sqrt{\pi }}+0.04568607959\sqrt{186} \right)}{\sqrt{\pi }}} = \cdots \simeq 6.4</math>


<math>\mu '_{B}=32-\tfrac{0.2491082145\sqrt{186}\sqrt{2}}{\sqrt{\pi }} = \cdots \simeq 29.3</math>


<math>\sigma '_{B}=\sqrt{\tfrac{3601}{144}-\tfrac{7.473246435\sqrt{2}\left( \tfrac{1.544470930\sqrt{2}}{\sqrt{\pi }}+0.04568607959\sqrt{186} \right)}{\sqrt{\pi }}} = \cdots \simeq 4.6</math>
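If you don't have a computer algebra system to hand, the same calculation can be sketched in a few lines of Python. This is only an illustration of the commander update formulae and constants given in this article, not the code AllegSkill actually runs. The function name `update_commanders` and the `(mu, sigma)` tuple layout are our own assumptions.

```python
import math

BETA = 25.0 / 6.0      # performance spread, as given in this article
GAMMA = 25.0 / 300.0   # dynamics constant, keeps sigma above zero
EPSILON = 0.08         # draw margin, as given in this article


def pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def cdf(y):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def v_win(t, eps):
    return pdf(t - eps) / cdf(t - eps)


def w_win(t, eps):
    v = v_win(t, eps)
    return v * (v + t - eps)


def update_commanders(winner, loser):
    """Take (mu, sigma) for winner and loser, return the updated pair."""
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    c = math.sqrt(2.0 * BETA ** 2 + sigma_w ** 2 + sigma_l ** 2)
    t = (mu_w - mu_l) / c
    eps = EPSILON / c
    v = v_win(t, eps)
    w = w_win(t, eps)
    new_winner = (
        mu_w + sigma_w ** 2 / c * v,
        math.sqrt(sigma_w ** 2 * (1.0 - sigma_w ** 2 / c ** 2 * w) + GAMMA ** 2),
    )
    new_loser = (
        mu_l - sigma_l ** 2 / c * v,
        math.sqrt(sigma_l ** 2 * (1.0 - sigma_l ** 2 / c ** 2 * w) + GAMMA ** 2),
    )
    return new_winner, new_loser


if __name__ == "__main__":
    newbie = (25.0, 25.0 / 3.0)  # commander A
    vet = (32.0, 5.0)            # commander B
    print("B wins:", update_commanders(vet, newbie))
    print("A wins:", update_commanders(newbie, vet))
```

Rounded to one decimal place, the printed values should agree with the figures worked out above.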

Results

Let's compare the possible scenarios. Data is shown in the (mu, sigma) form.

{| class="wikitable"
! !! Commander A (Newbie) !! Commander B (Vet)
|-
! Before the game
| (25.0 , 8.33), Rank: (0)
| (32.0 , 5.0), Rank: (17)
|-
! Commander A wins
| (32.5 , 6.4), Rank: (13)
| (29.3 , 4.6), Rank: (16)
|-
! Commander B wins
| (22.2 , 7.2), Rank: (1)
| (33.0 , 4.8), Rank: (19)
|}


As you can see, when the uncertainty is high, ranks can change quickly. But if you look at the players' ratings (the first number in brackets) you will see the variability is much less pronounced.


When the outcome closely matches expectations (vet beats newb) we can observe how little changes:

  • Commander A loses a little rating, μ, but the system's confidence in his rating has increased (σ has decreased): after one more game, AllegSkill knows more about this player.
    • Given the way the conservative rank is calculated, this ultimately results in a higher rank. This effectively replaces ELO's and HELO's newbie modifiers (the modifiers that allowed newbies to gain ranks faster than they lost them).
  • Commander B receives a boost from his victory, but since his σ is lower, the change to his rating, μ, is not as big.
    • This is controlled by the <math>\sigma^2/c</math> factor in the update formulae.


When AllegSkill receives a surprising outcome there are much bigger variations:

  • Commander A, the (0) comm, gains a whopping 13 ranks (6 of which come from the drop in σ).
    • This is because AllegSkill received very significant information about commander A.
  • Commander B loses a bit of rating, μ, but the loss is somewhat limited by his lower σ.
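    • The σ reduction (the gain in certainty about the accuracy of commander B's rank) is also smaller than commander A's: both commanders share the same <math>W_\text{win}</math> value, but it is scaled by <math>\sigma^2/c^2</math>, which is smaller for commander B.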
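The rank figures can be roughly reproduced with a short Python sketch. This article does not state how the conservative rank is calculated, so the sketch ASSUMES the usual TrueSkill-style estimate of μ minus three σ, rounded to a whole number. That assumption is consistent with the table above, but it is not confirmed here, and `conservative_rank` is a hypothetical name.

```python
def conservative_rank(mu: float, sigma: float, k: float = 3.0) -> int:
    """ASSUMPTION: rank = mu - k * sigma, rounded; k = 3 is not stated in the article."""
    return round(mu - k * sigma)


if __name__ == "__main__":
    before = (25.0, 25.0 / 3.0)  # commander A before the game
    after = (32.5, 6.4)          # commander A after winning (rounded table values)

    gain = conservative_rank(*after) - conservative_rank(*before)
    # Share of the gain that comes purely from sigma shrinking.
    from_sigma = 3.0 * (before[1] - after[1])
    print("ranks gained:", gain)
    print("of which from the sigma drop:", from_sigma)
```

Under that assumption, roughly half of commander A's jump comes from the system becoming more certain about him and the rest from the rise in μ.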

