AllegSkill - Commander's ranking

From FreeAllegiance Wiki

What follows is the simplest incarnation of the TrueSkill update algorithm, as used for commander ratings. To keep the example simple, each team consists of a single player: the commander.

We've provided as much information as is sensible, and we only assume that the reader is familiar with (or able to look up) the error function (<math>\text{erf}</math>).

The update formulae

After the game is played, the commanders' ranks are updated as follows (in all cases, the prime mark, ′, denotes the updated value):

Winner
  Mu, μ: <math>\mu'_w=\mu_w+\frac{\sigma_w^{2}}{c}\cdot V_\text{win}\left( \frac{\mu_w-\mu_l}{c},\frac{\varepsilon }{c} \right)</math>
  Sigma, σ: <math>\sigma'_w=\sqrt{\sigma_w^{2}\left( 1-\frac{\sigma_w^{2}}{c^{2}}\cdot W_\text{win}\left( \frac{\mu_w-\mu_l}{c},\frac{\varepsilon }{c} \right) \right)+\gamma ^{2}}</math>
Loser
  Mu, μ: <math>\mu'_l=\mu_l-\frac{\sigma_l^{2}}{c}\cdot V_\text{win}\left( \frac{\mu_w-\mu_l}{c},\frac{\varepsilon }{c} \right)</math>
  Sigma, σ: <math>\sigma'_l=\sqrt{\sigma_l^{2}\left( 1-\frac{\sigma_l^{2}}{c^{2}}\cdot W_\text{win}\left( \frac{\mu_w-\mu_l}{c},\frac{\varepsilon }{c} \right) \right)+\gamma ^{2}}</math>
Draws
  Mu, μ: Substitute <math>\,\!V_\text{win}(t, e)</math> with <math>\,\!V_\text{draw}(t, e)</math>.
  Sigma, σ: Substitute <math>\,\!W_\text{win}(t, e)</math> with <math>\,\!W_\text{draw}(t, e)</math>.

Where β is constant, and γ and ε are variables:

<math>\beta = \frac{25}{6} </math> is the standard deviation of a player's performance around their skill (<math>\beta^2</math> is the performance variance).
<math>\gamma = \frac{25}{300}</math> is the dynamics variable. It prevents sigma from ever reaching zero, and so determines how quickly mu can increase or decrease once sigma has stabilised. If we discover that sigma-stabilised ratings are moving too slowly to reflect genuine changes in skill, we will increase gamma.
<math>\varepsilon \simeq 0.08</math> is derived empirically from the percentage of games which result in a draw, currently ~1.01%. If the draw-rate changes, we will update epsilon accordingly.

c is a variable that expresses the general uncertainty of the system:

<math>c=\sqrt{2\beta ^{2}+\sigma_w^{2}+\sigma_l^{2}}</math>
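As a quick check of the formula for c, here is a minimal Python sketch. The function name `system_uncertainty` is made up for illustration; the sigma values are the two starting sigmas used in the AllegSkill example further down this page.

```python
import math

BETA = 25 / 6  # performance spread, as given on this page

def system_uncertainty(sigma_w: float, sigma_l: float, beta: float = BETA) -> float:
    """c = sqrt(2*beta^2 + sigma_w^2 + sigma_l^2). Hypothetical helper name."""
    return math.sqrt(2 * beta**2 + sigma_w**2 + sigma_l**2)

# Sigmas of the two commanders in the example below (5 and 8.333...).
c = system_uncertainty(5.0, 25 / 3)
print(round(c, 1))  # 11.4
```

Note that c is symmetric in the two sigmas, so it is the same whichever commander wins.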

and <math>V_\text{win}</math> and <math>W_\text{win}</math> are TrueSkill functions based on

  1. The normal distribution (with a mean of zero and variance of one), and more precisely its probability density function.
    <math>\text{PDF}(x):=\tfrac{1}{\sqrt{2\pi}} \text{e}^{-\frac{x^2}{2}}</math>
  2. The cumulative distribution function of the normal distribution:
    <math>\text{CDF}(y):=\tfrac{1}{2}\left(1 + \text{erf}\left( \tfrac{1}{\sqrt{2}} y\right)\right)</math>.
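Both functions translate directly into code. The sketch below uses only Python's standard `math` module; the names `pdf` and `cdf` are simply lower-case versions of the symbols above.

```python
import math

def pdf(x: float) -> float:
    """Density of the normal distribution with mean 0 and variance 1."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cdf(y: float) -> float:
    """Cumulative distribution of the same normal, written with erf."""
    return 0.5 * (1 + math.erf(y / math.sqrt(2)))

print(round(pdf(0.0), 4))  # 0.3989, the peak of the bell curve
print(cdf(0.0))            # 0.5, half the probability lies below the mean
```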

No idea what they are? Don't worry, they are just scary maths that stats dudes use to try and represent a large group of unknowns (every tiny detail of your in-game actions) based on a small number of samples (the game outcomes).

TrueSkill functions

Developed by Microsoft, the TrueSkill update functions are:

<math>V_\text{win}(t, e) := \frac{ \text{PDF}(t-e)}{\text{CDF}(t-e) }</math>

<math>W_\text{win}(t, e) := V_\text{win}(t,e) \cdot \left( V_\text{win}(t,e)+t-e\right)</math>

There are also two special versions of <math>V</math> and <math>W</math> when draws take place.

<math>V_\text{draw}(t, e):=\frac{\text{PDF}(-e-t)-\text{PDF}(e-t)}{\text{CDF}(e-t)-\text{CDF}(-e-t)}</math>

<math>W_\text{draw}(t, e):=V_\text{draw}^{2}(t, e)+\frac{(e-t)\cdot \text{PDF}(e-t)+(e+t)\cdot \text{PDF}(e+t)}{\text{CDF}(e-t)-\text{CDF}(-e-t)}</math>
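The four functions can be sketched in Python as follows. This is a plain transcription of the formulae above, not the code AllegSkill actually runs; the function names are illustrative, and no care is taken over numerical underflow for extreme arguments.

```python
import math

def pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cdf(y: float) -> float:
    """Standard normal cumulative distribution, via erf."""
    return 0.5 * (1 + math.erf(y / math.sqrt(2)))

def v_win(t: float, e: float) -> float:
    return pdf(t - e) / cdf(t - e)

def w_win(t: float, e: float) -> float:
    v = v_win(t, e)
    return v * (v + t - e)

def v_draw(t: float, e: float) -> float:
    return (pdf(-e - t) - pdf(e - t)) / (cdf(e - t) - cdf(-e - t))

def w_draw(t: float, e: float) -> float:
    v = v_draw(t, e)
    numerator = (e - t) * pdf(e - t) + (e + t) * pdf(e + t)
    return v * v + numerator / (cdf(e - t) - cdf(-e - t))

# With t = 0 the two sides are rated equally, so a draw moves neither mean.
print(v_draw(0.0, 0.1))  # 0.0
```

For t > 0 the value of `v_draw` is negative, so the higher-rated side of a draw loses a little rating and the lower-rated side gains it.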

So, what are t and e used in these formulae? Well, if you know your maths, you realise they can be anything that you choose to 'pass into' the function. In our case we are letting

<math>t = \frac{\mu_w-\mu_l}{c}</math>       and       <math>e = \frac{\varepsilon }{c}</math>

The <math>V</math> and <math>W</math> functions are the core of the TrueSkill system, and vary depending on whether the game resulted in a win or a draw. In both instances, positive values for <math>t</math> represent an unsurprising outcome: the winner was more skilled than the loser. Positive values result in the functions returning small values, which in turn result in small <math>\mu</math> and <math>\sigma</math> updates. The converse is also true: negative values for <math>t</math> represent a surprising outcome, and result in large updates.
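To see this behaviour for the win case, the following sketch tabulates <math>V_\text{win}</math> and <math>W_\text{win}</math> over a few values of t. The value `E = 0.007` is only an illustrative stand-in for ε/c, and the helper names are not taken from any AllegSkill code.

```python
import math

def pdf(x: float) -> float:
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cdf(y: float) -> float:
    return 0.5 * (1 + math.erf(y / math.sqrt(2)))

def v_win(t: float, e: float) -> float:
    return pdf(t - e) / cdf(t - e)

def w_win(t: float, e: float) -> float:
    v = v_win(t, e)
    return v * (v + t - e)

E = 0.007  # illustrative draw margin, roughly epsilon / c (assumption)

# Negative t: the lower-rated side won (surprise). Positive t: expected result.
for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"t={t:+.1f}  V_win={v_win(t, E):.3f}  W_win={w_win(t, E):.3f}")
```

The printed V values shrink as t grows, which is why an expected win barely moves μ while an upset moves it a lot.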

AllegSkill example

This scenario pits a newbie commander (<math>\,\!\mu_A = 25; \sigma_A = 8.333...</math>, i.e. normal rank, high uncertainty) against a slightly more experienced commander (<math>\,\!\mu_B = 32; \sigma_B = 5</math>, i.e. high rank, medium uncertainty).

Number crunching

Let's now fire up our favourite computer algebra system and do the calculations. First, assume the experienced commander, B, won.

<math>c = \tfrac{5}{6}\sqrt{186} \simeq 11.4</math>

<math>\mu '_{B}=32+\tfrac{0.09195636321\sqrt{186}\sqrt{2}}{\sqrt{\pi }} = \cdots \simeq 33.0</math>

<math>\sigma '_{B}=\sqrt{\tfrac{3601}{144}-\tfrac{2.758690898\sqrt{2}\left( \tfrac{0.5701294519\sqrt{2}}{\sqrt{\pi }}+0.04463650105\sqrt{186} \right)}{\sqrt{\pi }}} = \cdots \simeq 4.8</math>

<math>\mu '_{A}=25-\tfrac{0.2554343423\sqrt{186}\sqrt{2}}{\sqrt{\pi }} = \cdots \simeq 22.2</math>

<math>\sigma '_{A}=\sqrt{\tfrac{10001}{144}-\tfrac{21.28619519\sqrt{2}\left( \tfrac{0.5701294519\sqrt{2}}{\sqrt{\pi }}+0.04463650105\sqrt{186} \right)}{\sqrt{\pi }}} = \cdots \simeq 7.2</math>

Now let's run the same scenario in reverse, with commander A winning.

<math>c = \tfrac{5}{6}\sqrt{186} \simeq 11.4</math>

<math>\mu '_{A}=25+\tfrac{0.6919672626\sqrt{186}\sqrt{2}}{\sqrt{\pi }} = \cdots \simeq 32.5</math>

<math>\sigma '_{A}=\sqrt{\tfrac{10001}{144}-\tfrac{57.66393855\sqrt{2}\left( \tfrac{1.544470930\sqrt{2}}{\sqrt{\pi }}+0.04568607959\sqrt{186} \right)}{\sqrt{\pi }}} = \cdots \simeq 6.4</math>

<math>\mu '_{B}=32-\tfrac{0.2491082145\sqrt{186}\sqrt{2}}{\sqrt{\pi }} = \cdots \simeq 29.3</math>

<math>\sigma '_{B}=\sqrt{\tfrac{3601}{144}-\tfrac{7.473246435\sqrt{2}\left( \tfrac{1.544470930\sqrt{2}}{\sqrt{\pi }}+0.04568607959\sqrt{186} \right)}{\sqrt{\pi }}} = \cdots \simeq 4.6</math>
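Both scenarios can be reproduced numerically. The sketch below applies the winner and loser formulae from the top of this page with β = 25/6, γ = 25/300 and ε = 0.08; the function `update_after_win` and the tuple layout are illustrative choices, not AllegSkill's actual implementation.

```python
import math

BETA = 25 / 6     # constant from this page
GAMMA = 25 / 300  # dynamics variable from this page
EPSILON = 0.08    # draw margin from this page

def pdf(x: float) -> float:
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cdf(y: float) -> float:
    return 0.5 * (1 + math.erf(y / math.sqrt(2)))

def v_win(t: float, e: float) -> float:
    return pdf(t - e) / cdf(t - e)

def w_win(t: float, e: float) -> float:
    v = v_win(t, e)
    return v * (v + t - e)

def update_after_win(winner, loser):
    """Take (mu, sigma) tuples for winner and loser; return the updated tuples."""
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    c = math.sqrt(2 * BETA**2 + sigma_w**2 + sigma_l**2)
    t, e = (mu_w - mu_l) / c, EPSILON / c
    v, w = v_win(t, e), w_win(t, e)
    new_winner = (mu_w + sigma_w**2 / c * v,
                  math.sqrt(sigma_w**2 * (1 - sigma_w**2 / c**2 * w) + GAMMA**2))
    new_loser = (mu_l - sigma_l**2 / c * v,
                 math.sqrt(sigma_l**2 * (1 - sigma_l**2 / c**2 * w) + GAMMA**2))
    return new_winner, new_loser

A = (25.0, 25 / 3)  # newbie commander
B = (32.0, 5.0)     # more experienced commander

b_won, a_lost = update_after_win(B, A)
a_won, b_lost = update_after_win(A, B)
print("B wins: A =", [round(x, 1) for x in a_lost], " B =", [round(x, 1) for x in b_won])
print("A wins: A =", [round(x, 1) for x in a_won], " B =", [round(x, 1) for x in b_lost])
```

Rounded to one decimal place, the results agree with the figures worked out above.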


Let's compare the possible scenarios. Data is shown in the (mu, sigma) form.

Scenario            Commander A                Commander B
Before the game     (25.0, 8.33)  Rank: (0)    (32.0, 5.0)  Rank: (17)
Commander A wins    (32.5, 6.4)   Rank: (13)   (29.3, 4.6)  Rank: (16)
Commander B wins    (22.2, 7.2)   Rank: (1)    (33.0, 4.8)  Rank: (19)
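The Rank values in this comparison appear consistent with the usual TrueSkill conservative estimate, μ - 3σ, rounded to the nearest whole number. That formula is an assumption here, since this page does not state how the rank is derived; the sketch below (with the hypothetical helper `conservative_rank`) only checks it against a few of the values listed.

```python
def conservative_rank(mu: float, sigma: float, k: float = 3.0) -> int:
    """ASSUMPTION: rank = round(mu - k * sigma) with k = 3; not stated on this page."""
    return round(mu - k * sigma)

print(conservative_rank(25.0, 25 / 3))  # 0   (commander A before the game)
print(conservative_rank(32.0, 5.0))     # 17  (commander B before the game)
print(conservative_rank(33.0, 4.8))     # 19  (commander B after winning)
```

Under this assumption a drop in σ raises the rank even when μ stays put, which is what the discussion below relies on.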

As you can see, when the uncertainty is high, ranks can change quickly. But if you look at the players' ratings (the first number in brackets) you will see the variability is much less pronounced.

When the outcome closely matches expectations (vet beats newb), the changes are small:

  • Commander A loses a little rating, μ, but the system's confidence in his rating has increased (σ has decreased): after a game, AllegSkill knows more about this player.
    • Given the way the conservative rank is calculated, this ultimately results in a higher rank. This effectively replaces ELO's and HELO's newbie modifiers (the modifiers that allowed newbies to gain ranks faster than they lost them).
  • Commander B receives a boost from his victory, but since his σ is lower, the change to his rating, μ, is not as big.
    • This is controlled by the <math>\sigma^2/c</math> factor in the updating formulas.

When AllegSkill receives a surprising outcome there are much bigger variations:

  • Commander A, the (0) comm, gains a whopping 13 ranks (6 of which come from the drop in sigma).
    • This is because AllegSkill received very significant information about commander A.
  • Commander B loses a bit of rating, μ, but that is somewhat limited by his lower σ.
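    • The σ reduction (gain in certainty about accuracy of commander B's rank) is also smaller than commander A's in this scenario: both updates use the same <math>W_\text{win}</math> value, but B's is scaled by a smaller <math>\sigma^2/c^2</math> factor.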
