ELO

Catch-all for all development not having a specific forum.
Psychosis
Posts: 4218
Joined: Wed Oct 27, 2004 7:00 am
Location: California

Post by Psychosis »

I boot MrNuub who stops donating, i get fragged...

hell no. good idea, leave the booting to the senate to review later.
MrChaos
Posts: 8352
Joined: Tue Mar 21, 2006 8:00 am

Post by MrChaos »

Psy

Other good part. Weasel licking asshat gets his own Senate style loving for being disruptive to overall game. Nice for those who drive us all nuts not by crappy game play but rather jack ass behavior. Hmmmm maybe we can get no prizes for booting the true asshats :unsure:

MrChaos <---- not MrNuub :P
Lykourgos
Posts: 1001
Joined: Tue Jan 11, 2005 8:00 am
Location: Portland

Post by Lykourgos »

First, booting has absolutely nothing to do with stats, and it should stay that way. There's a reason that mechanic exists, and it is a good reason. I think that the current enforcement is sufficient.

Second- is English your first language?
Last edited by Lykourgos on Wed Jul 05, 2006 4:54 pm, edited 1 time in total.
MrChaos
Posts: 8352
Joined: Tue Mar 21, 2006 8:00 am

Post by MrChaos »

Lyk


Fact:
Booting was NOT added for any reason but to address the same type of asshattery that is in the current RoC: disruptive play. Not anyone's personal opinion on what is worthwhile game play.


Opinion:
Childish booting is that which is done in a fit of anger, or out of arrogance because someone isn't doing what you personally want as the commander. RoC definitely do not include anything like this at all. Pook, Thal, and the Senate need to hear everyone's side on the matter. Let's agree to disagree and move on. Deal?


Explaining:
It may be that the discussion is a bit confusing from a technical aspect and/or I did mess up the explanation trying not to turn it into a pure math and statistics discussion. I still find it relatively easy to follow. Other than this bit:

"(my assumption is your just that L-A-Z-Y that opening a new forum, being benevolant co-dictator, working on the core, and FAZ is all you greedy, selfish ass will give us < I $#@!ing hope he got the joke hmmm I'll add another one just to be sure > < and a j/k > j/k ) would be nice."

It reads like a program or a math equation, parentheses matter, is meant as fun, and is absolutely not germane to the topic. Feel free to ignore it. The rest of the matter, discuss how you wish.


Promise:
Finally I completely agree that stats and booting are two separate issues. I am not volunteering to address the matter codewise nor is it appropriate for me to do this unless specifically asked to help. Rather eager Alice asked so I blame her, the bitch. Consider it removed from future posting on this particular topic and my Bad.


Addressing just the TrueSkill idea (WARNING: MATH AND STATS DISCUSSION FOLLOWS):
The idea is that it is a Bayesian application around a Normal (Gaussian) distribution of player skills. There is a base assumption made of each incoming daisy fresh newb which is adjusted based on their success compared to the entire player population AND the players they play against. The more games against similar competition, the less the uncertainty and the higher you go.
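
A minimal Python sketch of that Bayesian adjustment, for the simplest case of one player beating another with no draws. This illustrates the published TrueSkill two-player update and is not anything from the Allegiance code. The 0 to 100 scale with a starting mean of 50 and standard deviation of 50/3 is this thread's working assumption, `beta` (per-game performance noise) is an assumed value, and the small per-game uncertainty increase used by the full method is left out.

```python
import math

def _pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def _cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def trueskill_1v1(winner, loser, beta=50.0 / 6.0):
    """Update two (mu, sigma) skill beliefs after `winner` beats `loser`.

    beta is an assumed performance-noise value (half the starting sigma).
    """
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    # Total uncertainty about the difference in performance.
    c = math.sqrt(2.0 * beta ** 2 + sigma_w ** 2 + sigma_l ** 2)
    t = (mu_w - mu_l) / c
    v = _pdf(t) / _cdf(t)   # how surprising the result was
    w = v * (v + t)         # how much uncertainty to remove (between 0 and 1)
    new_winner = (mu_w + sigma_w ** 2 / c * v,
                  sigma_w * math.sqrt(1.0 - sigma_w ** 2 / c ** 2 * w))
    new_loser = (mu_l - sigma_l ** 2 / c * v,
                 sigma_l * math.sqrt(1.0 - sigma_l ** 2 / c ** 2 * w))
    return new_winner, new_loser

# Hypothetical players: a fresh rookie beats a well-established veteran.
rookie = (50.0, 50.0 / 3.0)
veteran = (60.0, 3.0)
rookie_after, veteran_after = trueskill_1v1(rookie, veteran)
```

Because each player's shift is scaled by their own variance, the uncertain rookie's mean moves a long way after the upset while the veteran's barely changes, and both standard deviations shrink.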

If Allegiance was a head to head game the player could go to the top of the leader board in a very small number of games since they would only be playing tougher and tougher matches arranged via TrueSkill. The quickest possible number of games would be n*log(n), where n is the number of players in a typical game played AND the player always wins AND matches are arranged using this method. Reality means that it takes longer since the people needed to arrange the match may not be available exactly when the uberNewb wants to play, so a larger number of games will be needed (he plays against a less competitive field, which causes the confidence in his ranking to increase slower than against a more competitive field). Their claim is that using an 8 person head to head game (racing style, in which a clear 1st-8th place outcome is guaranteed) it can sort out a Newbie's position in less than 10 games, even up to the top 1% of the game. IF "player always wins AND optimal matches are arranged using this method".


Since Allegiance is NOT an individual game but a team game with only a team winning or losing, it takes much longer to achieve a high degree of confidence in a player's level within the community. An incoming newb may need as many as 75 games to rank his placement within the community with a HIGH degree of confidence. If we continue to use the Newbie-until-the-number-is-lost arrangement it CAN be incorporated by assuming a skill based on his/her level and their errrr cherry getting popped. Once he loses his number, how we address this is open to debate: reboot his stats or use them to more accurately arrange his ranking. The community may want time served to start once the cherry popping occurs and that would mean rebooting his stats, but the method could use the results easily.

This method uses the overall success of the player rather than time in game. It also addresses the inherent weakness in using ELO for games such as Allegiance where you win as a team and most games are pick up. The "stats" made available by this method would be ranking within the community and confidence in that placement. Tracking wins and losses is necessary but no other stat is needed whatsoever. Things like kills, base kills, nanning or whatever are secondary only but could be used to address the pride factor of the best at the various individual skills needed to help winning. A whore who protects his bonus at the expense of the team would clearly be shown to be less of a player than someone whose kills/deaths was very poor but always did the things needed to win using "TrueSkill". Exactly what you want to know when playing a game based on team play.

Commander ranking could also be done using this method AND would include the overall skill level of the players working with him. This ranking would take longer to jell since the team's individual stats greatly affect a commander's ranking. Someone who always wins by the stack would increase in his ranking glacier-like while someone who can whip a pick up team of a large range of players into a win time after time would have a rank that would quickly reflect his status.



A final thought:
Hmmm this method could be used to sort out the King (Queen?) of the whores pretty fast via multiplayer death matches in like vehicles. Might be a good way to do some basic testing of the theory too. That would be by foreseeing the outcome of matches. Hmmm an interesting idea indeed.


MrChaos
Flower
Posts: 252
Joined: Mon Jun 26, 2006 7:00 am
Location: K-Pax

Post by Flower »

Hiya all :-)

(and praise be to the developers without whose love for variables, classes and mathematics we could never enjoy this game :-)

I am a newbie who noted a sharp increase of my Newbie-Skill tag within a short period (I joined games where the commander for some reason resigned after only a short time, many times in a row indeed).

After logging in again I noticed that my (0) tag suddenly had jumped to (2) and indeed one game later it even went up to (3).

After reading about the Elo settings I came to the conclusion that for some reason these short games must have counted towards my Elo rating. (As I understand it, the absolute rating change counts only as a positive score.)
I was told that short games should not increase my Elo rating at all. Yet I have no other explanation for my sudden pseudo-skill increase. Could it be that short games do count for newbies?

P.S.:
I would suggest changing the newbie Elo rating system as it does not seem to be a suitable indicator for the progress of a newbie. (Save for identifying true newbies with the (0) tag.)
But as far as I understand the newbie tags will be dropped in the next release.
Yet perhaps it would be wiser to assign newbies an Elo rating based upon their true playing time. Hmm like X points per hour of playing on a server. Thus the newbie tag would not be prone to such sudden and unpredictable jumps.

Well, just a new player's thoughts. Do not pay them heed if they strike you as old, simple or stupid :-)

*wave* have a nice day!
@RT: "We've never been whores, we are misunderstood RTists."
Psychosis
Posts: 4218
Joined: Wed Oct 27, 2004 7:00 am
Location: California

Post by Psychosis »

MrChaos wrote (Jul 4 2006, 09:19 PM):
Psy

Other good part. Weasel licking asshat gets his own Senate style loving for being disruptive to overall game. Nice for those who drive us all nuts not by crappy game play but rather jack ass behavior. Hmmmm maybe we can get no prizes for booting the true asshats :unsure:

MrChaos <---- not MrNuub :P

i was talking about a random nuub, i know you're MrChaos.

next, wtf? we already have a system in place, i believe i ate a ban for booting 2 nuubies, and so did a bunch of other people.
MrChaos
Posts: 8352
Joined: Tue Mar 21, 2006 8:00 am

Post by MrChaos »

Flower wrote:
Hiya all :-)
(and praise be to the developers without whose love for variables, classes and mathematics we could never enjoy this game :-)

Le Fluer
I agree completely but want to make sure you understand I am not one of the giants you mention, just someone offering to help in some small but hopefully vital way.

Flower wrote:
I am a newbie who noted a sharp increase of my Newbie-Skill tag within a short period (I joined games where the commander for some reason resigned after only a short time, many times in a row indeed).
After logging in again I noticed that my (0) tag suddenly had jumped to (2) and indeed one game later it even went up to (3).

Being a recently returning player I noted the exact same thing. This, my guess, has to do with using the absolute value of the ELO change (-32 to +32), so if you got the heck stacked out of you (many vets on one team, many eager but less skilled rookies on the other) and lost miserably you get max points.

There is, for lack of a better way to say it, a built-in fudge factor (the variable, k, used in many different disciplines to represent a grudging nod to the unknown) for ELO which is meant to address uncertainties with using this method. Helpful in some other games applying it, but its value gets picked by the development Titans. If they guess wrong, skew city for scores.

ELO is meant for HEAD-TO-HEAD play, not even multiple people competing for placement in the same game. When it's team play based on a complex number of reasons for winning, wellllll it's just out of its predictive league.

I think the ELO k for Alleg is game time played to total game time (Pook's original answer).

Flower wrote:
After reading about the Elo settings I came to the conclusion that for some reason these short games must have counted towards my Elo rating. (As I understand it, the absolute rating change counts only as a positive score.)
I was told that short games should not increase my Elo rating at all. Yet I have no other explanation for my sudden pseudo-skill increase. Could it be that short games do count for newbies?

It just might be the case they are getting counted since I saw my number move too after playing under minimum players for stats games (even 1vs1).

Abs[ELO] also contributes. Your rookie number disappears at 850(?) cumulative points. That could happen PDQ depending.
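
A rough Python sketch of how absolute ELO changes could add up toward the rookie threshold. It uses the textbook head-to-head formula with K = 32 (the -32 to +32 range mentioned above). The 850-point threshold is the uncertain figure from this post, and `games_until_tag_lost` is a hypothetical helper, not the real Allegiance logic.

```python
K = 32                  # maximum swing per game, from the -32..+32 range above
ROOKIE_THRESHOLD = 850  # the post's uncertain "850(?)" figure, an assumption

def expected_score(rating, opponent_rating):
    """Textbook Elo win probability for `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))

def elo_delta(rating, opponent_rating, won, k=K):
    """Signed rating change for a single game."""
    actual = 1.0 if won else 0.0
    return k * (actual - expected_score(rating, opponent_rating))

def games_until_tag_lost(deltas, threshold=ROOKIE_THRESHOLD):
    """Count games until the sum of |delta| reaches the threshold.

    Returns None if the threshold is never reached.
    """
    total = 0.0
    for game_number, delta in enumerate(deltas, start=1):
        total += abs(delta)  # wins and losses both add points
        if total >= threshold:
            return game_number
    return None

# Evenly matched games swing 16 points either way.
even_win = elo_delta(1500, 1500, won=True)
even_loss = elo_delta(1500, 1500, won=False)
games_needed = games_until_tag_lost([even_win, even_loss] * 30)
```

Since the absolute value is summed, a losing streak advances the counter as fast as a winning streak. Note that under the textbook formula the largest single swing comes when the heavy favorite loses, so whether a stacked-against loss really gives "max points" depends on how the Allegiance implementation differs from it.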
Flower wrote:
P.S.:
I would suggest changing the newbie Elo rating system as it does not seem to be a suitable indicator for the progress of a newbie. (Save for identifying true newbies with the (0) tag.)
But as far as I understand the newbie tags will be dropped in the next release.

If you are getting that there is to be no newbieTag from my postings, then while ahhh (damn that Lyk ;) ) check with the Biguns.

What I thought I made "clear" was that this rating system supports the use of NewbieTagging (an initial concern of mine). My comment was this: you could reset a rookie's win/loss stats upon losing their tag and the ranking system could easily adjust to it. A judgment call on the higher ups' part if NewbieTag stats count. There are reasons you might want to do either.

Flower wrote:
Yet perhaps it would be wiser to assign newbies an Elo rating based upon their true playing time. Hmm like X points per hour of playing on a server. Thus the newbie tag would not be prone to such sudden and unpredictable jumps.

Nooooooooooooo :D

Newbs are the reason for a needed change. Stacking is a problem both when done intentionally by Vets or unintentionally by Rooks. Hider nicks, returning UberVets (Viscur who single-handedly beat an entire team of 19 other players in a death match, without dying, in like two minutes) only add to the problem.

Autobalance is the flip side of the coin of an accurate rating method and the key to better game play.

TrueSkill addresses this "wait a bit for newbies to grow up" thought by making a win the criterion. You'll need a number of games to reach midpoint actually, since the standard deviation is used to adjust everything. Every first time player is assumed to be of average skill (say 50 out of 100 levels) but they can be the absolute worst or absolute best (how the heck do we truly know?). Using mu (50 is assumed) - 3*sigma (16.667 for a daisy fresh rook) allows rapid level correction of a rook without benefiting stackWhores (the sWers may actually drop score). If you choose the rebooting win/loss option upon NewbieTag stripping, their effect on the game play ranks of others is minimized.... again.
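
The mu - 3*sigma arithmetic can be sketched in a few lines of Python. The starting values (50 and 50/3, roughly 16.667) are the ones assumed in this post. The "settled" player's sigma of 5 and the function name `displayed_level` are made up purely for illustration.

```python
def displayed_level(mu, sigma):
    """Conservative level: the skill is very likely at least this high."""
    return mu - 3.0 * sigma

# A daisy fresh rook: average assumed skill, maximum uncertainty.
fresh_rook = displayed_level(50.0, 50.0 / 3.0)

# A hypothetical player with the same mean after many games (sigma has shrunk).
settled_player = displayed_level(50.0, 5.0)
```

Two players with the same estimated mean show different levels: the rook starts at essentially the bottom of the scale and climbs as games reduce the uncertainty, even before the mean itself moves.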
Flower wrote:
Well, just a new player's thoughts. Do not pay them heed if they strike you as old, simple or stupid :-)

Keep talking, just keep being polite to the big boys and you'll be fine. Me, no nasty flame jobs involving family and we'll be 'Cool and the Gang'. Thanks for the info.

Flower wrote:
*wave* have a nice day!

OK maybe not that nice. :lol:

MrChaos

Edit:
Added for Psy.

I know you too. It was a j-o-k-e on my part.

Unintentionally guilty of overextending the topic, stirring needless feelings since an autobalance and new ranking system idea aren't going to touch on it.

Please let it drop here and for now elsewhere, since here isn't the place to debate it and I've got a full, full plate at the moment.

Thanks Psy and Ly,
Last edited by MrChaos on Thu Jul 06, 2006 2:21 am, edited 1 time in total.
Ksero
Posts: 87
Joined: Sun Feb 15, 2004 8:00 am
Location: Sweden

Post by Ksero »

I backtracked from Baxter's thread to the old forum and found Pook's post containing the ELO calculating code (hoping it's still up to date). I'm posting this here, since it seemed too technical for the thread over in the general forum.


-- BEGIN PROCESSING GAME 
INSERT INTO @TeamELO(TeamNumber, TeamPlayerCount, ELORating, Winner, Playtime, MaxPlayertime) 
     SELECT gt.GameTeamNumber, Count(p.Member_ID), SUM(((p.ELORating - 1500) * p.Modifier) + 1500) / Count(*), gt.GameTeamWinner, SUM(p.Playtime), MAX(p.Playtime) 
       FROM GameTeamMember gtm, GameTeam gt, @Players p 
      WHERE gt.GameTeamIdentID = gtm.GameTeamID 
        AND p.Member_ID = gtm.GameTeamMemberMemberID 
        AND p.TeamNumber = gt.GameTeamNumber 
        AND gt.GameID = @GameID 
        AND p.Playtime > 0 
   GROUP BY gt.GameTeamNumber, gt.GameTeamWinner, p.TeamNumber
So the team ELO is SUM( ((p.ELORating - 1500) * p.Modifier) + 1500 ) / Count(*)
If I understood correctly, p.Modifier is a float between 0.0 and 1.0 that expresses, for each player, how long he or she played in the game as a fraction of the total game time. I assume Count(*) is the number of team players. Imagine a team where new people continually joined and only stayed for a short while. For all these players, their p.Modifier will be close to 0, so the team ELO can approximately be simplified to
SUM( 1500 ) / Count(*)
What I'm trying to say is that people who only join the game for a short period push the team ELO towards 1500, regardless of their ELO. I think something like this would be a more correct assessment of the teams' skills and efforts:
SUM(p.ELORating * p.Modifier)
or in pseudo-code
TeamELO = sum( [p.ELOrating * p.playTime for p in TeamPlayers] )

So if a very skilled player plays for you for a short time, he contributes approximately as much as a voob who plays for you the entire game. The number of players doesn't matter when aggregating the team's ELO.
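
The two aggregations discussed above can be compared side by side in a small Python sketch. `Player`, `team_elo_current` and `team_elo_proposed` are illustrative names, and the sample roster is invented. Only the two formulas come from the SQL and from this post.

```python
from dataclasses import dataclass

@dataclass
class Player:
    elo: float       # p.ELORating
    modifier: float  # p.Modifier: fraction of the game played, 0.0 to 1.0

def team_elo_current(players):
    """SUM(((ELORating - 1500) * Modifier) + 1500) / Count(*), as in the SQL."""
    total = sum((p.elo - 1500) * p.modifier + 1500 for p in players)
    return total / len(players)

def team_elo_proposed(players):
    """SUM(ELORating * Modifier), the alternative suggested in this post."""
    return sum(p.elo * p.modifier for p in players)

# Invented roster: one strong player for the whole game, plus three equally
# strong players who each stayed for only 5% of it.
roster = [Player(1900, 1.0)] + [Player(1900, 0.05) for _ in range(3)]

current = team_elo_current(roster)    # pulled toward 1500 by the short stays
proposed = team_elo_proposed(roster)  # short stays add only a small amount
```

With this roster the current formula lands well below 1900 even though every player is rated 1900, which is the pull toward 1500 described above. The proposed sum is not on the 1500-centred scale, so it only makes sense when comparing one team's total against another's.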

Hmmm... It would be interesting to copy the existing database and re-calculate all elo-ratings using this algorithm... I wonder if that would be more accurate. I've tried to think of a simple way to measure how "accurate" the ELO ratings are... I haven't come up with anything conclusive, but...
At the heart of the ELO system are the predictions of how large a chance a team has of winning a particular game, the "Expected Outcome". If the ELO ratings are accurate, then these estimates should be fairly accurate.
Now... Consider another situation: If you want to see if a particular dice is weighted, you can roll the dice a hundred times and see how often each face comes up. So one way to test it would be to take two teams and let them play against each other 100 times. Then compare the win-percentages to what ELO predicted. But that's not feasible. Instead of throwing the same dice many times, we're throwing a new dice each time, since every pickup game is unique.
But what if we can group similar dice... I mean games... together? For example, we could check all the games where the estimated outcome was between 75-25 and 65-35. Then calculate what percentage of those games were won by the underdogs. That would be one way to measure the accuracy of ELO. If it deviates significantly from 30%, then we should become suspicious.
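
The bucket check described above could look like this in Python. It assumes each finished game is available as a pair (the favorite's predicted win probability, whether the favorite won). That record layout and the sample data are hypothetical, not the real database schema.

```python
def underdog_win_rate(games, low=0.65, high=0.75):
    """Share of games won by the underdog among games whose predicted
    favorite win probability falls between `low` and `high` (inclusive).

    `games` is an iterable of (favorite_win_probability, favorite_won).
    Returns None if no game falls in the bucket.
    """
    bucket = [favorite_won for prob, favorite_won in games if low <= prob <= high]
    if not bucket:
        return None
    upsets = sum(1 for favorite_won in bucket if not favorite_won)
    return upsets / len(bucket)

# Invented sample: ten games predicted at 70-30, three of them upsets,
# plus one game outside the bucket that must be ignored.
sample_games = [(0.70, True)] * 7 + [(0.70, False)] * 3 + [(0.90, False)]
rate = underdog_win_rate(sample_games)
```

On real data the bucket would need enough games for the comparison to mean anything. With only a handful, a rate far from 0.30 could easily be chance.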
"Better than Light Booster 1"
Pook
Posts: 1758
Joined: Tue Aug 13, 2002 7:00 am
Location: Texas, USA

Post by Pook »

Ksero, you're correct, and it was on purpose :D

We were experiencing issues where individual players were over-influencing the Team ELO because it was a simple sum.

Now, the Team's ELO is basically 1500 and is then adjusted higher or lower for each player based on that player's rank and time in game.

Make sense?
FreeBeer
Posts: 10902
Joined: Tue Dec 27, 2005 8:00 am
Location: New Brunswick, Canada

Post by FreeBeer »

Ksero wrote (Jul 13 2006, 03:43 PM): Instead of throwing the same dice many times, we're throwing a new dice each time,

Die. Just die, okay? I am, of course, referring to the fact that the singular of "dice" is "die". :D

chown -R us base