TrueSkill 2: online matchmaking, meta-gaming & games that feel like work
Is everyone still having fun?
Videogame matchmaking is poorly understood, deceptively complex and utterly crucial to a good player experience. I have played online multiplayer games for over 20 years, and anyone who played through those years will know that in 2007 there was a paradigm shift in online gaming with TrueSkill (coinciding with the release of Halo 3). The fact that we can log into any popular online game and be matched within about a minute with other players who are extremely close to our skill level is close to a miracle - and one we never take much time to appreciate. Even more miraculous is that the system itself is powered by an algorithm which could be derived by a student in a graduate statistics class. TrueSkill has managed to remain in vogue and largely unchanged for almost twenty years, an incredible achievement.
TrueSkill has driven online gaming over a period where multiplayer went from a niche subsection of the already niche hobby of video gaming to the dominant moneymaker in a multi-billion dollar industry, but we are pushing its limits. To understand why some people don’t think it's enough, we must first understand what TrueSkill is.
[A lot of disclaimers to start this one.
1) Although I am a data scientist who works in gaming, I have never touched or seen a production matchmaking system and have no insider knowledge. My inferences about the inner workings of these games come from things I’ve seen published by gaming companies and from playing games myself. My inference about what TrueSkill 2 is and how it works is only informed by press releases and my best guess of how I would implement it, faced with the same constraints and goals.
2) Related to the last point, I get a bit mad about TrueSkill 2 by the end of this article. I am not mad at the people developing it, as I expect I would do the exact same thing faced with the same problems and attempting to optimize the same KPIs. My anger is directed at the interplay of online gaming culture and the financial incentives present.
3) I do work in gaming but none of the opinions here have been endorsed by my current or previous employers. My feelings come from being a gamer and a data scientist, but not as a game data scientist. I would probably be implementing TrueSkill 2 right now if someone had paid me to do it (and I didn’t love my current job).]
Brief Summary: I talk about how TrueSkill 1 works. I talk about why online matchmaking is important. I discuss why people feel that we’re ready for TrueSkill 2. I describe what TrueSkill 2 could look like in practice, and what its main aims are. I whine that games feel like work now. I whine that TrueSkill 2 may inadvertently make games even more like work. I whine that metagaming culture has turned the player base into a club you’re shunned from unless you play like an automaton. I give some unrealistic suggestions of how to make a different type of online game. I finally admit that I’m probably just too old to play online video games now.
TrueSkill
For my twenty-fifth birthday party, my housemates and I decided to host a Halo 3 party. Ten years after its release, we got four Xboxes and sixteen controllers together and set up a LAN to play.
The issue we had was the spread of experience: of the roughly eighteen people willing to come to this party, two had formerly had Hayabusa armor (requiring considerable time and skill investment in Halo 3), six had played Halo before, and five had never played an FPS.
As a bright-eyed PhD student, I used some of my government-funded research time to code up a program. This would use TrueSkill to take the results of the games at the Halo party, update the players’ TrueSkills and then match them into teams. The eventual games we had were so engaging and close-fought despite such a massive range of skills that it turned me from a casual TrueSkill fan into a fervent disciple.
TrueSkill managed to make teams
We were using TrueSkill for its precise intended purpose. The concept behind TrueSkill is this:
You (with skill $x_0$) are playing a deathmatch (a game mode where the team with more kills wins) with 0 or more teammates (skills $x_i$) against 1 or more teams of 1 or more players (with skills $x_j$). What is the probability that your team wins?
The first big revelation the developers of TrueSkill had was this: kills are additive. If you take a player who typically gets 5 kills in a game and another player who gets 7, when they’re in a team together they’ll probably get roughly 12 kills. (This is possible because kills in Halo are typically solitary affairs, and they are plentiful enough that you aren’t normally short of someone to kill unless you’re winning comfortably.) Therefore, if the TrueSkill is a rough estimate of how many kills you’ll get in the game, $x_i = n_{kills,i}$, then your team’s sum of skills should be roughly equal to your team’s kills, $\sum_{team} x_i = N_{kills,team}$. And so to have a deathmatch which is close, you want $N_{kills,team_1} = N_{kills,team_2}$, and therefore $\sum_{team_1} TrueSkill_i = \sum_{team_2} TrueSkill_j$. So it’s as simple as trying to make teams have an even sum of their players' TrueSkill to get a close match.
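As a rough sketch of that idea (this is not the program from the party, which isn't reproduced here), the Python below brute-forces every way of splitting a small group into two equal-size teams and keeps the split whose skill sums are closest. The player names and skill numbers are made up for illustration.

```python
from itertools import combinations

def balance_teams(skills):
    """Split players into two equal-size teams with the closest skill sums.

    `skills` maps player name -> skill estimate (hypothetical numbers).
    Brute force: fine for a living-room LAN, far too slow for a real queue.
    """
    players = sorted(skills)
    if len(players) % 2:
        raise ValueError("need an even number of players")
    total = sum(skills.values())
    best_team, best_gap = None, float("inf")
    for team in combinations(players, len(players) // 2):
        # Pin the first player to team one so each split is only tried once.
        if players[0] not in team:
            continue
        gap = abs(total - 2 * sum(skills[p] for p in team))
        if gap < best_gap:
            best_team, best_gap = team, gap
    team_one = list(best_team)
    team_two = [p for p in players if p not in best_team]
    return team_one, team_two, best_gap

party = {"ana": 12, "bo": 9, "cy": 7, "di": 5, "ed": 2, "flo": 1}
team_one, team_two, gap = balance_teams(party)
print(team_one, team_two, gap)
```

With these invented numbers the best player ends up alongside two of the weakest, and both teams sum to 18.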
With some fancy maths, the outcome of these games updates the TrueSkills of the individual players. Largely:
If a high rank team beats a lower rank team, not much is updated as this should be expected
The winner of an even matchup will have their TrueSkill updated somewhat
If a lower rank team beats a higher rank team, the skills change significantly because the TrueSkill is probably not right
Even mathsless, this makes sense.
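For anyone who does want a taste of the fancy maths, here is a minimal sketch of the two-team, no-draw update as it is usually presented in write-ups of the original TrueSkill paper. It is a simplification under stated assumptions: it ignores draws, games with more than two teams, and the small per-game increase in uncertainty the full system applies, and `BETA` is the commonly quoted default rather than anything tuned.

```python
import math

BETA = 25.0 / 6.0  # per-player performance noise (commonly quoted default, assumed here)

def _pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def _cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def update_two_teams(winners, losers, beta=BETA):
    """Update (mu, sigma) pairs after the `winners` team beats the `losers` team."""
    everyone = winners + losers
    # Total variance of the difference in team performances.
    c2 = sum(sigma * sigma for _, sigma in everyone) + len(everyone) * beta * beta
    c = math.sqrt(c2)
    # How surprising the result is: large and positive means "as expected".
    t = (sum(mu for mu, _ in winners) - sum(mu for mu, _ in losers)) / c
    v = _pdf(t) / _cdf(t)   # size of the mean shift
    w = v * (v + t)         # size of the uncertainty shrink

    def shift(team, sign):
        updated = []
        for mu, sigma in team:
            var = sigma * sigma
            new_mu = mu + sign * (var / c) * v
            new_var = var * (1.0 - (var / c2) * w)
            updated.append((new_mu, math.sqrt(new_var)))
        return updated

    return shift(winners, +1), shift(losers, -1)

favourite_wins, _ = update_two_teams([(30.0, 2.0)], [(20.0, 2.0)])
upset_wins, _ = update_two_teams([(20.0, 2.0)], [(30.0, 2.0)])
print(favourite_wins[0][0] - 30.0)  # small gain for the expected winner
print(upset_wins[0][0] - 20.0)      # much larger gain after an upset
```

The three rules in the list fall out of `v`: it is close to zero when the favourite wins and grows quickly as the result becomes more surprising.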
TrueSkill is able to say “I don’t know”
It’s possible at this stage that chess players are upset, as I’m writing a love-letter to a method that just took Elo ratings (a method to rank chess players designed in 1960) and added them together. However there was actually another revolution through TrueSkill, one that has even been picked up by some chess websites: it allowed for uncertainty around a player's skill.
The original paper for TrueSkill refers to it as a Bayesian method, which for our purposes here can be thought of as a statistical approach where we assign probabilities to values and outcomes, instead of treating them as fixed, and update those probabilities as we go. For example:
Your friend gives you a coin which you are 99% sure is a normal coin, you flip it 10 times and it comes up heads 9 times.
With Bayes’ theorem, you can take this situation and come out with an updated probability that the coin is normal and not weighted to heads. So why is this important to TrueSkill?
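To put a number on the coin example, here is a small worked sketch. One thing the example leaves open is how biased a "weighted" coin would be, so the code assumes it lands heads 90% of the time; that figure is purely illustrative and the answer moves if you change it.

```python
from math import comb

def posterior_fair(prior_fair, heads, flips, p_weighted=0.9):
    """Probability the coin is fair after seeing `heads` in `flips` tosses.

    `p_weighted` is an assumption: 90% heads for the weighted coin is
    picked purely for illustration.
    """
    def likelihood(p_heads):
        return comb(flips, heads) * p_heads**heads * (1 - p_heads)**(flips - heads)

    fair = prior_fair * likelihood(0.5)
    weighted = (1 - prior_fair) * likelihood(p_weighted)
    return fair / (fair + weighted)

print(round(posterior_fair(0.99, heads=9, flips=10), 2))
```

Under that assumption, ten flips are enough to drag you from 99% sure down to roughly 71% sure the coin is normal.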
One of the reasons online matchmaking is so complicated is that it has to apply to a living breathing player base with completely different levels of engagement with the game. This means that someone whose job is to play the game and someone who has never played before both have to be accounted for - and they each need to be treated equally in front of the algorithm. This doesn’t even take into account those who play specifically to try and exploit the matchmaking algorithm (...more on those people later).
Therefore an algorithm that is able to go user by user and say “This player is definitely a Bronze 2 ranked player.” or “This user is somewhere between Bronze 2 and Silver 1” is extremely useful, seeing as you can account for new, returning and power players in the same way.
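A minimal sketch of what "being able to say I don't know" can look like in code: each player carries a best guess and an uncertainty, and the width of that uncertainty decides whether the system commits to one tier or hedges across several. The tier floors, the skill scale and the two-sigma width are all invented for this example.

```python
from dataclasses import dataclass

# Hypothetical tier floors on an arbitrary skill scale.
TIERS = [(0, "Bronze 1"), (10, "Bronze 2"), (20, "Silver 1"), (30, "Silver 2")]

def tier_for(skill):
    """Name of the highest tier whose floor is at or below `skill`."""
    name = TIERS[0][1]
    for floor, tier in TIERS:
        if skill >= floor:
            name = tier
    return name

@dataclass
class Rating:
    mu: float     # best guess at the player's skill
    sigma: float  # how unsure we are about that guess

    def describe(self, width=2.0):
        low = tier_for(self.mu - width * self.sigma)
        high = tier_for(self.mu + width * self.sigma)
        if low == high:
            return f"definitely {low}"
        return f"somewhere between {low} and {high}"

veteran = Rating(mu=15.0, sigma=1.0)   # hundreds of games played
newcomer = Rating(mu=18.0, sigma=4.0)  # a handful of games played
print(veteran.describe())
print(newcomer.describe())
```

The veteran comes out as "definitely Bronze 2" and the newcomer as "somewhere between Bronze 2 and Silver 1", even though the newcomer's best guess is actually higher.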
TrueSkill is doing surprisingly well!
My mostly uninformed guess would be that a large proportion of relatively standard online multiplayer games are running some close derivative of TrueSkill to do their ranking and matchmaking (based on public releases from these game companies). It’s worth taking a minute to acknowledge how incredible that is (if true): this is an algorithm which uses only wins and losses to rank players and uses a shorthand to combine teams that explicitly assumes you’re in a deathmatch. It’s being used with very good results for a myriad of complicated non-deathmatch games, and has not yet changed despite how life-or-death matchmaking is to a video game.
Matchmaking is fiscally important and improvements can make millions
One of the most significant paradigm shifts of online gaming over the past 20 or so years was the move away from games as a boxed product: you buy the game (sometimes in physical format!), you install it and you play it for as long as you are interested in it, or for as long as someone is willing to host a server for it. The replacement is games as a service: you start playing a game and the game tries to extract value from you while you’re playing. This started with MMOs, which managed to charge you for the box and also monthly for your account to be active (e.g. EverQuest, Ultima Online, World of Warcraft), and has moved into many online games being completely free to install and play. Value is extracted via services sold to users as they are playing, for example Fortnite (cosmetics via direct sale or a pre-paid ‘battle pass’), Hearthstone/Magic: The Gathering Arena (purchasable card-pack loot boxes), and Eve Online (certain mechanics and ships are only available if you pay a monthly fee).
From the business side, the incentive is then no longer to sell as many boxes as possible but to get as many player-hours in the game as possible. This turns matchmaking from something which is nice-to-have (but hard to fuck up too badly) into something which can actually be quantified to have an effect on a company’s bottom line, given that every user who decides to stop playing at any time costs a company money.
Matchmaking can have a huge impact on churn (in this case, the proportion of users who leave a game). Here are a few examples:
If a new player is regularly matched against (and with) higher skilled players, they will churn due to continually having their ass kicked and never really enjoying the game, or even being able to try out the mechanics. (New users are typically the most likely to churn.)
Experienced users can also churn long-term due to bad matchmaking. The most satisfying games for the vast majority of players are between players of similar levels. Playing against players who are much worse than you is unsatisfying and lacks the feeling of progress (a massive driver to get a player to continue playing a game), and playing against players much better than you typically leads to a crap player experience.
A few examples from personal experience: I churned on the first game of Valorant I had (because I was playing like shit and my teammates did not hesitate to tell me); I did enjoy getting my ass kicked in 100% of the games I played in Fortnite BR when it first came out but I only really became a dedicated player of it when skill-based matchmaking (and bot lobbies) were introduced; I do not play Overwatch ranked anymore, because every time I get a goddamn teen who, overcome by boredom, decides to queue Tank and spend the whole game emoting in spawn, it takes a day off my life expectancy and 20 minutes off of my run on this mortal coil.
TrueSkill is starting to look like a weak link
As previously discussed, online games are no longer a niche and are instead big business, and with that comes relentless optimization. If you were to sit a serious online gamer down and ask them what frustrates them about their favorite game, often you will hear these complaints:
Smurfs: players who have artificially lowered their TrueSkill for the purpose of playing lower-ranked players and living the power fantasy of actually being good at something.
Boosters: players who have artificially increased their TrueSkill for the purpose of seeming good at a game they are less good at, for the same power fantasy of pretending to be actually good at something.
Trolls: players who will log into a ranked team game and purposefully play badly to annoy their teammates.
Uneven matchups: games where no-one has done anything untoward and everyone is playing their best, but one team is just clearly better than the other.
All of these problems are a consequence of TrueSkill’s simplicity. Trolls and Smurfs cannot be found by looking just at their Win/Loss record: a Troll who is smart will only throw one in every ten of their games, and an extra loss in every ten is not statistically significant or easily attributable to bad intentions. Smurfs and Boosters will eventually go back to their real TrueSkill after enough games (but will annoy many players in the process).
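To make the "one in ten" claim concrete, here is a back-of-the-envelope sketch using a plain normal approximation to the binomial. The assumptions are mine: a well-matched honest player wins 50% of games, and a troll plays normally except for throwing one game in ten, which drags their win rate to 45%.

```python
import math

def z_score(observed_rate, expected_rate, games):
    """How many standard errors the observed win rate sits below expectation."""
    standard_error = math.sqrt(expected_rate * (1 - expected_rate) / games)
    return (expected_rate - observed_rate) / standard_error

def games_to_detect(observed_rate, expected_rate, z_needed=1.96):
    """Games needed before the gap reaches `z_needed` standard errors."""
    gap = expected_rate - observed_rate
    variance = expected_rate * (1 - expected_rate)
    return math.ceil(variance * (z_needed / gap) ** 2)

honest = 0.50          # assumed win rate with fair matchmaking
troll = honest * 0.9   # throws one game in ten, plays normally otherwise

print(z_score(troll, honest, games=100))
print(games_to_detect(troll, honest))
```

After 100 games the troll's record is only about one standard error below an honest player's, and under these assumptions it takes several hundred games before the gap reaches the conventional 1.96 threshold. A win/loss record alone simply doesn't carry enough signal.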
Finally, uneven matchups are becoming a much bigger problem due to an industry trend almost designed to break the fundamental assumption of TrueSkill: the rise of complex small squad games (e.g. Overwatch, Valorant, Rainbow Six Siege, MOBAs).
Returning to TrueSkill’s genesis: Team Deathmatches in Halo. Each user spawned as an identikit Master Chief, with exactly the same health, exactly the same guns, and exactly the same chance of picking up better guns. Each of them could single-handedly take down an opponent, and in the team deathmatch the sum of your kills is your team’s score. Therefore the assumption above is perfect:
TrueSkill is a rough estimate of how many kills you’ll get in the game, $x_i = n_{kills,i}$, then your team’s sum of skills should be roughly equal to your team’s kills, $\sum_{team} x_i = N_{kills,team}$
Now, let’s compare and contrast this to the small-team ultra complex game I play: Overwatch. Each player on each 5 person team has a choice between 39 different heroes, each with different weapons, health, abilities, and user skill required to play. Each of those heroes is stronger or weaker against the others, and on top of that heroes can be synergistic with one another (hero A works well with hero B, and not with hero C). Some heroes can kill each other in one-on-one fighting but this rarely happens: most players are fighting in a team and their interplay is incredibly nonlinear. If your tank is bad you’re fucked, if your healers aren’t healing you’re fucked, and if your healers are the best players on your team then you will likely not see the benefit. The players on your team could be significantly better than your opponents but their “main” (read: the 1 character of 39 which they’re best at) might be crap against the opponent’s main, or they may not synergize with their teammate’s main.
If this wasn't bad enough, the game isn’t even a deathmatch! Most Overwatch game modes are about area control in one way or another, and although killing people does help with that, it is by no means necessary. I have played games of Overwatch where my team has outkilled the opponent 1.5-2:1, and completely lost.
Taking the simplifying assumption of TrueSkill, that the skill of each team member is linearly additive, is akin to saying that a fair assumption about Overwatch is that each player logging in can individually hold a piece of ground around an objective without the help of their teammates.
Overwatch is the only game I can speak semi-confidently about, but in the few hours I’ve spent attempting to get into MOBAs like Defense of the Ancients (124 characters) or League of Legends (167 characters), the number of mechanics and how they intermingle was so complicated that I, with 20 years of gaming experience and a PhD, felt like it was above my station. All this to say, the modern games which act as money printers are too complicated for the critical assumption of TrueSkill to apply.
TrueSkill 2?
There have been many contenders to the throne of TrueSkill 2. The first I remember seeing was this one in 2018, and none of them that I know of seem to have stuck so far. However there is now a new and, in my semi-informed opinion, credible replacement.
Due to the increase in cheap computing power, data availability and the profit incentive to improve player retention, a new avenue is opening up: allow a machine learning algorithm to look at how a player plays within a game, and quantify their skill that way. Given enough in-game data points (where you go, who you click on, how you move), you can get an idea of how a player plays: this user knows to stand next to this corner, that is a gold player; this user knows where the health packs are, they are at least silver (see for example Guess The Elo segments on chess (https://www.youtube.com/playlist?list=PLBRObSmbZluRiGDWMKtOTJiLy3q0zIfd7), or Guess the rank games (https://guesstherank.org/)). It’s possible now, and the profit incentive is there. It also would fix a bunch of the above problems:
We don’t then have to try to unpick an individual’s skill from their Win/Loss record on a team, so we don’t have to worry about adding users’ TrueSkills together to get an inference. This avoids the problem of worrying about the complexity of a game, and reduces the proportion of uneven matchups.
A user will find it harder to artificially change their TrueSkill: if they play like a gold in a bronze lobby, the algorithm will be able to pick that out. If they play like a bronze in a gold lobby, the same. Goodbye smurfs and boosters.
If implemented well, I don't doubt this could ease these problems. And there’s early evidence to show that it could be implemented well. Look at, for example, the data provided by Dota after you play a game telling you what you did wrong, and see the project I did using machine learning on Rocket League data which could identify good or bad plays in a game using just the in-game positions and velocities of players.
We’re here! A set of problems costing gaming companies money may soon be able to be solved and online matchmaking will survive its toughest test to date. However, it feels like we’re losing something important in the process.
Didn’t games used to be fun?
One thing that has frustrated me for a while now is how much games have started to feel like work. If you want to play a game (that has been out for more than a week) and not be harassed by other players, you have to:
Keep on top of the patch notes to see any changes to game balance.
Consume content on those most recent patches to find out the optimal ways to play.
Play exactly the optimal way to play, choose the “best” characters, use their “best” abilities, and be in the “correct” part of the map.
Do that same thing over and over again, knowing that any deviation from this optimal play will be met with scorn from your teammates and quite possibly them leaving/kicking you.
Who wants this? I have a job already (a job which I love, for the record). Why would I willingly log on every day and go through this insane factory-line level gameplay and whittle away my free time on it?
One of the best accounts of this happening (which inspired me greatly to write this post) is Folding Ideas’ Why It’s Rude To Suck at Warcraft, along with the book Leet Noobs; both recount the change in gameplay of WoW from one of exploration and unstructured fun (free play) into one based on the optimization of certain goals (instrumental play). One of the most powerful quotes (from Being In The World of Warcraft (https://www.jstor.org/stable/20638698)) referenced in the video is “I will describe a community which has taken a digital world and turned it back into a database”.
The same is true of these ultra complex online games. Overwatch is an incredible achievement of game design and balance. As referenced above, each of the 39 heroes has an individual skillset where broadly each feels like you’re playing an entirely different game. The fact that by-and-large you can log on and have a great game with most combinations of 5v5 heroes is remarkable! What happened in the interim though is that through the growth of these industries came the profit incentive to teach people to get better, and the easiest way to do so is to identify the best ways to win, i.e. the places where the game is least balanced (also known as the “meta”), and then play the meta. This is so stark that if you watch professional Overwatch teams play one another, they almost always pick the same combination of 5 characters, of 39 available! As discussed in the Warcraft video above, regrettably this optimization trickles down to the lower levels, where players either know the meta from their favourite content creators and pick it, or harass their teammates into also following the meta (which is a far-too-common mode of teaching in online games).
Worst of all, my feeling is that at the levels that most people play at, what works at professional levels is essentially irrelevant. I, and therefore the users matched with me in games, suck at Overwatch. Even on the characters I’ve played a lot I miss a bunch of my shots and often am in the wrong place. One of my teammates telling me to switch off of Mei onto Pharah because Pharah has had a 10% buff to her primary shot is irrelevant, because where I miss 50% of the shots I take with Mei, I will likely miss 80% of the shots I take with Pharah since I am less used to her. Therefore not only is this enforcement of the meta by random players annoying, given that we are playing a video game and it does not matter in any real sense if we win or lose, but at our level this meta is in fact a social construct and will not increase our chance of winning.
This is work! And it’s shitty work! Every time I think about playing competitive in Overwatch it takes me back to being 17 years old working at McDonald’s and having my manager tell me (accurately) that the way I was making Big Macs was suboptimal and instructing me how to do it the way everyone at every McDonald’s store worldwide was doing it. At least then it made sense as it was making money for me (£4.35 per hour) and Ronald (probably much more than that). These are video games! One way or another, we are paying to play them!
TrueSkill 2 may enshrine it in code that games should be like work
The way that most machine learning methods work is to take a load of example pairs of data, an input and an output, and try to find a mapping between the input and the output. In the case of TrueSkill 2 the idea would be to take a player’s actions during a game as the input, and that player’s actual rank as the output. That way, the algorithm can see how a user is playing and identify parts of their play which will give a tell as to how good they really are.
However, a common problem with these algorithms is that they can be biased. If you train a machine learning model on a dataset of cats and dogs where 50% of the dogs are wearing a party hat and 0% of the cats are, and then show that model a picture of a cat with a party hat, it may very well tell you it is a dog. Similarly, if across the entirety of the dataset it’s shown that the new patch makes Pharah 10% more likely to win, simply the act of picking Pharah to play with could increase your TrueSkill 2.
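Here is a toy version of that failure mode. It uses a nearest-centroid classifier, which is a deliberately simple stand-in and not a claim about what any studio runs, trained on entirely made-up data in which gold players happen to favour the buffed hero. The feature names and numbers are hypothetical.

```python
# Each row: (aim accuracy, picked the buffed hero: 1 or 0), rank label.
# Entirely made-up data in which gold players happen to favour the buffed hero.
TRAINING = [
    ((0.60, 1), "gold"), ((0.55, 1), "gold"), ((0.65, 1), "gold"), ((0.58, 0), "gold"),
    ((0.35, 0), "bronze"), ((0.30, 0), "bronze"), ((0.40, 0), "bronze"), ((0.33, 0), "bronze"),
]

def fit_centroids(rows):
    """Average the feature vectors for each rank label."""
    sums, counts = {}, {}
    for features, label in rows:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to `features`."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

centroids = fit_centroids(TRAINING)
same_aim = 0.45
print(predict(centroids, (same_aim, 0)))  # off-meta pick
print(predict(centroids, (same_aim, 1)))  # buffed hero, identical aim
```

Two players with identical aim get different ranks purely because of the hero they picked: the first is called bronze, the second gold. The model has learned the party hat.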
Worse than this, it could be true that every frame you play and every input of your controller is being surveilled by this black box algorithm, and it will use anything it can to work you out. Hmm, most Gold players have watched the Flats TikTok where it shows that Torbjorn should be putting his turret on the left side of the derailed train on Route 66 instead of on top of the bridge, you put it on the bridge? Welcome to Silver buddy. Do you like jumping around and doing a little spin as Lucio when no enemies are about? That seems like Bronze level behaviour to me, Sport.
No data scientist would willingly include these biases in their model, but ML algorithms of the complexity required to represent something as elusive as player skill are generally black boxes that cannot explain their reasoning and can rarely be adequately constrained to avoid these biases.
This input bias is likely to make the convergence to the meta even more pronounced and the punishment for deviating from the meta even harsher. To paraphrase Goodhart’s law, as soon as you give a gamer a score they will try to increase it. I worry that the number that we get from TrueSkill 2 is how willing you are to treat a video game like factory work.
25 years of progress in online gaming to get to a lobby of human bots
I find that there’s an inherent tension at the heart of online gaming which I have never understood. The vocal part of the online gaming body absolutely hates the idea of playing against bots but at the same time they want their teammates and opponents to play like automatons. I remember playing NHL 9-something on the N64 against a CPU and thinking that it would be too complicated to try to make a computer play like a human and somehow secretly I was playing against people. Over the course of 25 years I’ve seen the industry connect players all around the world together and then whittle them down until they play exactly like those robots again.
I read a Guardian interview with Ronnie O’Sullivan, widely regarded as one of the greatest snooker players of all time, and this quote stuck out to me:
If you could edit your past, what would you change?
I would go back to when I was 14 and keep to my own style of playing snooker. I was perfect, but I didn’t think I was, so I started playing like everybody else and created bad habits. With a little more time, I could have been the ultimate player. I look at my career and I probably got 75% out of it instead of 100%.
I truly feel that the surveillance and nudging that has been baked into game development for the past ten years, which is proposed to be set in concrete in TrueSkill 2, means we finally give up the idea of online gaming as a sport. We are not encouraging users to find new ways to win or to experiment, but to do exactly as everyone else is doing and try to do it better than them. If 14 year old Ronnie O’Sullivan could not overcome the culture of snooker playing, how will Dota’s Ronnie O’Sullivan counteract an algorithm that simply won’t let them play against ‘good’ players until they play exactly like everyone else? At least a player of real life sports can go to a snooker hall and kick someone’s ass; TrueSkill 2 is the bodyguard at the door asking if you’ve read the patch notes yet.
Even ignoring that and focusing on people as crap at the game as me (and I am sorry if I am sounding like a broken record): it is a game, it’s not that serious, it is meant to be fun. I have played untold thousands of hours of online games and my lasting memories of those times are not the games where every user played exactly as they should and we had relatively balanced games, all of those times blur into one. The memories that stick out to me are:
In Rocket League, when a user called YOU DISGUSTING HOGS (name slightly changed to preserve their anonymity) spent the entire game ignoring the ball and expertly destroying/colliding with other cars and harassing everyone in text chat.
The Lucio (a typically highly mobile Overwatch hero) I played against who would skate directly into the middle of the battle, stand dead still and aim and shoot at people.
In Team Fortress Classic, where my mate was an avid player despite having a 2000s ThinkPad on which his best option was to play as the Demoman (a guy who throws multiple grenades at people) while using the nub, since his trackpad did not work.
Each of these players was playing in such an individual way that it made me laugh so much, and each of them impacted the games they played in, and therefore had a TrueSkill. If their play is unique and not represented well across the existing dataset, where do they live in TrueSkill 2? If the new Elo hell is full of people actually trying to have fun on online games in 2024 then send me to hell with them.
What alternatives do we have?
If I haven’t got it across well enough already, my opinion is that TrueSkill 2 is in fact a symptom of a wider problem across online gaming that has allowed metagaming and the work-ification of games to become the dominant (and in some cases only) way of playing. I think it is clear that a lot of people enjoy what we have at the moment and they are probably making a lot of money for the companies making these games.
I think it's also likely that there is a significant minority who feel like me, and that there could be a market for online games which actively work against meta-gamers: people who want to log on and play an online multiplayer game but don’t have the time or inclination to keep up to date with the meta, who are there just to have fun. If this is the case, I think there are some things that can be done.
1. Obscure the mechanics of the game as much as possible
In the book Leet Noobs, Mark Chen talks about how for a significant amount of time after WoW’s release, no-one knew how the Threat mechanic worked (Threat is an extremely important mechanic for Raiding). He takes us through the time since that point to the more modern day, where threat can be calculated to within float precision for every single player and is broadcast to these players raid-wide. Most players would tell you that it was more fun before World of Warcraft became a spreadsheet, but for each player, in an effort to not be left behind, it was either adapt or perish. This, in turn (as outlined in Battlefields of Negotiation), leads to the game developer changing aspects of the game to account for this new user behaviour.
Nowadays, with good intentions, online game developers are generally open with users about how the game plays. You can log on and see the patch notes of most online games and they will tell you exactly how much more or less damage each ability does. In sum, people are able to find out within minutes of a patch being released how they should adjust their play to fit to the new game.
I think it would be awesome if we could put this genie back into the bottle. To start with it would be great to be able to hide from users as long as possible how the game works, so they are forced to engage with the game itself to learn it and not the patch notes or source code. I would start by, wherever possible, hiding the mechanics of a game and not telling users when they changed.
However, users are good at reverse engineering the mechanics even when they aren’t told how they work. They can try to find out things from the client, or even run simulations in game to work things out. People do this now, and that genie is much harder to put back into the bottle. So using the exact same argument above, let's talk about:
2. Specifically make mechanics un-meta-able
It is almost impossible to make a game which is truly balanced. It is even harder to make a game which has interesting and diverse playstyles (such as the complex games discussed above) balanced. However what we can do, and what is often done, is to actually quantify this imbalance. Often after a patch, users will look at aggregate statistics on games played after this patch and see which playstyles (such as heroes in Overwatch or a MOBA) are winning more, and what their “pick rate” is. So we have numbers we can reverse engineer to work out how imbalanced things are.
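The aggregate statistics in question are not complicated. A minimal sketch, using invented match records rather than real data and one simple definition of pick rate (a hero's share of all player-games):

```python
from collections import defaultdict

def hero_stats(matches):
    """Return {hero: (pick_rate, win_rate)} from per-player match records.

    Each record is (hero, won) for one player in one game. Pick rate here is
    the hero's share of all records, one simple definition among several.
    """
    picks, wins = defaultdict(int), defaultdict(int)
    for hero, won in matches:
        picks[hero] += 1
        wins[hero] += int(won)
    total = len(matches)
    return {hero: (picks[hero] / total, wins[hero] / picks[hero]) for hero in picks}

# Made-up records, not real Overwatch data.
records = [("Pharah", True), ("Pharah", True), ("Pharah", False),
           ("Mei", True), ("Mei", False), ("Lucio", False)]
for hero, (pick_rate, win_rate) in sorted(hero_stats(records).items()):
    print(f"{hero}: picked {pick_rate:.0%}, won {win_rate:.0%}")
```

Anyone with access to enough match records can produce this table within minutes of a patch, which is exactly the problem.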
So for example if we have 39 characters in Overwatch and the best is 5% better than the worst, why doesn’t the server roll a dice at the start of every individual game and randomly add a power-up or debuff to each hero to make them 6% more or less powerful? It is unlikely that players will have enough time within a game to work out exactly which hero is the best, so it's possible they might for once just choose the one they want to play!
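A sketch of how small that server-side change could be. The function name, the hero list and the idea of a single "power multiplier" per hero are all hypothetical; a real game would have to decide what the multiplier actually scales.

```python
import random

def roll_match_modifiers(heroes, spread=0.06, seed=None):
    """Give every hero a hidden buff or debuff for one match.

    Each hero is independently made `spread` stronger or weaker. With a 6%
    swing and an assumed 5% gap between best and worst hero, a debuffed
    "best" hero ends up weaker than a buffed "worst" one.
    """
    rng = random.Random(seed)
    return {hero: rng.choice([1.0 - spread, 1.0 + spread]) for hero in heroes}

heroes = ["Mei", "Pharah", "Lucio", "Torbjorn"]  # a subset, for illustration
modifiers = roll_match_modifiers(heroes, seed=2024)
for hero, multiplier in modifiers.items():
    print(f"{hero}: x{multiplier:.2f}")
```

Passing a seed is only there to make the sketch repeatable; in a live game you would leave it unset so nobody can predict the roll.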
3. Let’s make some competitive online games which aren’t MOBAs
No matter how much the majority of people would like it to be the case, we are never going to get away from people being assholes on the internet. Online competitive games are often being designed with perfect play in mind. Then, when the general public starts playing them and someone gets put in a team with a player intent on trolling them, problems start.
If your game intends to have users join up with other users via online matchmaking, why not design a game which does not become a waste of time as soon as one player leaves or decides to grief their teammates? We could have a game as complex as a MOBA or Overwatch which had ten or more people per side (cf. TF2), which would allow people to take part in interesting gameplay but not be at the mercy of each one of their teammates.
4. Let people ban one another, or give each other a time out
One of the reasons that games are so profitable is that they can get a lot of players without much money needing to be spent on each user. For example Battlebit had ~3 million players despite having a dev team of three. In tech terms, this is what it means to be scalable.
This makes player moderation very unpalatable to gaming companies, as you typically need a human in the loop to judge which of two or more people deserves to be kicked off and for what reason, since communication is complex and context is important.
But here’s a really stupid idea. Why not let players decide who gets banned? The players of an online game care deeply about the quality of the game’s player base, and they are the most qualified to judge who is griefing in a given scenario. When a report comes in about a user smurfing or griefing their teammates, anonymize the in-game data and send it to four random players who weren’t in the game. If they agree that the person was being shit, ban them!
This does mean labour for the players, but you could pay them for it! Make it an optional endeavour and reward users with in-game currency if they do the job well. (It turns out that this is actually done already in CS:GO and DOTA 2.)
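The player-jury idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `pick_jurors` and `should_ban`, the player IDs, and the rule that a ban requires a unanimous verdict are all hypothetical, and it does not describe how CS:GO or DOTA 2 actually implement their systems.

```python
import random


def pick_jurors(player_pool, match_participants, jury_size=4, rng=None):
    """Pick random jurors who did not play in the reported match."""
    if rng is None:
        rng = random.Random()
    # Sort so that a seeded RNG gives a repeatable selection.
    eligible = sorted(set(player_pool) - set(match_participants))
    if len(eligible) < jury_size:
        raise ValueError("not enough uninvolved players to form a jury")
    return rng.sample(eligible, jury_size)


def should_ban(votes):
    """Assumed rule: ban only if every juror voted that it was griefing."""
    return len(votes) > 0 and all(votes)


# Hypothetical player IDs: 20 players online, 10 of them in the reported match.
pool = [f"player_{i}" for i in range(20)]
participants = pool[:10]
jurors = pick_jurors(pool, participants, rng=random.Random(7))
```

Requiring all four jurors to agree is the strictest possible rule; a majority vote would ban more griefers at the cost of more wrongful bans.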
5. Stop trying to make winning a game be entirely skill-based
This may seem counterintuitive, but I think it makes sense. Not every game can be chess, where a grandmaster will be beaten by Magnus Carlsen every time. It feels like a phenomenon specific to online video games to try and optimize them so a player feels that their chance of winning is entirely dictated by their play. Most board games or card games include an element of randomness such as card draw or dice, and all sports have a large amount of randomness due to physics and biology. The accuracy of a floating-point number and low-latency internet have given us dreams of making games where you are the only one to blame for your loss. But herein lies the problem: firstly it’s creating an unhealthy environment for players as they can only take the loss personally, and secondly it’s creating pressure between teammates to play well and to play exactly to the meta. Add some more dice rolls to the game, randomly screw over some players. It’s character building and it makes games more fun. Wouldn't it be great to beat Magnus Carlsen at chess? I have been loving the way that Blood Bowl implements this. You can be the best coach in Blood Bowl history and a newbie can murder your star player on the first dice roll of the game: it may not feel fair but it’s a much better story than “I was a better player in the game and I then won”.
Skill Issue?
It’s possible that I’m just not built to play online video games anymore. I’m 32, my body is falling apart and I’ve started clicking on news stories about interest rate changes. Perhaps being young and having no responsibilities is where you have to be in life to be able to enjoy this culture of playing games as work. But I can’t give up the idea that it didn't use to be like this. I used to log on to the Call of Duty 2 “CARENTAN S&D 24/7-EU” server and get a 0.175 kill-to-death ratio and no-one ever seemed to mind, least of all me. People were horrendous in text chat for as long as I can remember, but the focus was much less on forcing everyone to follow the meta and more on calling me a dickhead for having a colorful name or a myriad of other personal jibes not fit for recollection here. I once played an EU vs US game on a Half-Life 1 mod where 10 of us got together and did a home and away match where on each leg one of the two teams had 100-250 ping. We were free then. The data on every one of my missed shots was immediately lost to the heat of someone’s CPU. The future we’re looking at is having the data on our missed shots stored and attached to a number next to our name until either a GDPR request or massive solar flare wipes it out. If we go the way it seems we may, the data on every overextension I have as Mei will exist on this planet longer than I will, and the sum total of that will be Cardboard V.
If you made it this far, please let me know what you think! Thank you to A, C, C, M, P, R, R & Z for the helpful feedback on the mess that was the first draft of this.
I think another big big hole in modern competitive gaming is the lack of a "pickup" game that you can wander into when the competitive scene is too trying. As you mentioned, self-hosted servers with silly rules and communities of people who know each other used to fill that role on TF2, the early CoDs, etc.
But for me a bigger missing component of modern games is mods. Mods, being disavowed from the core game, carry no responsibility for perfect balance or matchmaking and don't give players the same high stakes incentive for maximizing ELO. They also let the modders design things that don't have to accelerate a profit loop, and therefore expand design potential.
Most of the games in your article used to be mods themselves, and it made me think of how, irl, the stoners and fuckups at ultimate frisbee games that got too intense would wander off to play spikeball, which is now a proper sport unto itself. Maybe a modding community -- or, abstractly, constant goofy offshoots of formalized rules -- is a sort of necessary component of a healthy competitive ecosystem.