Ratings drift

I decided to start a new thread spun off from a different thread in the USCF forum. http://main.uschess.org/forums/viewtopic.php?f=5&t=17724

I wonder how much “drift” there’s been in the last, say, 30 years.

I bet a 1650 player from today would easily outplay a 1650 from the early 80’s. A lot of that is due to the ease of learning chess, and the endless free tactical problems you get on the internet.

-When I say ease of learning chess, I mean the varied ways to do so: the internet, Fritztrainers and other software, even Chessmaster, etc. Not to mention good old-fashioned coaches/teachers and books. You can even get private lessons through various chess servers like ICS or FICS.

Probably so. And yet there seems to be inflation at the top.

I suggest people read the reports of the Ratings Committee for the past decade, especially their ongoing project to ‘reflate’ ratings to 1997 levels.

See glicko.net/ratings.html

The age group 35-45 tends to have stable ratings, so that age group is a good one to “peg” the ratings system to over time.

Except that the 35-45 age group, ten years later, is now the 45-55 age group, so what are you comparing?

Bill Smythe

/SNAP

:laughing:

That means it is a ‘rolling’ group, so each year the now-46 players drop out of the group and the previously-34 players are added in.

Multiple factors evidently at work. It’s so easy to drift from “rating” to “strength” in these discussions. I think an 1800 player today is about as strong as a 1950 player from the early '70s. At the international level, there are way more players, which means the pyramid gets taller. At the domestic level, I’d guess the influx of strong players from Eastern Europe / Former Soviet Union has something to do with higher ratings at the top.

I know the shape of the ratings curve has changed because of the scholastic influx the last few years. Has the shape also changed in the place where the very strong players live?

Recently, I went through some of my games from my peak rating (2110) period in the late 80’s. I had no concept of opening theory or strategic planning. I made random, senseless, aimless moves until something happened…and those were my wins! Sometimes, I’d accidentally back into a winning endgame or receive a gift of a cheap tactical shot. Now I’m about 2020, but my understanding and opening knowledge is light-years ahead of where it was 25 years ago, mostly due to computers and the internet (and some hard work). I’d be curious to see if anyone else went through their games from a few decades ago and found the same thing.

Yup. My experience exactly.

I played a lot from about 1966 to 1975. My USCF rating peaked at about 2080, and my Northwest rating (same formula, different set of events) hung around 2130 to 2150 for several years, peaked at 2203. I had a few bad events as my interest faded, dropped into “A” after a couple of bad matches, and I took 28 years off. I was very confident that I could get the rating back up if I wanted to. But…

I’ve worked at it since I started playing again: PlayChess, Fritz, books, videos galore, even some lessons. I make fewer blunders, but my USCF rating hangs between 1985 and 2005, my FIDE at about 2020. With some exceptions, my games are generally better, my rating lower, my opponents better. Now, age-related cognitive deterioration could be playing a part, but I don’t think so. Endurance is lower, of course, and I’ve discovered my performance rating is much better in game-a-day events.

What about this: somebody has a rating below 1400 and hasn’t been playing in tournaments for a while, but plays chess as if their rating is supposed to be above 2100. Wouldn’t that destroy the ratings of everyone above 2100 when the so-called 1400-rated player defeats them?

That’s not really ratings drift.

If he truly plays like a 2100 player, his rating will catch up with his strength fairly quickly, and step 5 of the ratings formula will somewhat reduce the number of points that his opponents drop until his rating is more in line with his strength.

There are always going to be players whose ratings do not reflect their current strength, for a variety of reasons.

Mark Glickman has been looking into other approaches, such as the Glicko2 system, which would accelerate the rate at which an apparently inaccurate rating is brought more in line with current playing strength.

When we have acceptable outside evidence of that increased strength, we can adjust the player’s rating. But at this time the only outside evidence we accept is a current FIDE rating. The USCF does not have enough statistical data to develop conversion formulas for other rating systems.

Interesting. Let’s use 1200 for my example, where a good but very underrated scholastic player (say, rated 500) first plays in an Open or Reserve section. I understand that he will gain a bunch of rating points, but how will his strong opponents not lose as many points as they otherwise would? It seems to me that his new rating will be used in the second pass (is this step 5?), but if one of his opponents wins all his games, or gives up only one draw, that opponent may still lose rating points, since I believe his opponents’ ratings are averaged in the formula.

Due to the USCF’s multi-stage ratings formula, the opponents of someone who has a very strong performance will lose fewer points, because we use the intermediate ratings from step 4 to determine the final ratings in step 5.

Consider the 1400 player. To simplify the math, let’s say it is a 4-round event in which he plays four 2100-rated players, drawing all four games.

A 1400 player has a K of around 35.75.

2100-1400 = 700 points ratings difference.

The expected score of a player facing a 700-point ratings deficit can be computed using the formula in the ratings system as 0.01747.

Over 4 games, the total expected score would be 0.06988.

The player’s estimated ratings gain would be (2 - 0.06988) * 35.75, or around 69 points. But that doesn’t include a bonus. There would be 53 bonus points, for a total estimated ratings gain of 122 points, giving the 1400 player an intermediate rating of 1522.
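To make the arithmetic easy to verify, here’s a quick Python sketch of that step-4 calculation. The K of 35.75 comes from the example above, and the expected-score formula is the standard logistic winning-expectancy curve; the bonus threshold of 8 (times the square root of the number of rounds) is my assumption, chosen because it reproduces the 53 bonus points quoted — the actual USCF bonus parameters have changed over the years.

```python
import math

# Sketch of the step-4 intermediate-rating arithmetic described above.
# K = 35.75 is taken from the example; the bonus threshold of 8 per
# sqrt(rounds) is an ASSUMPTION that reproduces the quoted 53 points.

def expected_score(own, opp):
    """Standard logistic expected score for a single game."""
    return 1.0 / (1.0 + 10.0 ** ((opp - own) / 400.0))

K = 35.75
BONUS_THRESHOLD = 8.0                       # assumed, see note above
opponents = [2100, 2100, 2100, 2100]
score = 4 * 0.5                             # four draws

exp_total = sum(expected_score(1400, o) for o in opponents)   # ~0.0699
gain = K * (score - exp_total)                                # ~69 points
bonus = max(0.0, gain - BONUS_THRESHOLD * math.sqrt(len(opponents)))
print(round(gain), round(bonus), 1400 + round(gain + bonus))  # 69 53 1522
```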

We use 1522 as that player’s rating when computing the new ratings of his opponents, so that should reduce their ratings drop a little, probably a point or two. (Similarly, we use his opponents’ estimated new ratings when computing the new rating for our 1400 player, so his actual ratings gain might be a point or two less than 122 points.)

The attenuation that results from the USCF’s multi-stage approach is going to be small with a 700-point ratings differential; it will be more pronounced with smaller ratings differentials.

Consider a 1400 player defeating four 1700 players. The expected performance with a 300-point ratings differential is 0.15098, so our 1400 player’s ratings gain would be (4 - 0.60392) * 35.75, or 121 points, plus a bonus of around 105 points, for an intermediate rating of around 1626.

The expected score of a 1700 player (K=26.04) against a 1400 player is 0.84902, but the expected score of a 1700 player against a 1626 player is 0.60491, so you can see that the 1700 players should lose about 6 fewer points when we use the intermediate rating of 1626 rather than the player’s pre-event rating of 1400.
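Continuing the sketch from earlier, the attenuation step for one of those 1700 players looks like this (K = 26.04 taken from the numbers above):

```python
# Each 1700 player lost his game to the improving player. Compare the
# points lost when the opponent's pre-event rating (1400) is used
# versus the step-4 intermediate rating (1626).

def expected_score(own, opp):
    return 1.0 / (1.0 + 10.0 ** ((opp - own) / 400.0))

K_1700 = 26.04   # from the example above

loss_vs_1400 = K_1700 * (0 - expected_score(1700, 1400))  # ~ -22.1
loss_vs_1626 = K_1700 * (0 - expected_score(1700, 1626))  # ~ -15.8
print(round(loss_vs_1400, 1), round(loss_vs_1626, 1))
print(round(loss_vs_1626 - loss_vs_1400, 1))   # ~6.4 fewer points lost
```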

However, this is still not about ‘ratings drift’, but moving the past few posts to a different thread seems like busywork.

Thanks for this terrific explanation. In another topic a few months back, someone found the statement that a player who wins a section cannot lose rating points. This was before the current system with variable K and bonus points was installed. I understand that this feature was not incorporated into the current system.
Shouldn’t there be some provision for this in the current formula, or some provision to treat provisional ratings (fewer than 25 games played) differently when they cause unfair rating results for opponents?

The purported rule that a player who wins his event cannot lose rating points has not been part of the USCF rating system in any version of the ratings system formula that I have documentation for, and I have them back to the early 1980’s.

A rating is considered ‘provisional’ by the rating system formula, meaning we use the ‘special ratings formula’ (which is an extension of the old 400 point rule for performance ratings), only if the player has 8 or fewer games, or has won all his games in all his events, or has lost all his games in all his events.
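For anyone unfamiliar with the old 400-point rule mentioned above, here is the textbook version as a quick sketch. This is an illustration of the underlying idea only, not the USCF’s actual iterative special-formula algorithm:

```python
# The classic 400-point performance-rating rule that the 'special
# formula' extends: average opponent rating plus 400 points per net
# win, spread over the games played. Illustration only -- NOT the
# USCF's actual iterative special-rating algorithm.

def performance_rating_400(opp_ratings, wins, losses):
    n = len(opp_ratings)
    return sum(opp_ratings) / n + 400.0 * (wins - losses) / n

# e.g. 3 wins and 1 loss against four 1500-rated opponents:
print(performance_rating_400([1500, 1500, 1500, 1500], wins=3, losses=1))
# -> 1700.0
```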

Using 26 games as the point at which a player’s published rating is coded as established rather than provisional has nothing to do with how ratings are computed.

I do not speak for the Ratings Committee (I am their office liaison, along with Walter Brown, but not a committee member, nor am I included in their internal deliberations), and the decision as to where to switch from the special formula to the regular formula was made over 10 years ago.

Can you point to a specific ‘unfair’ example? I’ve looked at hundreds of situations over the past 8 years; the ones I considered problematic enough to pass on to the committee chair for comment usually involved players who won all their games.

As I posted in #177235 on Thu Jan 14, 2010:

Chess Life, November 1981, p 6:
US Open Meetings
Ratings, Sustaining Memberships Among Issues
“In addition, it was decided that the clear, untied winner of a tournament of individual players should not lose rating points providing the tournament includes at least eight players. Anyone losing rating points in such a situation should request an adjustment from the ratings specialist in the New Windsor office.”
“These changes will be implemented by the office as soon as possible, probably before the end of the year.”

Was that a Delegate action? I think the Delegates approved the new formulas in 2000 which did not include that rule. (Whether it was ever implemented is a separate question, I don’t have detailed knowledge of what was done prior to 2003.)

IMHO it is a bad rule, one that violates the mathematical justification for having ratings.

(And this still isn’t about ‘ratings drift’.)

An interesting feature of the Glicko2 system is that the uncertainty of the opponent’s rating affects how many rating points a player will gain or lose. In the Glicko2 system, each player has not only a rating but also a measure of uncertainty (“SD”) of that rating. Players start with a fairly high SD. As they play more rated games, SD decreases. However, if they are inactive, over time, SD increases. The gain/loss of points depends on both players’ SD. If a player with a high SD (a more uncertain rating) faces an opponent with a low SD (a more certain rating), that player will gain or lose more rating points. Conversely, if a player with a low SD (a more certain rating) faces an opponent with a high SD, the player gains or loses fewer rating points. So, a long-inactive player whose playing strength has increased over the period of inactivity (hence, a high SD) will see a significant gain in his rating, while the opponents (with a low SD) will see a much smaller decrease in rating.
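To make that mechanism concrete, here is a minimal one-game sketch using Mark Glickman’s published Glicko-1 formulas. Glicko-2 adds a volatility term, omitted here, but the SD/RD effect described above works the same way; the specific ratings and deviations below are made-up illustration values.

```python
import math

# One-game rating update using the published Glicko-1 formulas.
# (Glicko-2 adds a volatility term, omitted here.) The ratings and
# RDs below are made-up example values.

Q = math.log(10) / 400.0

def g(rd):
    # Attenuation factor: the more uncertain the opponent's rating,
    # the less weight the game result carries.
    return 1.0 / math.sqrt(1.0 + 3.0 * (Q * rd) ** 2 / math.pi ** 2)

def glicko_update(rating, rd, opp_rating, opp_rd, score):
    """New rating after one game (score: 1 = win, 0.5 = draw, 0 = loss)."""
    e = 1.0 / (1.0 + 10.0 ** (-g(opp_rd) * (rating - opp_rating) / 400.0))
    d2 = 1.0 / (Q ** 2 * g(opp_rd) ** 2 * e * (1.0 - e))
    return rating + Q / (1.0 / rd ** 2 + 1.0 / d2) * g(opp_rd) * (score - e)

# A returning player with an uncertain 1400 (RD 350) beats an
# established 2100 (RD 50): the uncertain rating jumps dramatically...
print(round(glicko_update(1400, 350, 2100, 50, 1.0)))   # large gain
# ...while the established player's rating barely moves from the loss:
print(round(glicko_update(2100, 50, 1400, 350, 0.0)))   # drops only ~9 points
```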

One of the ideas Mark Glickman and I have discussed is the idea of additional bonuses for superior performance over multiple events in a short time span. (For example, someone who plays in a major event plus one or more side events and does very well in all of them.)

How easy this would be to define and implement is a potential sticking point.

I do not know if Mark has discussed this idea with the Ratings Committee.

We’ve also talked (briefly) about the issue of bonus points in long-running club events that are broken up into several smaller events for reporting purposes. When those smaller events have fewer than 3 rounds, bonus points are not possible. As with the previous concept, defining and implementing this could be challenging.

Personally, I would like to see a reworking of the procedures regarding players who win all their games at their first few events. This has in several cases resulted in ratings that are obviously too high based on subsequent events, and peak ratings floors can make it difficult to bring those ratings down to mathematically justifiable levels.