Ratings drift

I decided to start a new thread spun off from a different thread in the USCF forum. http://main.uschess.org/forums/viewtopic.php?f=5&t=17724

I wonder how much “drift” there’s been in the last, say, 30 years.

I bet a 1650 player from today would easily outplay a 1650 from the early 80’s. A lot of that is due to the ease of learning chess, and the endless free tactical problems you get on the internet.

-When I say ease of learning chess, I mean the varied ways to do so: the internet, Fritztrainers and other software, even Chessmaster, etc. Not to mention good old-fashioned coaches/teachers and books. You can even get private lessons through various chess servers like ICS or FICS.

Probably so. And yet there seems to be inflation at the top.

I suggest people read the reports of the Ratings Committee for the past decade, especially their ongoing project to ‘reflate’ ratings to 1997 levels.

See glicko.net/ratings.html

The age group 35-45 tends to have stable ratings, so that age group is a good one to “peg” the ratings system to over time.

Except that the 35-45 age group, ten years later, is now the 45-55 age group, so what are you comparing?

Bill Smythe

/SNAP

:laughing:

That means it is a ‘rolling’ group, so each year the now-46 players drop out of the group and the previously-34 players are added in.

Multiple factors evidently at work. It’s so easy to drift from “rating” to “strength” in these discussions. I think an 1800 player today is about as strong as a 1950 player from the early '70s. At the international level, there are way more players, which means the pyramid gets taller. At the domestic level, I’d guess the influx of strong players from Eastern Europe / Former Soviet Union has something to do with higher ratings at the top.

I know the shape of the ratings curve has changed because of the scholastic influx the last few years. Has the shape also changed in the place where the very strong players live?

Recently, I went through some of my games from my peak rating (2110) period in the late 80’s. I had no concept of opening theory or strategic planning. I made random, senseless, aimless moves until something happened…and those were my wins! Sometimes, I’d accidentally back into a winning endgame or receive a gift of a cheap tactical shot. Now I’m about 2020, but my understanding and opening knowledge is light-years ahead of where it was 25 years ago, mostly due to computers and the internet (and some hard work). I’d be curious to see if anyone else went through their games from a few decades ago and found the same thing.

Yup. My experience exactly.

I played a lot from about 1966 to 1975. My USCF rating peaked at about 2080, and my Northwest rating (same formula, different set of events) hung around 2130 to 2150 for several years, peaked at 2203. I had a few bad events as my interest faded, dropped into “A” after a couple of bad matches, and I took 28 years off. I was very confident that I could get the rating back up if I wanted to. But…

I’ve worked at it since I started playing again: PlayChess, Fritz, books, videos galore, even some lessons. I make fewer blunders, but my USCF rating hangs between 1985 and 2005, my FIDE at about 2020. With some exceptions, my games are generally better, my rating lower, my opponents better. Now, age-related cognitive deterioration could be playing a part, but I don’t think so. Endurance is lower, of course, and I’ve discovered my performance rating is much better in game-a-day events.

What about this: somebody has a rating below 1400 and hasn’t been playing in tournaments for a while, but plays chess as if their rating is supposed to be above 2100. Wouldn’t that destroy the ratings of everyone above 2100 when the so-called 1400-rated player defeats them?

That’s not really ratings drift.

If he truly plays like a 2100 player, his rating will catch up with his strength fairly quickly, and step 5 of the ratings formula will somewhat reduce the number of points that his opponents drop until his rating is more in line with his strength.

There are always going to be players whose ratings do not reflect their current strength, for a variety of reasons.

Mark Glickman has been looking into other approaches, such as the Glicko2 system, which would accelerate the rate at which an apparently inaccurate rating is brought more in line with current playing strength.

When we have acceptable outside evidence of that increased strength, we can adjust the player’s rating. But at this time the only outside evidence we accept is a current FIDE rating. The USCF does not have enough statistical data to develop conversion formulas for other rating systems.

Interesting. Let’s use 1200 for my example, where a good but very underrated scholastic player (say, rated 500) first plays in an Open or Reserve section. I understand that he will gain a bunch of rating points, but how will his strong opponents not lose as many points as they otherwise would? It seems to me that his new rating will be used in the second pass (is this step 5?), but if one of his opponents wins all his games, or gives up only one draw, that opponent may still lose rating points, since I believe his opponents’ ratings are averaged in the formula.

Due to the USCF’s multi-stage ratings formula, the opponents of someone who has a very strong performance will lose fewer points, because we use the intermediate ratings from step 4 to determine the final ratings in step 5.

Consider the 1400 player. To simplify the math, let’s say it is a 4-round event in which he plays four 2100-rated players, drawing all four games.

A 1400 player has a K of around 35.75.

2100-1400 = 700 points ratings difference.

The expected score of a player facing a 700-point ratings deficit can be computed using the formula in the ratings system as 0.01747.

Over 4 games, the total expected score would be 0.06988.

The player’s estimated ratings gain would be (2 - 0.06988) * 35.75, or around 69 points. But that doesn’t include a bonus. There would be 53 bonus points, for a total estimated ratings gain of 122 points, giving the 1400 player an intermediate rating of 1522.
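To make the arithmetic easy to verify, here’s a quick Python sketch of that step-4 calculation. The K of 35.75 comes from the example above, and the expected-score formula is the standard logistic winning-expectancy curve; the bonus threshold of 8 (times the square root of the number of rounds) is my assumption, chosen because it reproduces the 53 bonus points quoted — the actual USCF bonus parameters have changed over the years.

```python
import math

# Sketch of the step-4 intermediate-rating arithmetic described above.
# K = 35.75 is taken from the example; the bonus threshold of 8 per
# sqrt(rounds) is an ASSUMPTION that reproduces the quoted 53 points.

def expected_score(own, opp):
    """Standard logistic expected score for a single game."""
    return 1.0 / (1.0 + 10.0 ** ((opp - own) / 400.0))

K = 35.75
BONUS_THRESHOLD = 8.0                       # assumed, see note above
opponents = [2100, 2100, 2100, 2100]
score = 4 * 0.5                             # four draws

exp_total = sum(expected_score(1400, o) for o in opponents)   # ~0.0699
gain = K * (score - exp_total)                                # ~69 points
bonus = max(0.0, gain - BONUS_THRESHOLD * math.sqrt(len(opponents)))
print(round(gain), round(bonus), 1400 + round(gain + bonus))  # 69 53 1522
```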

We use 1522 as that player’s rating when computing the new ratings of his opponents, so that should reduce their ratings drop a little, probably a point or two. (Similarly, we use his opponents’ estimated new ratings when computing the new rating for our 1400 player, so his actual ratings gain might be a point or two less than 122 points.)

The attenuation that results from the USCF’s multi-stage approach is going to be small with a 700-point ratings differential; it will be more pronounced with smaller ratings differentials.

Consider a 1400 player defeating four 1700 players. The expected performance with a 300-point ratings differential is 0.15098, so our 1400 player’s ratings gain would be (4 - 0.60392) * 35.75, or 121 points, plus a bonus of around 105 points, for an intermediate rating of around 1626.

The expected score of a 1700 player (K=26.04) against a 1400 player is 0.84902, but the expected score of a 1700 player against a 1626 player is 0.60491, so you can see that the 1700 players should lose about 6 fewer points when we use the intermediate rating of 1626 rather than the player’s pre-event rating of 1400.
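Continuing the sketch from earlier, the attenuation step for one of those 1700 players looks like this (K = 26.04 taken from the numbers above):

```python
# Each 1700 player lost his game to the improving player. Compare the
# points lost when the opponent's pre-event rating (1400) is used
# versus the step-4 intermediate rating (1626).

def expected_score(own, opp):
    return 1.0 / (1.0 + 10.0 ** ((opp - own) / 400.0))

K_1700 = 26.04   # from the example above

loss_vs_1400 = K_1700 * (0 - expected_score(1700, 1400))  # ~ -22.1
loss_vs_1626 = K_1700 * (0 - expected_score(1700, 1626))  # ~ -15.8
print(round(loss_vs_1400, 1), round(loss_vs_1626, 1))
print(round(loss_vs_1626 - loss_vs_1400, 1))   # ~6.4 fewer points lost
```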

However, this is still not about ‘ratings drift’, but moving the past few posts to a different thread seems like busywork.

Thanks for this terrific explanation. In another topic a few months back, someone found the statement that a player who wins a section cannot lose rating points. This was before the current system with variable K and bonus points was installed. I understand that this feature was not incorporated into the current system.
Shouldn’t there be some provision for this in the current formula, or some provision to treat provisional ratings (fewer than 25 games played) differently when they cause unfair rating results for opponents?

The purported rule that a player who wins his event cannot lose rating points has not been part of the USCF rating system in any version of the ratings system formula that I have documentation for, and I have them back to the early 1980’s.

A rating is considered ‘provisional’ by the rating system formula, meaning we use the ‘special ratings formula’ (which is an extension of the old 400 point rule for performance ratings), only if the player has 8 or fewer games, or has won all his games in all his events, or has lost all his games in all his events.
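For anyone unfamiliar with the old 400-point rule mentioned above, here is the textbook version as a quick sketch. This is an illustration of the underlying idea only, not the USCF’s actual iterative special-formula algorithm:

```python
# The classic 400-point performance-rating rule that the 'special
# formula' extends: average opponent rating plus 400 points per net
# win, spread over the games played. Illustration only -- NOT the
# USCF's actual iterative special-rating algorithm.

def performance_rating_400(opp_ratings, wins, losses):
    n = len(opp_ratings)
    return sum(opp_ratings) / n + 400.0 * (wins - losses) / n

# e.g. 3 wins and 1 loss against four 1500-rated opponents:
print(performance_rating_400([1500, 1500, 1500, 1500], wins=3, losses=1))
# -> 1700.0
```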

Using 26 games as the point at which a player’s published rating is coded as established rather than provisional has nothing to do with how ratings are computed.

I do not speak for the Ratings Committee (I am their office liaison, along with Walter Brown, but not a committee member, nor am I included in their internal deliberations), and the decision as to where to switch from the special formula to the regular formula was made over 10 years ago.

Can you point to a specific ‘unfair’ example? I’ve looked at hundreds of situations over the past 8 years; the ones I considered problematic enough to pass on to the committee chair for comment usually involved players who won all their games.

As I posted in #177235 on Thu Jan 14, 2010:

Chess Life, November 1981, p 6:
US Open Meetings
Ratings, Sustaining Memberships Among Issues
“In addition, it was decided that the clear, untied winner of a tournament of individual players should not lose rating points providing the tournament includes at least eight players. Anyone losing rating points in such a situation should request an adjustment from the ratings specialist in the New Windsor office.”
“These changes will be implemented by the office as soon as possible, probably before the end of the year.”

Was that a Delegate action? I think the Delegates approved the new formulas in 2000 which did not include that rule. (Whether it was ever implemented is a separate question, I don’t have detailed knowledge of what was done prior to 2003.)

IMHO it is a bad rule, one that violates the mathematical justification for having ratings.

(And this still isn’t about ‘ratings drift’.)

An interesting feature of the Glicko2 system is that the uncertainty of the opponent’s rating affects how many rating points a player will gain or lose. In the Glicko2 system, each player has not only a rating but also a measure of uncertainty (“SD”) of that rating. Players start with a fairly high SD. As they play more rated games, SD decreases. However, if they are inactive, over time, SD increases. The gain/loss of points depends on both players’ SD. If a player with a high SD (a more uncertain rating) faces an opponent with a low SD (a more certain rating), that player will gain or lose more rating points. Conversely, if a player with a low SD (a more certain rating) faces an opponent with a high SD, the player gains or loses fewer rating points. So, a long-inactive player whose playing strength has increased over the period of inactivity (hence, a high SD) will see a significant gain in his rating, while the opponents (with a low SD) will see a much smaller decrease in rating.
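To make that mechanism concrete, here is a minimal one-game sketch using Mark Glickman’s published Glicko-1 formulas. Glicko-2 adds a volatility term, omitted here, but the SD/RD effect described above works the same way; the specific ratings and deviations below are made-up illustration values.

```python
import math

# One-game rating update using the published Glicko-1 formulas.
# (Glicko-2 adds a volatility term, omitted here.) The ratings and
# RDs below are made-up example values.

Q = math.log(10) / 400.0

def g(rd):
    # Attenuation factor: the more uncertain the opponent's rating,
    # the less weight the game result carries.
    return 1.0 / math.sqrt(1.0 + 3.0 * (Q * rd) ** 2 / math.pi ** 2)

def glicko_update(rating, rd, opp_rating, opp_rd, score):
    """New rating after one game (score: 1 = win, 0.5 = draw, 0 = loss)."""
    e = 1.0 / (1.0 + 10.0 ** (-g(opp_rd) * (rating - opp_rating) / 400.0))
    d2 = 1.0 / (Q ** 2 * g(opp_rd) ** 2 * e * (1.0 - e))
    return rating + Q / (1.0 / rd ** 2 + 1.0 / d2) * g(opp_rd) * (score - e)

# A returning player with an uncertain 1400 (RD 350) beats an
# established 2100 (RD 50): the uncertain rating jumps dramatically...
print(round(glicko_update(1400, 350, 2100, 50, 1.0)))   # large gain
# ...while the established player's rating barely moves from the loss:
print(round(glicko_update(2100, 50, 1400, 350, 0.0)))   # drops only ~9 points
```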

One of the ideas Mark Glickman and I have discussed is the idea of additional bonuses for superior performance over multiple events in a short time span. (For example, someone who plays in a major event plus one or more side events and does very well in all of them.)

How easy this would be to define and implement is a potential sticking point.

I do not know if Mark has discussed this idea with the Ratings Committee.

We’ve also talked (briefly) about the issue of bonus points in long-running club events that are broken up into several smaller events for reporting purposes. When those smaller events have fewer than 3 rounds, bonus points are not possible. As with the previous concept, defining and implementing this could be challenging.

Personally, I would like to see a reworking of the procedures regarding players who win all their games at their first few events. This has in several cases resulted in ratings that are obviously too high based on subsequent events, and peak ratings floors can make it difficult to bring those ratings down to mathematically justifiable levels.