Rating inflation?

I don’t think there have been any plans or proposals over the last 7 years to simply add or subtract ratings points in bulk.

I asked Ken what he meant by recent as I was not aware there had been recent changes in the ratings formula. He has told me it was 2013.

I believe that the intent of the current rating system is to get you to your current strength as quickly as possible, to prevent deflation?

I have also seen 2300-2400 players return from the ’70s and settle around 2150. Is it age, style, or ability to change? Also, I believe it amount of info avail

To quote from Arpad Elo’s book:

Individuals with not very many games, or whose strength is changing (usually improving), will be less reliably ranked. There’s not much that can be done for a player with only a handful of games: statistical theory is based on having a sufficiently large sample, and with a small sample reliability is not assured. For players whose strength is changing (usually improving), there are factors in the current system that can move those players towards more reliable ratings, e.g., the bonus factor and the factors that contribute to a player’s K.

My personal observation, after having worked with the US Chess ratings system for about 15 years, is that it takes about 25-30 games for someone’s rating to catch up to their current strength. The challenge with rapidly improving players is that their strength keeps changing, so you’re chasing a moving target.

The changes that were made to the ratings formula back in 2013 appeared to result in overshooting someone’s new strength (which would result in over-rating an unusually strong performance for established players), which would be somewhat inflationary. Lowering the bonus factor appears to be bringing that under control.

I’ve told this story before, but some years back there was a young player who was clearly about 1300 strength in December, based on his play in one of my tournaments. But by late February he was drawing and occasionally defeating A players and experts, though his rating did not yet reflect that. He went on to finish in the top 10 at the National Elementary in April. By December he had an expert rating and the following spring he was one of the six competitors in the Nebraska Closed Championship. (I don’t think he won that year, though I think he did the following year, when the average rating in the event was around 2176.)

The ratings calculator seems to indicate that K is also related to the length of the tournament (going down as the event gets longer). Thus K is lower in a good 9-round tournament that includes a round-one win over a player more than 650 points below you than in the same tournament with only eight rateable rounds after a forfeit win over that same opponent. The 9-0 result can give less of a rating boost than the 8-0 result that doesn’t include the forfeit win. Since you cannot rate a game for only one of the players, it doesn’t look like there is any legitimate change to make to the process.

Also, it would be uncommon for a tournament to be long enough and the K change large enough for the difference to be particularly noticeable. Even using an 8-0 result (instead of 9-0) would only raise a rating by maybe as much as 8 points (before any bonus, so 16 afterwards), and those types of results would probably occur extremely rarely.

K is based on both effective games and the number of completed games in the current event.

While K goes down as the number of completed games goes up, the potential for a ratings change often goes up because the maximum actual score goes up more than the expected score.
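Here is a rough sketch of that trade-off. It assumes the commonly published form K = 800 / (N' + m), where N' is effective games and m is the number of games in the event; that formula, the 30 effective games, and the 0.6 expected score per game are my assumptions for illustration, not figures from this thread, and the real formula has adjustments this ignores.

```python
def k_factor(effective_games: float, event_games: int) -> float:
    """Assumed simplified form: K = 800 / (N' + m)."""
    return 800.0 / (effective_games + event_games)


def max_gain(effective_games: float, event_games: int,
             expected_per_game: float) -> float:
    """Pre-bonus rating gain for a perfect score: K * (S - E)."""
    k = k_factor(effective_games, event_games)
    expected_total = expected_per_game * event_games
    return k * (event_games - expected_total)


# Hypothetical player: 30 effective games, expected to score 0.6 per game.
for m in (4, 6, 8, 9):
    print(m, round(k_factor(30, m), 2), round(max_gain(30, m, 0.6), 1))
```

Under those assumptions K falls from about 21.1 at 8 games to about 20.5 at 9, while the largest possible pre-bonus gain still grows with the length of the event.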

Suppose K drops from 21 (8 rated games, with the first game a forfeit win) to 20 (9 rated games, where the extra game is a win against an expected score of 0.975). The 0.025 by which you exceeded the expected score in that extra game gains you noticeably less than the extra 1 point of K would have gained you over the remaining 8 games if that first game had been a forfeit win.
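To put numbers on that comparison, here is a small sketch using the K values of 21 and 20 and the 0.975 expected score from the example. The total expected score of 5.0 across the other eight games is a made-up figure, and the change is taken as simply K * (score - expected), before any bonus.

```python
def rating_change(k: float, score: float, expected: float) -> float:
    """Pre-bonus rating change: K * (actual score - expected score)."""
    return k * (score - expected)


# Hypothetical: total expected score over the other eight games.
EXPECTED_OTHER_8 = 5.0

# Round one is a forfeit win: 8 rated games, K = 21.
forfeit_case = rating_change(21, 8.0, EXPECTED_OTHER_8)

# Round one is played and won: 9 rated games, K = 20,
# and the extra game adds 0.975 to the expected score.
played_case = rating_change(20, 9.0, EXPECTED_OTHER_8 + 0.975)

print(forfeit_case, played_case, forfeit_case - played_case)
```

With those inputs the forfeit case gains about 63 points and the played case about 60.5. More generally, with these two K values the gap works out to (8 - E) - 0.5, where E is the expected score over the other eight games, so it shrinks as the rest of the field gets weaker.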

This would be an unusual situation and not something you’d normally need to worry about. On top of that, if you did worry about it, one of the likely ways to try to handle it (skipping that outlier game for that person) would run into the brick wall of requiring both players in a game to have a rated result from that game (and I don’t think we even want to begin considering a change in that).

It is, interestingly enough, an outlier that helps reduce rating inflation from unusually good results.

In order to have an expected winning percentage of .975, you need to be rated about 650 points higher than your opponent. I wouldn’t expect such a win to improve your rating much under any circumstances, so it doesn’t particularly bother me that the gain from actual - expected is less than the drop caused by the decrease in K.
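For anyone who wants to check the relationship between rating gap and expected score, here is a minimal sketch using the standard Elo logistic curve on a 400-point scale. I am assuming that curve applies here; the function names are just for illustration.

```python
import math


def expected_score(rating_diff: float) -> float:
    """Expected score for the player who is rating_diff points higher."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def rating_diff_for(expected: float) -> float:
    """Inverse: rating gap that yields a given expected score (0 < expected < 1)."""
    return 400.0 * math.log10(expected / (1.0 - expected))


print(round(expected_score(650), 3))
print(round(rating_diff_for(0.975)))
```

On that curve a 650-point gap gives an expected score a little above 0.975, and 0.975 itself corresponds to a gap in the 630s, so "about 650" is a fair round figure.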

I prefer a Special K. It’s great for breakfast. :laughing: :unamused:

And so has the idea of crashing ratings to satisfy the statistical purists. No, pristine “accuracy” should NOT be the key goal, for in the pursuit of it hundreds of seniors quit when their ratings were slashed. I, too, have the advantage of witnessing history. The point is that “balance” is the key.

Rob Jones

Rob,
Balance is the goal of the rating system. The method for measuring inflation/deflation is explained in the Delegate’s Call. Any recent changes are very small.
Mike

The system has been inflating a bit since the 2013 formula changes, which is why the bonus factor has been increased several times since then.

Numbers and data cannot have a goal. People establish goals based on an itinerary and their personal beliefs.

Rob Jones