Rating inflation?

Hello,

A gentleman at my local chess club who recently returned to rated play after a 40+ year hiatus believes that USCF ratings have inflated in the decades since he was last active, around 1970.

I’m wondering whether any old-timers here can provide some perspective on whether this conjecture is correct. If rating inflation has occurred going back that far, are there any estimates of how much, and what caused it?

I am aware that the ratings committee takes great care to ensure that ratings remain steady over time, mainly via adjustments to the bonus point formula as needed. But any feedback would be appreciated.

Thanks!

I started playing rated chess in the late 1960s.

Personally, I think the young 1500 and 1600 players of today are more like the 1700 and 1800 players I faced back then. That’s not inflation.

I haven’t seen the Ratings Committee’s report for 2018 yet, but this is the time of year when they look at their core group of players (active players in their mid-30s to mid-40s, whose ratings are presumed to be stable) and decide whether the system is inflating or deflating. For the last few years, their opinion has been that the system was inflating slightly, due, I think, in large measure to ratings formula changes made several years ago. However, their target time period is the mid-1990s, not the 1970s.

My personal experience is that I was rated about 1700 in the late ’70s, when experts were rare. I then had about a four-year hiatus when life got in the way and I wasn’t playing. When I returned to play I was quite rusty but maintained my rating. When I got rid of the rust I shot up to expert and learned that experts were much more common. Upon asking, I learned that there had been a period of “fiddle points” that inflated everybody.

This year’s National High School had, if I remember right, 25 masters in it.

The Elementary had 1 master and 9 experts, any one of whom would almost certainly have crushed me like a grape.

This is the golden era of youth chess!

I think it is difficult to compare eras so far apart. In his book, Elo suggests that the number of players in a rating pool can affect ratings, pushing them higher over time as more players compete. Whether that has actually happened is hard to say, given all of the statistical tinkering that has gone on with the ratings.

However, when I compare games by Class B or Class C players from the early 1970s with games from today, from a qualitative point of view there is not much difference in the type of play, the errors made, or the skills demonstrated. That is despite the availability of books, computer programs, and databases that ought to be affecting play. In spite of all of the books on decision-making, little of that information seems to be applied to improve performance.

Class players from the 1970s and earlier were affected by studying Reuben Fine’s works. Some became enamored of Nimzovich or studied the games of world champions. Older works focused on static positions and less complicated combinational motifs. It is not clear that the work done by Kotov, Dvoretsky, Aagaard, Yusupov, Shereshevsky, Kahneman, Csikszentmihalyi, and others on decision-making has filtered down to the masses. While higher-rated players have dabbled, the use of computers and the accompanying information overload has been as much confusing as helpful. Evaluation, calculation, and move selection have become more difficult today as the game becomes more dynamic and less bound by rules. Ratings and performance are guided by competitive skills having less to do with chess and more to do with sports.

Comparing games from different eras and at multiple rating levels would be one way to try to validate the concept of using analytical engines to assess someone’s chess skill. It might make a good dissertation project.

Comparing the games requires carving out a number of moves of opening “book” theory at the start and the last few moves, which are basic endgame technique. Both can be learned and memorized. That leaves a core of moves from the early middlegame to the transition to a simple endgame, probably around 25 moves for comparison. It is in that area that I have looked at some of the games played by Class C, Class B, and Class A players, the bulk of tournament players. So far, I do not see much difference in the play between players back in the early 1970s and now. They still exchange pieces at the wrong times, do not see combinational motifs, and miss threats. It does not seem to matter whether the time controls are slower, which was more common in the earlier period, or faster. The impulse to move first based on an initial impression and think later is common. We haven’t gotten any smarter or better even with the information revolution. Or maybe we haven’t yet applied it to chess because it is a game.
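For anyone who wants to attempt that comparison systematically, here is a rough sketch using the python-chess library and a UCI engine such as Stockfish. The PGN file names, trim depths, and search depth are placeholders I made up; the idea is simply to score the trimmed middlegame core by average evaluation drop per move:

```python
# Sketch: compare middlegame quality across eras via average eval drop per move.
# Requires python-chess and a UCI engine on the PATH; file names are placeholders.
import chess
import chess.engine
import chess.pgn

SKIP_OPENING = 10                       # plies of opening "book" to discard
SKIP_ENDGAME = 10                       # plies of basic endgame technique to discard
LIMIT = chess.engine.Limit(depth=18)    # fixed depth keeps eras on equal footing

def avg_centipawn_loss(pgn_path, engine):
    """Mean eval drop per move over the trimmed core of every game in a PGN file."""
    total_loss = scored = 0
    with open(pgn_path) as f:
        while (game := chess.pgn.read_game(f)) is not None:
            moves = list(game.mainline_moves())
            board = game.board()
            for ply, move in enumerate(moves):
                if SKIP_OPENING <= ply < len(moves) - SKIP_ENDGAME:
                    mover = board.turn
                    before = engine.analyse(board, LIMIT)["score"].pov(mover)
                    board.push(move)
                    after = engine.analyse(board, LIMIT)["score"].pov(mover)
                    drop = before.score(mate_score=10000) - after.score(mate_score=10000)
                    total_loss += max(0, drop)
                    scored += 1
                else:
                    board.push(move)
    return total_loss / scored if scored else 0.0

engine = chess.engine.SimpleEngine.popen_uci("stockfish")
for era in ("class_b_1972.pgn", "class_b_2018.pgn"):   # hypothetical samples
    print(era, round(avg_centipawn_loss(era, engine), 1))
engine.quit()
```

Average centipawn loss is a crude proxy for skill, but it would at least put numbers on the “not much difference” impression, and it naturally extends to counting outright blunders over a threshold.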

I’m not so sure that the opening and endgame portions of those games should be excluded from analysis, as the depth of one’s opening repertoire and endgame knowledge are key aspects of chess development.

I’ve watched plenty of young players with ratings between 1000 and 1400 struggle with what most chess engines would consider ‘easily winnable’ games. How many 1200 players would be able to win a bishop-and-knight versus a lone king ending? I’m a former 1500 player, and while I understand the basic concepts of that well-known endgame, I doubt I could execute it flawlessly, especially if I were short on time.

Moreover, while it may be true that certain openings that were in favor in, say, the 1970s are no longer in favor because of improvements in opening ‘book’, how many players below master strength would be able to take advantage of those improvements? It seems more likely that a propensity for opening blunders needs to be included in any assessment of someone’s chess prowess.

I haven’t read the rating committee’s full report yet, but Mark tells me that no change will be made to the bonus threshold this year. (That means things aren’t getting worse, but they aren’t getting sufficiently better yet, either, I guess.)

In what way do you think things could be “better”?

Well, the EB could always reverse some of the formula changes they made that appear to be causing the recent inflationary trends. :-)

What inflationary trends?

See the last several Rating Committee Chair’s reports, especially 2016 (glicko.net/ratings/report16.txt). The 2014 report went into somewhat greater detail.

The bonus factor is currently 14, and will remain unchanged this year.

It is worth noting that the RC chair’s report on proposed formula changes (several of which were tested, but not the exact set of changes that the EB made) warned that these changes would likely result in ratings inflation.

Yes, but (so far) these are predictions and mild indications. And the recent EB-mandated changes are just one of many competing influences on ratings. Some are inflationary, some are deflationary. It doesn’t really help to isolate ONE and say “fix it”.

The most important thing to remember is that the CURRENT (static) state of ratings is just fine. Some ratings may be inflatING - but that’s not the same thing as saying that ratings are inflatED.

There are many inflationary and deflationary trends. The EB probably created yet another inflationary pressure. If so, I expect that the RC will handle this by recommending a lowering of the Bonus Point Threshold. The time to worry will be when the RC recommends a BPT change and is overruled by the EB.

US Chess ratings are, and always will be, a push/pull between mathematical accuracy and political fiddling.

The good news is that the Bonus Point Threshold gives the system a design handle that provides some control. The annual monitoring has prevented the kind of runaway trends that have (in the past) drastically warped the entire ratings scale.

EBs giveth, and EBs taketh away - the RC now operates under conflicting EB directives:

a) keep ratings consistent with 1997
b) change the formula so that ratings are computed differently

Vision is essential here. It is vital to remember that, first and foremost, the US Chess ratings system is a marketing tool to entice new members to join and current members to remain. All else should be secondary.

Rob Jones

And if the rating system inflates wildly due to misguided efforts by the EB and/or the delegates to make everybody “feel better” about their rating, then before long the system will lose credibility as a measuring device, and its value as a marketing tool will vanish.

Bill Smythe

If you read the ratings committee reports since 2014, you will see that the committee does not feel that ratings are ‘wildly inflating’.

The goal of the 2013 formula changes, as I understood them, was to increase K for players who felt that they weren’t gaining enough points from their best performances, especially players in the 1800-2100 range.

Under the previous, three-tier K system, players under 2100 all had the same (and highest) K. Under the current system, K takes many more possible values, and it decreases as one’s rating goes up. That means that as one’s rating rises, it becomes harder to earn points (but also harder to lose them).
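For the curious, here is a rough sketch of that rating-dependent K, based on my reading of Mark Glickman’s published description of the US Chess rating system: K = 800/(N′ + m), where m is the number of games in the event and N′ is an “effective games” figure that grows with rating. The constants are my transcription, and the real system also caps N′ by the player’s actual game count, so treat this as illustrative rather than the official implementation:

```python
# Illustrative sketch (not the official code) of a K that falls as rating rises,
# per my reading of the published US Chess formula: K = 800 / (N' + m).
import math

def effective_games(rating: float) -> float:
    """'Effective' prior games N'; grows with rating, which lowers K.
    The real system also caps this at the player's actual number of rated games."""
    if rating >= 2355:
        return 50.0
    return 50.0 / math.sqrt(0.662 + 0.00000739 * (2569 - rating) ** 2)

def k_factor(rating: float, event_games: int) -> float:
    return 800.0 / (effective_games(rating) + event_games)

for r in (1500, 1900, 2100, 2300):
    print(r, round(k_factor(r, 5), 1))  # K shrinks steadily; no more three tiers
```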

By increasing K, those players see greater changes in their ratings. That resulted in a somewhat more inflationary environment. The tool in the rating system for controlling inflation is the bonus threshold, which since 2013 has been raised from 8 to 14. The higher the bonus threshold, the fewer bonus points awarded.
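To make the bonus mechanism concrete: as I understand the published formula, bonus points are awarded only when an event’s rating gain exceeds B·√(max(m, 4)), where B is the bonus threshold and m is the number of rounds, and the bonus equals the excess. A quick sketch (the 45-point gain is just a made-up example) shows why raising B from 8 to 14 shrinks the awards:

```python
# Sketch of the bonus rule as I understand it: the bonus is the amount by which
# an event's rating gain exceeds B * sqrt(max(rounds, 4)). Numbers are made up.
import math

def bonus_points(rating_gain: float, rounds: int, threshold: float) -> float:
    return max(0.0, rating_gain - threshold * math.sqrt(max(rounds, 4)))

gain, rounds = 45.0, 5          # hypothetical strong 5-round event
for b in (8, 14):               # the 2013 threshold vs. today's
    print(f"B={b}: bonus = {bonus_points(gain, rounds, b):.1f}")
# Output: B=8: bonus = 27.1 ... B=14: bonus = 13.7 -- fewer points at higher B
```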

One concern that I had about the 2013 changes was that they might in effect take points away from those who need them most: players who are improving rapidly and are thus significantly underrated. It’s not clear whether that occurred; perhaps that will be one of the issues the Ratings Committee looks at while studying ratings confidence levels.

I believe, and please correct me if I’m wrong, that Mr. Smythe is suggesting that if we were to, say, add 200 points to everyone’s rating (so that those lifelong experts could call themselves masters, for example) it would backfire as a marketing tool.

Alex Relyea

Mr. Smythe has the advantage of having witnessed history.

This has been tried. It failed. It also created a huge mess, which took nearly a decade to unwind.

Yes - the USCF Rating System is one of the best marketing tools that US Chess has. But the point that has stood the test of time is this: it’s only useful if it is accurate. To the extent that the Rating System is manipulated for short-term political or (perceived) marketing advantage, it becomes an object of derision and DETRACTS from the organization.

In my opinion, the recent changes made by the EB were ill-advised. The main reason that I don’t worry TOO much about them is that we now have an established, understood, and respected mechanism for monitoring and CORRECTING any gross deviations from accuracy and stability.

I never said they did. I am simply warning against the natural tendency to try to inflate ratings for promotional purposes.

That’s exactly what I am saying. Thank you!

Amen.

Bill Smythe