New rating system RFP

This was news to me. Has it been discussed among our esteemed leaders? Is the idea to retain a numeric system roughly similar to the present model—100 to 3000 or so—or come up with something new?

It was discussed at the Rating Committee workshop which was held online a few days ago.

Having provided background information on how the current programming (which I rewrote in 2004-5) works, my understanding is that no changes to the ratings formula are being planned, just an update of the technology it runs under. This rewrite could facilitate future changes in the formula. I know that the RC has explored ways to have a reliability number or confidence interval associated with someone’s rating.

In a perfect world, the old system and the new system would compute identical ratings. Due to things like mathematical computation libraries and data storage precision, that might be difficult to achieve, but if someone’s FP rating is 1842.3234175 on the current system and 1842.3234183 on the new system, that’s probably close enough. When we were testing the programming back in late 2004 and early 2005, we found that about 99.9% of the ratings from 2004 were identical. The reasons for the occasional differences were never tracked down.
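To make "close enough" concrete, a comparison between the two systems could use an absolute tolerance rather than exact equality. This is only a sketch: the tolerance of 1e-6 rating points, the function names, and the member IDs are assumptions for illustration, not anything taken from the RFP or the existing programming.

```python
import math

# Hypothetical tolerance in rating points; the RFP does not specify one.
TOLERANCE = 1e-6


def ratings_match(old_rating, new_rating, tol=TOLERANCE):
    """Return True if two floating-point ratings agree within an absolute tolerance."""
    return math.isclose(old_rating, new_rating, rel_tol=0.0, abs_tol=tol)


def mismatches(old, new, tol=TOLERANCE):
    """Return the sorted IDs whose ratings differ by more than tol, or that
    appear in only one of the two systems. `old` and `new` map ID -> rating."""
    ids = sorted(set(old) | set(new))
    return [
        i for i in ids
        if i not in old or i not in new or not ratings_match(old[i], new[i], tol)
    ]


# The two values from the post differ by about 8e-7, so they pass at 1e-6.
print(ratings_match(1842.3234175, 1842.3234183))
# Made-up IDs: "B" differs by a full point, "C" is missing from the new system.
print(mismatches({"A": 1500.0, "B": 1200.0, "C": 900.0},
                 {"A": 1500.0, "B": 1201.0}))
```

Any IDs such a check flagged would be the ones worth tracking down during testing.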

Thanks for the info, Mike. At first, it looked like the “design and implement” part pertained to the actual rating system—the formula and its application—rather than how to maintain the system.

It’s a new system (using a different database and new programming), but as far as I know the same formulas.

One of the complications that arises with rerating is that the programming has to be able to reproduce what the ratings formulas were at various points in time in order to be able to rerate those events using the formula as it existed then. The way that Effective Games/K is computed has changed at least once, the bonus threshold has changed several times, and the way ratings are initialized using other ratings information changed a lot when we went to the blended ratings initialization process a few years ago. (And that’s probably not a complete list of all the formula changes that have occurred since the current formula structures went into effect in 2001, though we only rerate events that were initially rated on or after January 1, 2004.)
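One common way to handle this kind of requirement is to keep a dated table of formula versions and look up the version in force on the event date. The sketch below shows that idea only. The `FormulaVersion` fields, the effective dates, and the threshold numbers are all invented placeholders, not the actual US Chess formula history.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FormulaVersion:
    effective: date         # first event date this version applies to
    bonus_threshold: float  # PLACEHOLDER value, not a real US Chess constant
    label: str


# PLACEHOLDER history, sorted by effective date. Every date and number here
# is made up to illustrate the lookup; none comes from the real system.
VERSIONS = [
    FormulaVersion(date(2001, 1, 1), 1.0, "v1"),
    FormulaVersion(date(2008, 1, 1), 2.0, "v2"),
    FormulaVersion(date(2015, 1, 1), 3.0, "v3"),
]
_EFFECTIVE_DATES = [v.effective for v in VERSIONS]


def version_for(event_date):
    """Return the formula version that was in effect on event_date."""
    idx = bisect_right(_EFFECTIVE_DATES, event_date) - 1
    if idx < 0:
        raise ValueError(f"no formula version covers {event_date}")
    return VERSIONS[idx]


# A rerate of an event from mid-2010 would use the second placeholder version.
print(version_for(date(2010, 6, 1)).label)
```

A rerate would then call something like `version_for` once per event, so that a later formula change never alters how older events are recomputed.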

Request for Proposals | US

What does the part “more precise ratings than our current infrastructure” mean? How can we get more precise ratings without any formula changes?

No idea. One could argue that having FP ratings that use double-precision real data fields (probably 64 bits in most storage conventions) offers more precision than is necessary for any practical use of ratings in tournaments. (Elo, as I recall, argued that even having 4-digit ratings was probably more precision than he thought was needed.)
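For a rough sense of scale, here is a small sketch comparing what a 32-bit and a 64-bit field would do to the example rating 1842.3234175. It assumes IEEE 754 storage, which is typical but is not something I have confirmed for either the current or the proposed system. The helper name `roundtrip_float32` is made up for this example.

```python
import struct
import sys


def roundtrip_float32(x):
    """Store x as a 32-bit IEEE 754 float and read it back as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


rating = 1842.3234175

# Error introduced by squeezing the rating into a single-precision field.
single_error = abs(roundtrip_float32(rating) - rating)

# Python floats are doubles; sys.float_info.dig is the number of decimal
# digits a double is guaranteed to preserve.
double_digits = sys.float_info.dig

print(f"single-precision storage error: {single_error:.7f} rating points")
print(f"decimal digits preserved by a double: {double_digits}")
```

Near 1842 a single-precision field can be off by at most 2**-14 (about 0.00006) of a rating point, and a double does far better than that, so either way the storage error is well below anything that matters at the board.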

I checked with Emanuel London, and he meant it to refer to the necessity of being able to match the existing ratings computations. More artful wording might be substituted if we can come up with something. (It would not surprise me if in testing we found some cases where the existing programming was in error. We had an error in the computation of online ratings for a number of years before it was noticed, and even then it was not because someone complained about incorrect ratings.)

Please don’t confuse precision and accuracy.

USCF ought to integrate into the FIDE ratings, like European countries do. That is, they don’t use their own “system”, but submit all rated games electronically to FIDE and get a FIDE rating. That allows any member of any federation to play in any country with no rating confusion. I play both in the US and Europe (mostly Norway), and wind up with two different ratings because the US is not part of the FIDE system.

There would be a bit of an issue with lower-rated players then. FIDE has a rating cutoff and does not even publish ratings for players below that cutoff. The majority of US Chess members are juniors and the majority of those would be below the cutoff, so I’d figure that more than half of the currently rated active players would not have a publishable rating.

Good point, but I still feel that it would be better for chess players and organizers if ratings were consistent around the world. I suspect the reason for FIDE’s rating floor is the logistics of rating the large mass of young new players around the world. But computing power and speed have grown so vastly in the past few years that what was difficult a few years ago may be feasible now, since all rating reports are (or should be) submitted electronically, and involve very little manpower. I think it’s at least a goal to work toward, and to be achieved as soon as it’s feasible.

It might be better for players rated 1600 or higher, but the median regular rating for the 27,435 players with an established rating on the 2022 annual list was 1233. Only 29% of those players had a rating of 1600 or higher, though 3/4 of them have a FIDE ID so presumably they have played in at least one FIDE rated event.

For the 30,761 players with a provisional rating on the 2022 annual list, the median was 487. Only 827 players had a provisional rating of 1600 or higher.

Organizers and TDs might not feel the same way you do about the advantages of using FIDE’s ratings systems.

Other issues that would need to be considered are using FIDE rules (which would make some events ineligible for rating), having FIDE-certified arbiters, and paying ratings fees to FIDE. US Chess processed over 28,000 regular rated sections in FY 2022-23, with over 680,000 games.

It isn’t clear to me how our 6 ratings systems would fit into FIDE’s ratings systems.

FIDE requires each member federation’s (few) authorized people to create the IDs, and that is not a simple process. It is very different from players (or parents) creating IDs for themselves or their kids. FIDE seems to be a long way from being able to have a correction made that triggers a re-rate of an event. FIDE seems to be raising the lower limit of ratings, not lowering it, so it is going in the wrong direction for making it more usable for the (large) US scholastic population, and the rating fee for FIDE rated events is a lot higher than what US Chess charges. There is no Club-level or Local-level entry point for new TDs/arbiters. Remove the 711 Local-level TDs (chief of 7,596 events) and the 1,589 Club-level TDs (chief of 7,615 events) and that leaves only 339 higher TDs (chief of 5,246 events), or less than 10% of the TDs, with more than 74% of the events last year having been run by people who would not be allowed to run FIDE events. That doesn’t even consider that 85 of the 339 Senior & higher TDs were not active in the last year.
There are likely a number of other significant points that I am not thinking of at the moment. There are quite a few things that would need to be taken care of before even thinking about switching over to using FIDE to rate all US events.
PS FIDE rules would have to be used to run the events, so TDs would be calling any flag they saw, descriptive notation would be prohibited, etc.
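For anyone who wants to check the event share quoted above, the arithmetic can be sketched as below. It assumes the three event counts in the post cover all events chiefed last year, and it reads the count for higher-level TDs as 5,246. Both are my assumptions.

```python
# Events by the certification level of the chief TD, as quoted in the post.
events_by_level = {
    "Local": 7596,
    "Club": 7615,
    "Senior and higher": 5246,  # assumed reading of the figure in the post
}

total_events = sum(events_by_level.values())
local_and_club = events_by_level["Local"] + events_by_level["Club"]

# Share of events whose chief TD would have no FIDE entry point.
share = local_and_club / total_events

print(f"{local_and_club} of {total_events} events ({share:.1%})")
```

That works out to a little over 74% of events, which agrees with the figure in the post.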

Funky rules on rateable time controls. Nothing short of G/60 is rateable, period, thus eliminating probably the majority of scholastic tournaments, plus most of the early rounds on 2 day schedules for major CCA tournaments. If there are players rated >1800, effectively you need G/90 minimum and G/120 for games involving anyone above 2400. (Guess what, you beat that 2500 in a G/90 section—your win doesn’t count).
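The thresholds as described in this post can be written out as a small check. This is a sketch of the post's paraphrase only, not of the FIDE handbook: it ignores increments, the function names are made up, and it covers standard ratings only (as a later reply points out, shorter controls may fall under the Rapid or Blitz systems instead).

```python
def min_standard_minutes(highest_rating):
    """Minimum game length in minutes for a standard-rated game, using the
    thresholds as paraphrased in the post (not checked against FIDE's rules)."""
    if highest_rating > 2400:
        return 120
    if highest_rating > 1800:
        return 90
    return 60


def counts_as_standard(game_minutes, highest_rating):
    """True if a game at G/game_minutes would count for standard rating when
    the highest-rated player involved has the given rating."""
    return game_minutes >= min_standard_minutes(highest_rating)


# The post's example: a win over a 2500 in a G/90 section would not count.
print(counts_as_standard(90, 2500))
# A typical short scholastic control would not count either.
print(counts_as_standard(30, 900))
```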

It’s a rating system which was designed for professional chess players and they attempted to extend it with almost no changes to not just amateurs but relative beginners. They have now had to admit that it didn’t work.

The need to do things like call flag falls would require substantially more arbiters per player than is the case with events rated only by US Chess (as otherwise, you might be unable to detect all or even most flag falls, etc.). Especially for a large tournament, this could become very expensive.

Thanks, Michael and Jeff, for your clear explanations of the difficulties and drawbacks of local clubs dealing with the FIDE rating system. There seem to be several valid reasons for that not happening any time soon.

Mr. Doan neglects to mention that some of the “prohibited” controls are rated under the FIDE Rapid or Blitz systems.

Mr. Brown makes the common error of conflating an arbiter’s responsibility to enforce rules with an arbiter needing to enforce rules. We’ve taken away the TD responsibility to correct illegal moves in many/most scholastic events because we fear accusations of TD bias, and even extend this to the point of refusing to allow TDs to intervene at the end of a game, even when asked, and I’m sure this results in some bullying to change results. Arbiters (and TDs) can’t possibly see everything that happens at every board, and in every country other than ours we accept that they do their best and that they are unbiased. It saddens me that we don’t try to do this.

The biggest reason why Mr. Lillebo’s idea isn’t possible is political and one which almost certainly doesn’t affect him. Simply put, we have a great number of players who are far, far too weak to earn a FIDE rating. For example, I just searched for a 399-rated player in his home state. This player is in the top ¾ of all active players. Over 18,000 active tournament players don’t reach this standard and almost all of them have parents who want to see how they are progressing in chess. Our rating system is very roughly aligned with that of FIDE, and players rated less than 1000 would not be rated by FIDE, so parents would have no method of assessment of players perhaps even three times as good as this player. This is also the reason we have the 100-150 variable floor.

So, the big reason why we don’t, can’t, and won’t is that we want to be able to show parents a measure of their children’s success.

The major reasons for having a ratings system are to be able to use it in tournaments and to measure one’s progress over time. If the overwhelming majority of kids couldn’t have a published rating, why would they want to play in rated events?