Are there any rules of thumb or formulae for converting online ratings (US Chess, LiChess, Chess.com) to US Chess Standard OTB ratings that provide a reasonable ballpark result? I’m partially concerned about stale OTB ratings for pairings, but also about (local) invitational or other local/regional “ranking-type” situations.
A stale rating is always tricky: has the player been playing (and improving) outside of that rating system, and is thus grossly underrated in it? The answer is probably more often ‘yes’ than ‘no’, especially with players under 40.
Sometimes I think any rating that was earned before the age of 12 should be thrown out completely if it is more than a few years out of date.
The new blended ratings estimation formula, which is awaiting EB approval on implementation timing and a few other administrative matters, downweights a stale rating relative to more current ratings in other rating systems. But if it is the only information available, there’s not much that can be done to make it more current.
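(Purely to illustrate the downweighting idea, not the actual US Chess blend: you can think of it as a weighted average in which each rating’s weight decays with how stale it is. The weights, the half-life, and the names below are all made up for the sketch.)

```python
from dataclasses import dataclass

@dataclass
class RatingObservation:
    rating: float       # rating in some system (OTB, FIDE, online, ...)
    years_stale: float  # time since that rating was last updated
    base_weight: float  # a priori trust in that system (made up here)

def blended_initial_rating(observations, half_life_years=2.0):
    """Illustrative blend: each rating's weight is cut in half for every
    `half_life_years` of staleness. NOT the actual US Chess formula."""
    weighted_sum = 0.0
    total_weight = 0.0
    for obs in observations:
        weight = obs.base_weight * 0.5 ** (obs.years_stale / half_life_years)
        weighted_sum += weight * obs.rating
        total_weight += weight
    return weighted_sum / total_weight

# A 6-year-old OTB rating contributes far less than a current online rating.
observations = [
    RatingObservation(rating=1200, years_stale=6.0, base_weight=1.0),  # stale OTB
    RatingObservation(rating=1950, years_stale=0.0, base_weight=0.7),  # current online
]
print(round(blended_initial_rating(observations)))  # ~1836, dominated by the fresher rating
```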
I’m not aware of any formal studies comparing US Chess OTB vs US Chess Online ratings, much less studies comparing US Chess ratings with other (non-US Chess) online rating servers. (The main challenge is making sure you can match up players between the systems, and then you have to make sure your sample isn’t biased in some way.)
This is not a formal study, but here’s a scattergram of about 2,800 players with established OTB Regular and Online Regular ratings, both published since 1/1/2019. A positive difference means the Online Regular rating is higher than the OTB Regular rating.
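(If anyone wants to attempt a similar comparison on their own data, the core of it is just joining the two rating lists on a common player ID and looking at the distribution of differences. The file names and column names below are hypothetical, not the actual export behind the scattergram.)

```python
import csv
import statistics

def load_ratings(path, id_col="uscf_id", rating_col="rating"):
    """Read a {player_id: rating} map from a CSV export (hypothetical layout)."""
    with open(path, newline="") as f:
        return {row[id_col]: float(row[rating_col]) for row in csv.DictReader(f)}

# Hypothetical exports keyed by US Chess member ID.
otb = load_ratings("otb_regular.csv")
online = load_ratings("online_regular.csv")

# Positive difference = Online Regular higher than OTB Regular, as in the scattergram.
diffs = [online[pid] - otb[pid] for pid in otb.keys() & online.keys()]
print(f"matched players:   {len(diffs)}")
print(f"median difference: {statistics.median(diffs):+.0f}")
print(f"mean difference:   {statistics.mean(diffs):+.0f}")
```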
Online Quick and Online Blitz ratings will tend to increase somewhat when the upcoming change in ratings formulas (and correction to K) takes effect; Online Regular ratings will (on average) not change much.
+1
Of course, some will see a ratings drop and complain, and others might see an unexpected ratings gain and complain.
You overlooked the ones that will complain because there was not a change.
It is also worth noting that while the new blended ratings initialization procedures do address stale ratings somewhat, there are no ongoing procedures to deal with staleness once someone has a published rating.
Suggestions that ratings that are sufficiently stale be reinitialized somehow (from other ratings data if we have it) have never been popular.
Maybe such suggestions would prove a little less unpopular if any such reinitialization were to be regarded as “temporary”, to be replaced later, if and when the reinitialized ratings become less stale through new activity.
Bill Smythe
Bill, I have no idea what that means!
Neither do I. It’s one of those ideas that requires further definition, but any attempt at further definition would surely expose the utter absence of “idea” in the idea.
Bill Smythe
Aside from the inherent resistance to having anything other than OTB-regular events impact OTB-regular ratings (which will happen under the blended initialization formula), the idea of using re-initialization to deal with staleness raises multiple questions. This list may not be comprehensive.
How stale does a rating have to be before we do something about it? Should this be age-based, i.e., a different definition of ‘sufficiently stale’ for a 12-year-old vs a 20-year-old or a 40-year-old?
Do we wait until someone with a stale X rating (any of the 6 types) plays in another X event to do the re-initialization?
Do we look prospectively for players whose rating in system X is stale and whose rating in system Y (or possibly multiple systems) is significantly different from their stale X rating (most likely higher, but do we ONLY do it for higher non-stale ratings?), and go ahead and adjust their current X rating?
How do we show that someone’s rating changed as a result of staleness-based re-initialization rather than playing or rerating?
We could just use staleness to increase the K-factor and only adjust ratings based on actual play. How close is that to Glicko-2? There may also need to be some sort of recent-activity requirement (within the last two years?) for class prize or section eligibility.
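(For comparison, and as a sketch rather than an authoritative answer: the Glicko family handles inactivity by letting a player’s rating deviation grow over idle rating periods, which acts much like a larger K when they return. Glicko-2 does this through a volatility parameter; the simpler Glicko-1 form is below, using the example constant c ≈ 34.6 from Glickman’s paper, which has nothing to do with US Chess’s actual parameters.)

```python
import math

def inflate_rd(rd, periods_inactive, c=34.6, rd_max=350.0):
    """Glicko-1 inactivity rule: rating deviation grows with time away from
    rated play, capped at the value used for a brand-new player. The constant
    c controls how fast uncertainty returns; 34.6 is the example value from
    Glickman's paper, not anything US Chess uses."""
    return min(math.sqrt(rd ** 2 + c ** 2 * periods_inactive), rd_max)

# A player who stopped playing at RD = 50 looks nearly unrated again after long
# inactivity, so their first results back move the rating much more, roughly
# the same effect as a bigger K.
for periods in (0, 10, 50, 100):
    print(periods, round(inflate_rd(50.0, periods), 1))
```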
I can’t answer the question about Glicko-2.
Increasing K for stale ratings would tend to accelerate gains (but also accelerate a drop), but it seems short-sighted to ignore other information.
Suppose an 18-year-old has a six-year-old established OTB regular rating of 1200, a six-month-old FIDE rating of 1800, and a current Online Regular rating of 1950. His K is around 46.7.
He goes 4-0 against a bunch of 1500 players, against whom he has an expected total score of about 0.6.
The ratings estimator says his new rating is 1490 and that 131 of that gain is bonus points. Even doubling K would probably only increase the new rating to about 1650. (I haven’t tried to calculate how that would affect bonus points.) But that’s still far below other data we have available, so why shouldn’t we utilize it?
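For what it’s worth, those numbers can be roughly reproduced with the standard US Chess update: new rating = old + K·(score − expected), plus bonus points when the gain exceeds a threshold. The sketch below assumes the logistic winning expectancy and the bonus form max(0, K·(S − E) − B·√max(m, 4)) with B = 14, which matches the 131 bonus points in the example; treat it as an approximation, not a substitute for the official ratings estimator.

```python
import math

def expected_score(rating, opponent_ratings):
    """Sum of per-game logistic winning expectancies (the standard Elo curve)."""
    return sum(1.0 / (1.0 + 10 ** ((opp - rating) / 400.0)) for opp in opponent_ratings)

def approx_uschess_update(rating, opponent_ratings, score, k, bonus_multiplier=14.0):
    """Approximate US Chess standard update with bonus points:
    gain = K*(S - E); bonus = max(0, gain - B*sqrt(max(games, 4)))."""
    games = len(opponent_ratings)
    gain = k * (score - expected_score(rating, opponent_ratings))
    bonus = max(0.0, gain - bonus_multiplier * math.sqrt(max(games, 4)))
    return rating + gain + bonus, bonus

# The example from the post: a 1200 player with K ~ 46.7 goes 4-0 against four 1500s.
new_rating, bonus = approx_uschess_update(1200.0, [1500] * 4, score=4.0, k=46.7)
print(round(new_rating), round(bonus))  # ~1490 with ~131 bonus points

# Doubling K *and* recomputing the bonus lands higher than the ~1650 quoted above
# (which appears to hold the bonus at 131), but still well short of 1800-1950.
new_rating_2k, bonus_2k = approx_uschess_update(1200.0, [1500] * 4, score=4.0, k=2 * 46.7)
print(round(new_rating_2k), round(bonus_2k))  # ~1806 with ~289 bonus points
```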