Okay, so your chart shows how he performed for the entire time period of 2007 (66 games).
1st column: whether he was rated lower than his opponent, the same, or higher
2nd column: wins
3rd column: draws?
4th column: losses?
Yep, no wins vs. people significantly higher-rated than himself.
10 losses to people lower-rated.
6 losses to people rated higher.
Those are statistically pretty close.
19 wins over people lower-rated.
2 wins over people rated higher.
That’s pretty close too.
1 draw to those lower-rated.
21 draws with those higher-rated.
There is one big difference in performance!
Did his rating go up something like 100 pts. over that year? I'd guess those draws with higher-rated opponents indicate some emerging skill that isn't producing wins (yet). Is it reflected in his new rating?
I still say one year is too long a time period to be considering. You should look at one tournament after another and see if his rating going into an event is fairly close to how he actually did. The quickest way to look at this, I think, would be to note those people who had bonus-level performances. I don't know the criteria for that, but it had to be something pretty good. Are there many people, perhaps especially young kids, achieving those? Is their bonus bringing their ratings up quickly (within a few tournaments) to their performance level?
If we’re only going to consider long time periods, then you’re going to completely miss the psychological effect this system has on players.
For games played during 2007, Robert Hess’s rating was in the range of 2441 to 2487, a fairly tight range of 46 points. (By comparison, Jay Bonin, the most active player in the USCF, had a rating during 2007 which ranged from a low of 2352 to a high of 2445, or 93 points.)
Most players will have good days and bad days, so looking at one event and deciding someone is underrated from that is like looking at the NFL for just one weekend and deciding who gets to the Super Bowl.
It does appear to me that he has a high proportion of draws, even against players rated above him; whether that's true of other players in his rating range would be worth studying. Maybe that's what separates senior masters from lower-rated players?
But with no wins against players rated 100 or more points above him and no losses to players rated 250 or more points below him, I find it difficult to make a case that he was significantly underrated during 2007.
Mark Glickman (Ratings Chair) now has a data set with all games between established regular rated players that were rated during 2007, which is the data set from which I drew that most recent analysis. The Ratings Committee will be using that data to test expected vs actual performance, among other things.
I assume whatever conclusions they come up with from the data will be part of the Rating Committee’s annual report.
You say, “during 2007” as if that were one chunk of time with nothing happening in the middle. From my perspective as a player I’d like to know the rating is pretty accurate from tournament to tournament.
Fantastic. I look forward to seeing what they come up with.
You made a suggestion that grossly underrated players may be causing adult players to play less. Then you cited some players. Mike provided some data that seemed to question whether the players really were grossly underrated. When I saw his first example I was wondering how the results were distributed in time (taking an entire two years in one lump may not show the improvement over the time span).
You then earlier asked “Analyze each six month period (or tournament by tournament) during those two years and let’s see if his rating at the beginning of the next six month period (tournament) was close to how he performed during that period.” I wanted to know the same thing so I was glad to see that you asked for that.
Then when Mike did so (by using each individual tournament’s pre-event rating for that tournament’s comparison) you said “From my perspective as a player I’d like to know the rating is pretty accurate from tournament to tournament.” I’d have to say that using the pre-event rating for each tournament is an accurate way of checking for its accuracy from tournament to tournament.
Now you are asking whether or not adult membership has been going down. It seems to be circular reasoning to state that grossly underrated players are causing the membership to go down and then cite lower membership numbers as proof of the stated underlying reason.
The rating formulas have been modified in the past after analysis, and they probably will be in the future. However, I'd rather see experts in the field analyze the system before making a modification than see a return of what many considered politically mandated fiddle points, which many people felt watered down the rating levels. (I took a three-year hiatus in the '70s with no playing or studying and actually maintained my rating in post-return tournaments even though I knew how rusty I was; then I found out that many other players with previously stable ratings had all gone up 100-250 points. Once I shook the rust off, I shot up 250 points myself.)
It might be interesting to focus a bit on two players who did well in the recently completed Tulsa US Open qualifier: Sam Shankland and John D. Bick. Both are 22xx rated, but have very different recent trends. Shankland has risen quite a lot in the last year and Bick is down from a FIDE 2295 high of several years ago. How have their ratings kept up (or down) with their tournament performances of the last 8 years?
I know these aren’t entirely typical, but sometimes it is the outliers where we see the ‘rule’.
Bick is also a lot less active than Shankland, having played just 158 rated games in 2004-2007 compared to 568 for Shankland. Perhaps that's because he's about 11 years older (according to FIDE) and probably trying to earn a living (and probably not at chess), while Shankland is still in high school and supported by his parents?
Does this mean you now agree that the data shows that Hess is not underrated, so you’re looking to change the subject?
158 rated games in 4 years is about 40 per year. Geez, for me that would be very active. I can hardly imagine 568 games. Sigh.
Re: Hess
You keep missing the point entirely. It isn’t whether he’s underrated right now, it’s whether he’s rated accurately at each point along the way. What will his rating be going into his next tournament? Will it indicate to people just how strong he is?
Re: what’s a rating to do
I’ve seen it argued long ago that the rating only reflects past performance, and to a large extent that has to be true. But, show me a person who doesn’t think it should also indicate how well a person is going to do in the near future. Add the Glicko factor for time away from the game and you get more flexibility. These kids are playing quite a lot, and you’d think their ratings will be indicative of near-term performances. Are they?
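On the Glicko factor for time away: in Glickman's system each player carries a rating deviation (RD) alongside the rating, and the RD grows during periods without rated games, which is what lets a returning player's rating move faster. A minimal sketch of that uncertainty growth, with an illustrative value for the growth constant `c` (the real value is a tuning choice, not something taken from this thread):

```python
import math

def inflate_rd(rd, periods_inactive, c=63.2, rd_max=350.0):
    """Glicko-style uncertainty growth: a player's rating deviation (RD)
    drifts back toward the maximum (that of an unrated player) while no
    rated games are played. c controls how fast; 63.2 is an illustrative
    value, chosen so an RD of 50 decays fully in about 30 rating periods."""
    return min(math.sqrt(rd**2 + c**2 * periods_inactive), rd_max)

# An active player keeps a low RD; a returning player's RD has inflated,
# so his next results move his rating more.
print(round(inflate_rd(50, 0)))   # no time away: RD stays 50
print(round(inflate_rd(50, 10)))  # after 10 idle periods: about 206
```

The point for this discussion: a kid playing constantly keeps a small RD, so the system treats his rating as well established even while he's improving, whereas a long layoff loosens the rating up.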
The article Elizabeth Vicary wrote (front page) on upcoming scholastic events references the same issue: kids playing in non-rated events and getting better before getting to the big national event. We can’t adjust their ratings based on that, but the system should surely catch their rating up to their current strength as quickly as possible.
Re: wild fluctuations for everyone
As for not wanting wild fluctuations that would affect other players' ratings: there is no serious reason a game has to have a zero-sum effect on the players' ratings, so one player's major gain doesn't have to be another player's major loss.
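One way a system breaks zero-sum is by giving the two players different K factors (as the USCF's sliding K does). Here's a hedged sketch using the standard logistic expected-score formula, which the USCF formula resembles but is not identical to; the K values of 16 and 32 are hypothetical, just to show the asymmetry:

```python
def expected(r_a, r_b):
    """Standard logistic expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def update(rating, k, score, exp_score):
    """One-game rating change: K times (actual score minus expected score)."""
    return rating + k * (score - exp_score)

# Hypothetical K values: an established 1800 with a small K, an
# improving 1600 with a large K. The loser's drop and the winner's
# gain need not cancel, so the exchange is not zero-sum.
e = expected(1800, 1600)                   # about 0.76
loss = update(1800, 16, 0, e) - 1800       # about -12.2
gain = update(1600, 32, 1, 1 - e) - 1600   # about +24.3
print(round(loss, 1), round(gain, 1))
```

With equal K factors the two changes would cancel exactly; the moment K differs per player, an upset injects (or removes) points from the rating pool without dragging the opponent down by the same amount.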
Re: this discussion
As for the combativeness in this discussion: I don't get it. I thought we were all on the same side, trying to make the system work better for everyone.
You have yet to make a case that the system is sufficiently broken to warrant changing. (I believe the Ratings Committee is currently studying some of the issues you have attempted to raise, but they’re doing it by looking at all the data for the last 4 years rather than hunting for specific cases that don’t pan out when the data is analyzed.)
Also, the USCF rating system has not been a 'zero-sum game' for quite a few years (probably since the introduction of bonus points back in the '70s or '80s), and it is even less of one since the introduction of the sliding K factor around 2001.
Here are some rating changes from the ratings estimator on the website, which I've found is usually accurate to within a point for established ratings:
An 1800 player loses to a 1600 player (that’s the only game in that section involving either player.) The 1800 player’s rating drops to 1781, a loss of 19 points. The 1600 player’s rating goes up to 1625, a gain of 25 points.
If that 1800 player loses to three 1600 players in the same section, his rating drops to 1746, a loss of 54 points.
If that 1600 player defeats three different 1800 players in the same section, his rating goes up to 1719, a gain of 119 points. (About 49 of those are bonus points, but they're only earned if the 1600 player doesn't play the same 1800 player three times.)
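The single-game figures above already make the non-zero-sum point by simple arithmetic; as a sanity check:

```python
# Figures quoted from the USCF ratings estimator example above.
loser_change  = 1781 - 1800   # -19 for the 1800 player
winner_change = 1625 - 1600   # +25 for the 1600 player

# In a strict zero-sum system these would cancel; here the pool of
# rating points grows by 6 on a single upset.
net = loser_change + winner_change
print(net)  # 6
```

So even before bonus points enter the picture, one upset adds points to the rating pool rather than just transferring them.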