Ratings, Golden database and player expectations

It is my belief that as a TD, I am supposed to use the rating published in the current USCF golden database for pairing all players in every tournament that I run. That rating is a player’s “official” USCF rating until the next golden database is promulgated. I am pretty sure I am correct about this, but the closest I was able to find was on page 84 of the 7th edition rulebook, which indicates that you should download the database to the pairing program, or consult the USCF website. That is ambiguous guidance.

I also know from personal experience that I can submit a set of tournament results and within a day or two, if my players have signed up for it, they will receive an email from the USCF indicating that their tournament performance has been rated and their new rating is now XXXX as a result. Thus, I think the rating changes between updates to the promulgated database.

My initial question, then, is which rating should I use for pairings: the most current one available from the website (assuming I have access to the internet at the tournament site), or the Golden Database rating?

Is there anywhere I can point a player or a parent to that explains this?

Look at rule 28C. It says to use the last published rating. You can find that in the most recent rating supplement or in a player’s MSA rating supplement tab.

If a player does not have a published rating then 28D3 allows using the most recent MSA rating.

If you want to always use the most recent MSA rating (or the higher of the published or most recent) then that is a major variation that should be announced in all publicity.

It is acceptable (and not a major variation) for a TD to assign that most recent MSA rating if the player requests it AND the most recent rating is higher than the published rating.
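As a rough illustration of the selection logic described in this reply, here is a minimal Python sketch. The function name and parameters are hypothetical, and it only encodes this thread's reading of 28C and 28D3 plus the "player requests the higher rating" case; it is not an official algorithm.

```python
from typing import Optional


def pairing_rating(published: Optional[int],
                   latest_msa: Optional[int],
                   player_requests_latest: bool = False) -> Optional[int]:
    """Pick a pairing rating (hypothetical helper, per the reply above)."""
    if published is None:
        # No published rating: fall back to the most recent MSA rating.
        return latest_msa
    if (player_requests_latest
            and latest_msa is not None
            and latest_msa > published):
        # Only a player-requested, higher rating replaces the published one.
        return latest_msa
    # Default: the last published rating.
    return published
```

Using the latest MSA rating for everyone would be the announced variation mentioned above, so the sketch deliberately leaves that case out.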

28E2c can also be used for certain cases to combat sandbagging. The CCA minimum rating list and the scholastic nationals adjustments are some examples of this.

I can see some possible ambiguity in this wording. The ratings for April 2020, e.g., will be published in late March and will appear on a member’s page on the website to the right of the ratings from the March supplement. So does that mean a TD should start using the April ratings in late March?

Bob

Not sure where it has been documented, but the March ratings list is considered ‘official’ for events that begin in March and the February list is considered official for events that begin in February, unless an event has advertised that it will use a different ratings list or unofficial ratings.

Organizers and TDs may, within the limits noted, use more current (eg, unofficial post-event) ratings information than the official published ratings for that event, but I believe that is considered a variation in the rules that requires advance notice.

I found on page 50 of this month’s CL what you may be looking for in documentation; Rating Supplements is the title of the paragraph. Or perhaps rule 28C?

28C says to use the rating list specified in Tournament Life in Chess Life, and Tournament Life says the rating list for a specific month is to be used for tournaments starting in that month. Thus a rating list published two weeks before the start of a month is to be used only once the month starts. That is the standard, but advance notice of a variation can adjust it. Also, 28E2 can be used to assign a rating. That is normally an anti-sandbagging move, but it may, or may not, be used at a player’s personal request to adjust their own rating, as long as 28E1 is not violated (so there should be no assignment less than the rating that would otherwise be used).
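To make the month rule concrete, here is a small Python sketch. The function name is made up for illustration, and it covers only the default case described above (no announced variation, no assigned rating).

```python
from datetime import date


def official_list_label(start: date) -> str:
    """Year-month of the rating list that applies by default.

    The list for a month is used for events that start in that month,
    regardless of when the list was actually published.
    """
    return f"{start.year:04d}-{start.month:02d}"
```

Under this reading, an event starting on 28 March 2020 uses the March list even though the April list has already been published, while an event starting on 1 April 2020 uses the April list.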

The Golden database isn’t available until each December is it?

Incorrect. The golden database is created every month. This has been true since April, 2008.

Wow, I’m behind a few years, then. Thanks for the correction.

Terry, you might have been thinking about the annual ratings list, which was usually the December list. We still create it and post it online.

The Golden Master file dates back even earlier than 2008, but we don’t have those online, and I’m not sure if we still have them in digital format. I think Laura Martz created the first one around 2001.

The Golden Master actually has many more players on it than the annual list does, because it includes players who don’t have a published rating or who haven’t played rated chess in the last year. The total count on the Golden Master recently crossed the 1 million mark for current and former members.

The 2019 Annual List file has about 86,000 players whose rating changed during the 12 months from January 2019 through December 2019. Of those, about 76,800 played at least one regular or dual rated game in those 12 months. The rest saw their rating change due to rerating, usually resulting from corrections to events in prior years, or played only in quick, blitz or online events.

A typical monthly supplement update file has around 23,000 players in it, just those whose rating changed in the last month.

Nolan, thanks. I remember now the GD. It’s that I haven’t had occasion to use pairing software for a few years now, and things get foggy. I was, indeed thinking of the Annuals.

The Golden is missing the number of games for provisional players. My personal feeling is that is a small price to pay to get every member ID and whether or not it was current when the Golden was created.

I’m not sure why Laura didn’t include provisional game count. Changing it probably wouldn’t be difficult on our end, but my understanding is that might break some versions of WinTD and I’m not sure what SwissSys would do with it.

The golden database is missing a lot of things. Unfortunately the current pairing programs probably aren’t smart enough to handle the inclusion of additional fields even if they don’t use them. We really need a new consistent file.

It won’t matter to SwissSys. The program is remarkably resilient with respect to the database fields. That’s why my combined US Chess and FIDE database works.

Tom Doan would have to answer about WinTD. Back when we only had the bimonthly supplement upload we would upload in the provisional games since they were part of the bimonthly. Maybe adding the provisional games would be okay.

What information would you add?

To help in the discussion, here are the current contents of the two files.

Here’s the monthly rating supplement file:

field 1 R_MEM_NAME type C length 27 0 decimal places offset=1
field 2 R_MEM_ID type C length 8 0 decimal places offset=28
field 3 R_EXP_DATE type D length 8 0 decimal places offset=36
field 4 R_STATE type C length 2 0 decimal places offset=44
field 5 R_RATING1 type C length 7 0 decimal places offset=46
field 6 R_RATING2 type C length 7 0 decimal places offset=53

And here’s the Golden Master file:

field 1 MEM_ID type C length 8 0 decimal places offset=1
field 2 MEM_NAME type C length 17 0 decimal places offset=9
field 3 EXPIRED type D length 8 0 decimal places offset=26
field 4 STATE type C length 2 0 decimal places offset=34
field 5 RSUPP_YR type C length 4 0 decimal places offset=36
field 6 RSUPP_NUM type C length 2 0 decimal places offset=40
field 7 R_PLR_TYP type C length 1 0 decimal places offset=42
field 8 R_LPB_RAT type C length 4 0 decimal places offset=43
field 9 R_NRM_DAT type C length 2 0 decimal places offset=47
field 10 Q_PLR_TYP type C length 1 0 decimal places offset=49
field 11 Q_LPB_RAT type C length 4 0 decimal places offset=50
field 12 Q_NRM_DAT type C length 2 0 decimal places offset=54

A few things stand out. The monthly file has a 27 character name field, while the Golden Master has a 17 character name field. Each record in the monthly file is 60 characters, each record in the Golden Master file is 56 characters. That means every character added to the Golden Master file would increase the total file size (unzipped) by over a megabyte. (As we’ve already seen with the test of an XML formatted file, using either XML or JSON format instead of DBF would increase the total file size significantly.)
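The record sizes can be checked against the two field listings with a short Python sketch. The field lengths are copied from the listings; the extra byte is the one-byte deletion flag that starts every DBF record, and the player count is only the rough "1 million" figure mentioned earlier in the thread.

```python
# Field lengths copied from the two listings above.
MONTHLY = {"R_MEM_NAME": 27, "R_MEM_ID": 8, "R_EXP_DATE": 8,
           "R_STATE": 2, "R_RATING1": 7, "R_RATING2": 7}
GOLDEN = {"MEM_ID": 8, "MEM_NAME": 17, "EXPIRED": 8, "STATE": 2,
          "RSUPP_YR": 4, "RSUPP_NUM": 2, "R_PLR_TYP": 1, "R_LPB_RAT": 4,
          "R_NRM_DAT": 2, "Q_PLR_TYP": 1, "Q_LPB_RAT": 4, "Q_NRM_DAT": 2}

DELETION_FLAG = 1  # each DBF record begins with a one-byte deletion flag


def record_length(fields: dict) -> int:
    """Bytes per record: the deletion flag plus all field widths."""
    return DELETION_FLAG + sum(fields.values())


APPROX_PLAYERS = 1_000_000  # assumption: rough Golden Master record count


def growth_bytes(extra_chars: int, players: int = APPROX_PLAYERS) -> int:
    """Uncompressed growth from widening every record by extra_chars."""
    return extra_chars * players
```

With these numbers, `record_length(MONTHLY)` is 60 and `record_length(GOLDEN)` is 56, and one extra character per record costs about a million bytes across the whole file.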

The Golden Master file indicates the month and year of the rating, but the rating is a 4 character field, while the monthly file has no date (since it is for that month) but has a 7 character field for ratings, which allows a /nn indicator for provisional ratings. The PLR_TYP fields in the Golden Master file have a U for unrated, E for established and P for Provisional. I’m not sure if the NRM_DAT fields are currently used for anything.
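For pairing software reading the monthly file, the 7 character rating field has to be split into a rating and an optional provisional game count. The sketch below assumes values such as "1500/12" or a bare, space-padded "1823"; the function name is hypothetical, and the real files may contain cases this does not handle.

```python
from typing import Optional, Tuple


def parse_rating_field(raw: str) -> Tuple[Optional[int], Optional[int]]:
    """Split a rating field into (rating, provisional_games).

    Assumed format: 'RRRR' for established ratings, 'RRRR/nn' for
    provisional ones, padded with spaces; blank means no rating.
    """
    text = raw.strip()
    if not text:
        return None, None
    rating, _, games = text.partition("/")
    return int(rating), (int(games) if games else None)
```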

Neither file has a ‘status’ field to indicate deceased players or duplicated IDs.

Each file only shows two ratings, but we now have 5 types of ratings, 6 if you count correspondence chess.

In no particular order, I would include at the very least (some you mentioned already):

- The full name (long version) - preferably as two separate fields (last, first), but even in a single field it’s fine.
- ID
- Expiration
- Player status - that way duplicate, etc can be filtered out if desired
- Regular, quick, and blitz ratings. It’s debatable whether or not the online and correspondence ratings are needed in this, but I’m not opposed.
- Provisional numbers (if included, should be in a separate field for ease of import) - can simply be empty if rating is established
- FIDE ID
- FIDE title
- FIDE Federation
- Gender
- State (3 character version) - those that are not in the 50 states are blank currently
- Or alternatively to 3 char version of state - Country

XML format is way too verbose for this. Even JSON is a bit much. Frankly just a simple CSV or tab delimited file would be fine where the first line represents the field headers. I’d suggest a note that Fields may be added or moved. Programs should be flexible enough to handle any changes in the format of the file. That’s a pretty easy thing to do. Tab delimited is probably better so you won’t need to escape any fields that include commas - saves some file space.
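Here is a minimal Python sketch of the "first line is the header, fields may be added or moved" idea, using the standard `csv` module. The column names (`id`, `last_name`, `first_name`, `regular`, `fide_id`) and the sample rows are invented for illustration; no such file layout has been published.

```python
import csv
import io


def load_players(text: str) -> dict:
    """Read a tab-delimited export, looking fields up by header name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["id"]: (row["last_name"], row["first_name"], row["regular"])
            for row in reader}


# Two made-up layouts: the second reorders columns and adds fide_id.
LAYOUT_A = ("id\tlast_name\tfirst_name\tregular\n"
            "00000001\tJones\tMary Kay\t1500\n")
LAYOUT_B = ("regular\tfide_id\tid\tfirst_name\tlast_name\n"
            "1500\t\t00000001\tMary Kay\tJones\n")
```

Because every lookup goes through the header name rather than a column position, both layouts load to the same result, which is the flexibility being asked of the pairing programs.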

When using tab delimited files in FTP sends I’ve had to strip out tabs in the data (usually happens when people copy/paste data from elsewhere and accidentally include a trailing tab - possible in a name field using MSA or TD/A entry).
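One way to guard against stray tabs is to clean each value before it is written out. This is a sketch with a hypothetical helper name; note that it also collapses runs of spaces and removes newlines, which may or may not be wanted.

```python
def clean_field(value: str) -> str:
    """Replace tabs and other whitespace runs with single spaces and
    trim both ends, so the value cannot break a tab-delimited row."""
    return " ".join(value.split())
```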

Amen to the two separate fields. Otherwise the software needs to parse it if it wants to alphabetize names. “Mary Kay Jones” is probably “Jones, Mary Kay” (3, 1 2) but “Vincent Van Gogh” is probably “Van Gogh, Vincent” (2 3, 1). The software should not be expected to guess right all the time.

If the overall file is tab-delimited, but the name field is comma-delimited within itself, the name field could still be a single field from a tab-delimited point of view. This would also allow postfixes to be handled nicely, as in “Jones, Robert, Jr.”. There could even be multiple postfixes, like “Jones, Robert, Jr., Esq.”.
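A sketch of how software might split such a comma-delimited name field follows. It assumes the order is always last name, then first name(s), then zero or more postfixes; the function name is hypothetical.

```python
from typing import List, Tuple


def split_name(field: str) -> Tuple[str, str, List[str]]:
    """Split 'Last, First, Postfix, ...' into its parts."""
    parts = [p.strip() for p in field.split(",")]
    last = parts[0]
    first = parts[1] if len(parts) > 1 else ""
    postfixes = parts[2:]
    return last, first, postfixes
```

Sorting on the returned last and first names then alphabetizes correctly without any guessing about which words belong to the surname.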

Year first, please, e.g. “2020-05”, in case any software wants to sort things in order of expiration date.
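The reason year-first helps is that plain text sorting then matches date order. A quick Python illustration with made-up expiration values:

```python
# Year-first strings sort chronologically with an ordinary text sort.
year_first = ["2020-05", "2019-12", "2020-01"]
sorted_year_first = sorted(year_first)

# Month-first strings do not: the text sort groups by month instead.
month_first = ["05-2020", "12-2019", "01-2020"]
sorted_month_first = sorted(month_first)
```

Here `sorted_year_first` comes out in true date order, while `sorted_month_first` puts "12-2019" last even though it is the earliest date.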

Yes, why not all 5 (or 6 with correspondence).

Perhaps even with a slash / in front for clarity.

Well thought out. Kudos.

Agreed. Is tab-delimited directly loadable into Excel, as comma-delimited is?

Bill Smythe