Pairing Question

In “typical” transpositions NOT involving floats, it seems wrong (and completely contrary to USCF rules) to sum the two differences. The rule is clear – only the smaller of the two differences should count.

Now when it comes to floats, if you want to downfloat the highest instead of the lowest, the only relevant difference should be the difference between the highest and lowest.

Likewise, in upfloating the lowest instead of the highest, the only relevant difference should, again, be the difference between these two players.

So, if both are being done, I can see why both of these differences might be important, but the sum? That seems absurd. At most, it might be reasonable to use the greater (rather than the smaller) of the two differences.

There was a thread a few years ago, in which a pairing program (I don’t remember which one) was forced to make a 250-point transposition because two players had already played each other. Having made this transposition, the program then refused to make an additional 1-point transposition to equalize colors, apparently on the grounds that the 200-point limit for equalizing this player had already been exceeded.

That’s the kind of thing that can happen when rules, written in a general way rather than an algorithmic way, are converted into algorithms in pairing programs.

And I’m not sure what a good solution would be. Maybe something along the following lines:

  • First make the “raw” pairings. Raw pairings are easily defined: Within a score group, top half plays bottom half, in sequence. If there is an odd number in a group, bottom player in group A plays top player in group B. Ignore colors. Also ignore whether two players have already played.
  • In each “raw” pairing, assign colors in all the usual ways, especially as described in the paragraph “Pairing players due the same color” in rule 29E4.
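
The raw pairing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: players are represented by their ratings alone, the function name raw_pairings is made up for this post, and colors and previous opponents are ignored, exactly as the definition says.

```python
def raw_pairings(score_groups):
    """Pair score groups top half vs. bottom half, in sequence.

    score_groups: list of lists of ratings, highest score group first.
    Returns (pairings, leftover), where leftover is a player who could not
    be paired (for example, the bye candidate), or None.
    """
    pairings = []
    floater = None  # odd player dropped from the group above
    for group in score_groups:
        players = sorted(group, reverse=True)
        if floater is not None and players:
            # Bottom player of group A plays top player of group B.
            pairings.append((floater, players.pop(0)))
            floater = None
        if len(players) % 2 == 1:
            # Odd number left: the lowest-rated player floats down.
            floater = players.pop()
        half = len(players) // 2
        pairings.extend(zip(players[:half], players[half:]))
    return pairings, floater


pairs, leftover = raw_pairings([[2200, 2100, 2000, 1900, 1800],
                                [1750, 1700, 1600]])
```

The ratings in the usage lines are hypothetical. Each pair is listed higher seed first, which says nothing about who gets White.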

Sometimes raw pairings are called “natural” pairings, but I dislike “natural” because of its emotional content and biased nature, and because it’s hardly “natural” to pair the same players twice.

Then there is the next layer up, which I’ll call “simmered” pairings (or “almost raw”, or “barely cooked”), where transpositions are made only to avoid repeat pairings, still not taking colors into account.

The trouble with simmered pairings is that there is often more than one reasonable way to simmer the pairings. And, if simmered pairings are used as a stepping stone to the final pairings, the “best” simmered pairings may not lead to the best final pairings.

So, in evaluating any proposed set of final pairings, the proposed pairings should be compared to the raw pairings (not to the simmered pairings). Assign to each pairing (both raw and proposed) an “undesirability score”, something like this:

  • If either player has the “wrong” color, this pairing gets 200 undesirability points for a bad equalization, or 80 points for a bad alternation.
  • If the two players have unequal scores, this pairing gets 1000 undesirability points for each half-point difference in the players’ scores.
  • If the two players have already played each other, this pairing gets 10000 undesirability points.
  • If this pairing was different from the raw pairing, this pairing gets undesirability points equal to the transposition value. The transposition value of a pairing is defined as the rating difference between white’s raw opponent and white’s proposed opponent, or the rating difference between black’s raw opponent and black’s proposed opponent, whichever is less.
  • The total undesirability score for a pairing is simply the sum of the above four. The total undesirability score for the whole set of pairings is just the sum of the undesirability scores for each pairing.
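
Here is a rough Python sketch of the scoring just listed, to make the bookkeeping concrete. The Pairing class and its field names are invented for illustration, and it assumes at most one color fault is recorded per pairing, which the list above does not spell out.

```python
from dataclasses import dataclass

# Penalties taken from the list above.
COLOR_PENALTY = {"none": 0, "alternation": 80, "equalization": 200}
HALF_POINT_PENALTY = 1000
REPEAT_PENALTY = 10000


@dataclass
class Pairing:
    half_point_gap: int   # score difference, counted in half-points
    already_played: bool  # True if this is a repeat pairing
    color_fault: str      # "none", "alternation", or "equalization"
    white_raw_opp: int    # rating of White's opponent in the raw pairings
    white_new_opp: int    # rating of White's opponent in this pairing
    black_raw_opp: int    # rating of Black's opponent in the raw pairings
    black_new_opp: int    # rating of Black's opponent in this pairing


def transposition_value(p):
    """Smaller of the two raw-vs-proposed opponent rating differences."""
    return min(abs(p.white_raw_opp - p.white_new_opp),
               abs(p.black_raw_opp - p.black_new_opp))


def undesirability(p):
    """Sum of the four components for a single pairing."""
    return (COLOR_PENALTY[p.color_fault]
            + HALF_POINT_PENALTY * p.half_point_gap
            + (REPEAT_PENALTY if p.already_played else 0)
            + transposition_value(p))


def total_undesirability(pairings):
    """Sum over every pairing in the set."""
    return sum(undesirability(p) for p in pairings)
```

A pairing that matches the raw pairing has a transposition value of 0, since each player's raw and proposed opponents are the same. Given several candidate sets, min(candidates, key=total_undesirability) picks the one with the lowest total.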

Note: The 1000 points for cross-score pairings prevents a program (or a TD) from wantonly making cross-score pairings just to improve colors. Likewise, the 10000 points for a repeat pairing prevents a program, in most cases, from making a repeat pairing.

By calculating the undesirability scores of both the raw pairings and the proposed pairings, the two sets of pairings can be compared. More importantly, two (or more) sets of proposed pairings can be compared. The pairings with the lowest total undesirability score constitute the preferred pairings.

The above method is designed to be compatible with USCF rules, including the 80- and 200-point color transposition limits. I’m not sure it actually is completely compatible, but complete compatibility between any algorithm and USCF rules may not even be possible.

Bill Smythe

I’ve looked at this some more, and I think there’s a problem with Example 5. It contradicts rule 29D1b:

Here is Example 5 (page 160, rule 29E7):

But according to rule 29D1b we’re only supposed to look at the rating difference of the players being switched. Switching the 2050 with the 1980 is a 70 point difference, while switching the 2080 and the 1990 is a 90 point difference (not 50 points as stated in Example 5). Thus, under 29D1b the correct pairings would be:

           White     Black

           2100      2080
           1980      1990
           1800      2050
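
The arithmetic can be checked in a couple of lines. This sketch assumes the reading argued above (only the ratings of the two players being switched count) and takes the second switch to be the 2080 with the 1990, as the 90-point figure implies. The helper name switch_difference is made up for the example.

```python
def switch_difference(rating_a, rating_b):
    """Rating difference between the two players being switched."""
    return abs(rating_a - rating_b)


candidates = {
    "2050 <-> 1980": switch_difference(2050, 1980),
    "2080 <-> 1990": switch_difference(2080, 1990),
}

# The switch with the smaller difference is the preferred one.
preferred = min(candidates, key=candidates.get)
```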

One interesting thing about this example is the assumption that in order to pair the 3 point group you have to look at the colors in the 2.5 point group. You could fix the colors on the top board by having the 2100 play the 1980 but all that would do is to move the color problem from the 3 point group to the 2.5 point group. This isn’t the “Look Ahead Method” (rule 29E6a), which only applies to improving colors within a score group. Is there anywhere else in the rulebook where this principle of looking ahead at the colors in a lower score group is enunciated?

Regarding the conflict between rule 29D1b and Example 5 on page 160 (rule 29E7): Rules Committee, please fix!

And your point is??? I haven’t seen anyone argue against that.

Correct and correct.

I think it’s safe to say that I have thought about this much more carefully than you have. The sum is the only way to handle this that gives the most intuitive pairings in most cases. Using the greater of the distances rather than the sum means that if the gap is bigger in one group, the choice in the other becomes irrelevant. The basic principle is that ideally the bottom of the higher group plays the top of the lower. Only summing aims to force the pairings towards that.
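
A small made-up example shows the difference between the two measures. The gaps below are hypothetical, and this is not WinTD’s code, just an illustration of the argument: the gap in the upper group is 120 either way, and only the lower-group gap differs.

```python
def cost_sum(upper_gap, lower_gap):
    """Combine the two float gaps by adding them."""
    return upper_gap + lower_gap


def cost_max(upper_gap, lower_gap):
    """Combine the two float gaps by taking the greater."""
    return max(upper_gap, lower_gap)


# (gap in upper group, gap in lower group) for two candidate floats.
option_a = (120, 10)
option_b = (120, 100)

sum_costs = (cost_sum(*option_a), cost_sum(*option_b))
max_costs = (cost_max(*option_a), cost_max(*option_b))
```

With the greater-of-two measure both options cost the same, so nothing steers the lower-group choice. With the sum, option A is clearly cheaper.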

If it were originally 199, then WinTD probably wouldn’t do the additional switch as it would push it to the 200 point limit. If the original switch were 250, then WinTD would have no problem doing the additional switch. That’s the problem with a hard cutoff. The cutoff in WinTD however isn’t absolute. If a 250 point swap miraculously fixes huge numbers of colors (due to suddenly fixing other pairing problems) rather than just the typical two, it will use it.

That’s what WinTD’s done for years. (It’s a lot more complicated than that, but that’s the general idea.) That’s what those scores are in the WinTD log files. Since keeping players in score group or as close to score group as possible takes precedence over color and rating order, the score is dominated by the need to have four pairings with .5 drops. All the other “failures” are less important with differing weights on undesirability. However, you can’t use fixed weights on problems, because if you get into a really big tournament, the ones you intend to be actually lexicographic (such as keeping players in score groups if at all possible) get swamped by the sheer number of color and shift size penalties. So the lower level penalties have to be rejiggered based upon the size of the section.
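
To illustrate the swamping problem with invented numbers (this is a sketch of the idea, not how WinTD actually computes its scores): with fixed weights of 1000 per half-point drop and 200 per color fault, a set with an extra drop can come out ahead in a big section. The function names and the scaling rule are assumptions made for the example.

```python
def weighted(drops, color_faults, drop_w=1000, color_w=200):
    """Single-number score with fixed weights."""
    return drops * drop_w + color_faults * color_w


# (half-point drops, color faults) for two candidate sets of pairings.
set_x = (4, 30)  # the minimum number of drops, more color faults
set_y = (5, 20)  # one extra drop, fewer color faults

fixed_prefers_y = weighted(*set_y) < weighted(*set_x)

# Python compares tuples lexicographically: drops first, then color faults.
lexicographic_prefers_x = set_x < set_y


def scaled_drop_weight(boards, color_w=200):
    """A drop weight big enough to outweigh a color fault on every board."""
    return boards * color_w + 1


w = scaled_drop_weight(boards=100)
scaled_prefers_x = weighted(*set_x, drop_w=w) < weighted(*set_y, drop_w=w)
```

Scaling the top-level weight up with section size is equivalent, for this purpose, to scaling the lower-level penalties down.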

aaack!! That was an example copied over from the 4th edition with the same flawed logic, except the 5th edition rules themselves were corrected so you didn’t compare the ratings of a 2.5 with that of the 3.0.

You could look at it as a switch of the 2080 with the 1990 (90 points) OR as a switch of the 2100 and 2050 (50 points). Either one gives the same final opponents (who are then adjusted for due colors).

The 2100 and the 2050 are in different score groups. How can you justify moving the 2050, with 2.5 points, up into the 3 point group? The only way for the 2050 to play someone with 3 points is to pair him with the odd man, which means switching the 2080 and the 1990.

I like the method Jerry Weikel uses for his Western Open:
“Pairings: To make an odd group even, the highest rated player from the next lower group will be moved up, when applicable. If the unlikely situation occurs, three blacks in a row could be assigned. Pairings not changed for color alternation unless three in a row or cause a plus 3.
TOURNAMENT DIRECTOR
Jerry Weikel, Chief T.D. (N.T.D.)”

Tom, can WinTD be set for this method?

Others have noticed this before. I think the consensus has been, follow the rule, not the flawed example.


Yup, a little looking ahead is frequently a smart idea. Sometimes, at least in small tournaments (or small sections), it may be a good idea to look ahead all the way to the end, i.e. to the 0-point score group.

I don’t think there is, but it’s certainly something TDs (and pairing programs) ought to be doing.

The whole concept of “The Look Ahead Method” (each word capitalized) is, IMHO, treated a little too formally in the rulebook, as though it were a Complete Algorithm designed to replace a Standard Algorithm. Instead, “looking ahead” (lower case) ought to be presented as a general concept, which can be used in a whole host of ways to improve pairings overall.

In some cases, even “looking ahead” to the next round can be desirable, at least in very small sections (say, 10 players or fewer). Looking ahead from round 3 to round 4 would avoid the infamous 6-player 4th-round trap that pops up from time to time in these forums (and in real life, as Mike Ditka would say), and it would often improve colors enormously in an 8-player 4-round event.

Bill Smythe

Was this an exact quote? The above is actually three separate, completely unrelated policy statements, two of which (as quoted above) are overly simplistic and vaguely stated, and in at least one case, bordering on the downright incorrect. And running all three together into one short paragraph creates a hodge-podge.

Taking them one at a time:

What does this mean, exactly? If you simply treat the highest from group B as though he were the lowest in group A, in effect you are downfloating the middle player from group A. This is nothing more than the old Harkness method, which still has its advocates but has largely been discredited in recent editions of the rulebook, and by pairing programs and most TDs.

If, on the other hand, you simply move the highest from group B up into group A according to his rating, then who knows what would happen? The upfloated player might end up in either the top half or the bottom half of group A. In effect you’d be completely ignoring this player’s score when pairings are made.

OK. This appears to mean only that, on rare occasions, three blacks in a row might be assigned. So what?

Changing the transposition limit for alternation from 80 to 0 in small tournaments is something I have long advocated, as it tends to improve colors in the equalization (even-numbered) rounds. But it seems a bit extreme in large events, where the two “camps” (due-white and due-black) are large, and inter-camp pairing possibilities remain plentiful even in the later rounds. In a large event, ignoring equalization in round 3, for example, would create a large number of forced colors in round 4 (many players with WBB or BWW) which could make round 4 pairings difficult.

As for “cause a plus 3” (meaning, I assume, resulting in 3 more blacks than whites or vice versa), that’s an equalization issue, not an alternation issue.

Bill Smythe

Yes, from the Western Open TLA. I believe he has for many years used this pairing method and stated that the players should play the opponent they should play regardless of color alternation.

Those might be Harkness pairings (the ones I did automatically when I first started doing pairings and had not read the rulebook - ah, the overconfidence of ignorance). Harkness drops the middle player instead of the bottom player, generally resulting in closer match-ups by rating. If you have seven players at 2-0 and a 1.5-0.5 (maybe a first round 1/2-point bye) then I thought of it as bringing a player up and pairing him as the bottom of the scoregroup (#8 of 8 paired against #4 which Harkness does by dropping that middle #4 of those seven players) rather than the common method of pairing him against the bottom of the remaining scoregroup (#8 of 8 paired against #7).
WinTD can do this by going into pairing options and turning on the Harkness radio button.
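
For anyone who wants the two choices spelled out, here is a minimal sketch of which player drops from an odd score group under each method. The function name and the method labels are made up, and this says nothing about how WinTD implements its option.

```python
def downfloater_index(group_size, method="standard"):
    """0-based position, in rating order, of the player dropped from an odd group."""
    if group_size % 2 == 0:
        raise ValueError("only an odd-sized group needs a downfloater")
    if method == "standard":
        return group_size - 1   # bottom player drops
    if method == "harkness":
        return group_size // 2  # middle player drops
    raise ValueError("unknown method: " + method)


# Seven players tied at 2-0, numbered #1 to #7 by rating.
standard_drop = downfloater_index(7, "standard") + 1
harkness_drop = downfloater_index(7, "harkness") + 1
```

For the seven-player group this gives #7 under the standard method and #4 under Harkness, matching the description above.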

If Jerry’s method is to bring the upfloated player in and then pair the section ignoring scores (so the upfloated player might be #2 of 8 and paired against #6) then WinTD does not have that option.

That’s not really “this” pairing method, it’s “these three” pairing policies. He really should put each in a separate paragraph.

If he wants to use the Harkness pair-down method, he’s got company, although that company has been dwindling over the years.

If he wants to avoid 3 blacks in a row whenever possible, but occasionally finds it impossible to avoid, well, what else is new?

Not transposing for color alternation is an idea not without merit, but its primary benefit is in small events, and it may be a bit questionable in larger events.

Bill Smythe

Setting the alternation limit to 0 would prevent switches simply to fix color alternation.

As Jeff points out, moving the top player from the lower section up isn’t well defined—is he treated as the lowest player in the higher section (which would give Harkness pairings, which is a preference item under “Pairing Rules”) or is he melded into the score group in rating order? The latter you can’t do with WinTD. Also, even with Harkness pairings WinTD would conceivably pick a different upfloater if it improved equalization in the two score groups. And if “Avoid Three in a Row” is on it would apply to the upfloater decision as well as in the score groups.

But why would this be either better or worse than its mirror-image method, i.e. downfloating the bottom of group A against the middle of group B?

And, if you want closer match-ups among floats, why not adopt a more balanced approach, halfway between Harkness and mirror-image Harkness? You could downfloat the player at the 3/4 mark in group A and upfloat the player at the 1/4 mark in group B.

Or better yet, adjust the 3/4 and 1/4 slightly, as circumstances warrant, with the idea of making the rating differences look more like those in the non-float pairings. In some cases, for example, you might want to downfloat the 7/8 and upfloat the 1/8, in other cases downfloat the 5/8 and upfloat the 3/8, etc.
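
Something like the following would pick the floats, as a rough sketch. What exactly counts as the “3/4 mark” in a group is my own assumption here (position int(fraction × group size) in rating order), and the ratings are invented.

```python
def player_at_mark(group, fraction):
    """Rating of the player at the given fractional position from the top."""
    ranked = sorted(group, reverse=True)
    index = min(len(ranked) - 1, int(fraction * len(ranked)))
    return ranked[index]


group_a = [2200, 2150, 2100, 2050, 2000, 1950, 1900]              # upper score group
group_b = [1880, 1850, 1800, 1750, 1700, 1650, 1600, 1550, 1500]  # lower score group

downfloater = player_at_mark(group_a, 0.75)  # 3/4 mark of group A
upfloater = player_at_mark(group_b, 0.25)    # 1/4 mark of group B
```

Changing the fractions to 7/8 and 1/8, or 5/8 and 3/8, gives the adjusted versions.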

As well it shouldn’t. If it’s done that way, at least the upfloated player’s “rating” (for pairing purposes in this round only) should be decreased by, say, 100 points to compensate for his missing half-point.

Bill Smythe

Harkness pulls the top player in the next score group up, thus (in most cases) making the top group stronger in median rating. The “mirror image” pulls the middle player in the next score group up, thus (in virtually all cases) making the top group weaker in median rating. I think most players in the top group would be unhappy to see the pairing that the downfloater got.

By mirror-image reasoning, in regular Harkness, most players in the lower group would be delighted to see the pairing the upfloater got. Is this any better, allowing the players in the lower group to chortle over what has happened to their highest-rated player?

Bill Smythe

Yes. It is. The fact that “Smythe pairings” weaken the higher group (by design!!!) should be enough to eliminate them from serious consideration.

Harkness does reduce any benefit received from a “Swiss Gambit”, which is often undertaken unintentionally.
Standard (#7 of tied 1-7 vs #8) is generally better in reducing the number of top scores.
Harkness (#4 of tied 1-7 vs #8) gives #8 a pairing that is usually more in line with the difficulty of the other players in his scoregroup, while making it more likely that the higher-ranked player will win and thus have more top scores than otherwise.
Smythe (#7 of tied 1-7 vs #12 of tied 8-16) gives #8 a pairing within his scoregroup and gives #7 an easier pairing than versus #3 or #8. It is as inefficient as Harkness with reducing top scores while giving #7 the benefit of a “Smythe Gambit”.

Also, would Smythe mean that the sole 3-0 would play the middle of three tied 2.5-0.5 players, thus avoiding the strongest competition going into the final round of a 4-round event?
In that particular case Standard and Harkness would both pair the 3-0 against the top 2.5-0.5 (colors allowing).