Swiss Pairing Thought

A predictor-corrector algorithm, such as a Kalman filter, would handle the rising player better than our current system does. Additionally, such a system would bring in process uncertainty (“plant noise”) and increase the covariance over time. That means the quickly-rising player’s rating would have a larger covariance, change faster accordingly, AND affect others’ ratings (i.e., those of established players with plateaued ratings) less. The rising player’s rating would be a measurement with large covariance, thus affecting the state (the opponent’s rating) less than measurements with small covariance do.

Ain’t math fun! (This is why PhD candidates write papers about rating systems!)

If you would prefer those matchups to be in the early rounds, do 1-2 Pairing.

I told my daughter how Viktor Korchnoi swears by oatmeal. She now eats oatmeal for breakfast on tournament days!

I’m sure you realize this is similar to the Glicko-2 rating system.

And why would 1-2 pairings in the early rounds do that, pray tell?

Not quite. The Glicko and Glicko-2 systems do not have a prediction step.

Actually, the rating system is basically a non-linear filter (not Kalman, since the measurement equation isn’t Gaussian). If you actually were to simply apply the Kalman filter (using a quadratic approximation to the measurement equation), it would do none of those wonderful things that you say. The Kalman filter, as a Gaussian updating procedure, has the property that the precision of the posterior is always greater than the precision of the prior, even if the prior and the data disagree strongly with each other. Run it long enough and everyone’s rating is apparently estimated exactly. To prevent that, you have to intervene in some way to keep the filter from collapsing. We do that by restricting how small the variance can go based upon the rating: low ratings are considered to be less accurate than high ones, with the governing formula based upon a loose curve fit to the apparent precision of the ratings. There are other “non-Kalman-update” features, like bonus points. Could you make other “non-Kalman” interventions? Sure. But to say that a Kalman filter would do better is wrong, both mathematically and practically.
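To make that collapse concrete, here is a minimal Python sketch (made-up numbers and noise values, not the actual USCF formulas): a pure Gaussian measurement update always increases the precision, so the variance shrinks on every update no matter how badly the observations disagree with the prior or with each other.

```python
def gaussian_update(prior_mean, prior_var, obs, obs_var):
    """One scalar Kalman-style measurement update."""
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var   # always <= prior_var
    return post_mean, post_var

mean, var = 1500.0, 200.0 ** 2
for n in range(1, 21):
    # wildly inconsistent "performance" observations
    obs = 2100.0 if n % 2 else 900.0
    mean, var = gaussian_update(mean, var, obs, obs_var=350.0 ** 2)
    print(f"update {n:2d}: mean = {mean:7.1f}, sd = {var ** 0.5:6.1f}")
# The standard deviation marches toward zero anyway, which is why some
# outside intervention (variance floors, bonus points, etc.) is needed.
```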

Prediction step: R(t) = R(t-1), sigma(t) = some function of sigma(t-1), time since last tournament, and other stuff.

R(t)=R(t-1) is not a prediction – it’s just using the last known rating.

A prediction is of the form R(t+∆t) = f(R(t), R’(t), ∆t).

Glicko does not even estimate R’(t). Thus, it is a one-state filter which cannot do prediction.

I’m using R(t) to represent the prior mean. If x(t) is the unknown true strength (which is what it looks like you mean), then the assumption underlying the current USCF system is x(t+∆t)=x(t), and the assumption underlying Glicko is x(t+∆t)=x(t)+v h(∆t), where v is a mean-zero random variable and h is some deterministic function of time. Both of those are perfectly good dynamical systems, and in both cases the prediction step for filtering is R(t+∆t)=R(t), with sigma(t+∆t)=sigma(t)+stuff.
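A tiny sketch of that prediction (time-update) step, with an assumed inflation function standing in for h (the particular form and constants are just for illustration, not Glicko’s actual numbers): the mean is carried forward unchanged while the uncertainty grows with elapsed time.

```python
def predict(R, sigma, dt, c=35.0, sigma_max=350.0):
    """Carry the mean forward unchanged; let the uncertainty grow with time."""
    R_next = R                                                # x(t + dt) = x(t)
    sigma_next = min((sigma ** 2 + c ** 2 * dt) ** 0.5, sigma_max)  # assumed h(dt)
    return R_next, sigma_next

R, sigma = 1800.0, 50.0
R, sigma = predict(R, sigma, dt=4.0)   # e.g., four rating periods of inactivity
print(R, round(sigma, 1))              # mean unchanged, sigma larger
```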

“Um, it was my understanding that there was to be no math …” (Chevy Chase, as Gerald Ford in a presidential debate on an SNL skit a few decades ago.)

Actually, it wouldn’t. After one round, the 1-point score group would have all the players due white at the top, and those due black at the bottom. They would be paired against each other, in true top-half-vs-bottom-half fashion.

Anyway – if the objective of your tournament is to reduce the number of perfect scores at the end, there are better ways to do this than the Swiss.

During the Fischer boom around 1972-1973, there was a huge (for its time) scholastic tournament in Chicago, just 5 rounds for at least 300 players, all in one big section. The first 3 rounds were paired as a straight Swiss (not even accelerated pairings).

So, with just 2 rounds to go, there were still dozens of perfect scores. Organizer Richard Verber then took over the pairings from his TD. In round 4, he extended the “bottom player in group A plays top player in group B” idea (normally used when there are an odd number of players in group A) by pairing a whole bunch of the bottom 3.0s against higher-rated opponents with 2.5, 2.0, and even 1.5 points. He was hoping that the large rating differences would clobber the score differences in the pairings.

Verber was right. The lower-rated 3-pointers lost massively to their lower-scoring but higher-rated opponents. After just that one round, there remained only 3 players perfect at 4.0.

In round 5, the top two 4.0s drew each other, while the third (much lower-rated) lost to a higher-rated opponent in a lower score group. So the tournament ended up with zero perfect scores.

Quantify that.

Bill Smythe

  1. When I said “wins with black” is a possible tiebreaker, I was merely pointing out that such a tiebreaker is an example of one that is not going to be the same at the end of one round. I was not suggesting that this be the first or only tiebreaker used.

  2. My discussion of using tiebreakers instead of pre-tournament ratings to order players within a score group was simply a question challenging the status quo, since there (so far) seems to be no theoretically justifiable and/or documented reason for it.

I can’t see how someone who worries about the accuracy of ratings is also proposing to use tiebreaks to determine rank within a score group. Tiebreaks for the first few rounds are pretty close to random. For example, after the first round everyone in the 1-0 score group played a person in the 0-1 score group. The wins-with-black tiebreak only gives you two values, so it really doesn’t help much. Basically you’re left with rating.

Later in the tournament you will be determining rank based on the records of players’ opponents. While we justify this as a way to determine trophies, it is a very noisy measurement of ability.

The interesting idea that was proposed before is to try to give everyone within a score group the same average rating of opponents. The thing I like about that system is that it seeks to minimize the advantage of the higher-rated players, who get paired down much more often than lower-rated players do. Just look at the average rating of opponents in an even score group in a large open Swiss. On average, the low-rated players have played much higher-rated opponents. This is one of the things that led to Greg Shahade’s proposal of random pairings within a score group.
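For what it’s worth, here is a brute-force sketch (hypothetical numbers and data layout, just to illustrate the idea) of what “equalize the average opponent rating within a score group” might look like for a small group: enumerate every way to pair the group and keep the matching that minimizes the spread of the resulting average opponent ratings.

```python
from statistics import pvariance

# Each player: (own rating, sum of opponents' ratings so far, games played so far)
group = [(2300, 6000, 3), (2150, 6360, 3), (2000, 6630, 3),
         (1900, 6540, 3), (1850, 6750, 3), (1800, 6690, 3)]

def matchings(players):
    """Yield every way to pair up an even-sized list of players."""
    if not players:
        yield []
        return
    first, rest = players[0], players[1:]
    for i, opp in enumerate(rest):
        for m in matchings(rest[:i] + rest[i + 1:]):
            yield [(first, opp)] + m

def spread(matching):
    """Variance of the players' average opponent ratings after this round."""
    avgs = []
    for a, b in matching:
        for me, opp in ((a, b), (b, a)):
            _, opp_sum, games = me
            avgs.append((opp_sum + opp[0]) / (games + 1))
    return pvariance(avgs)

best = min(matchings(group), key=spread)
for a, b in best:
    print(a[0], "vs", b[0])
```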

Mike Regan

I’m just spitballing here, but what if instead of upper half vs lower half we ALWAYS divided each score group into 4 parts and paired group 1 vs 2 and group 3 vs 4?

In the first round, this would be the same as accelerated pairings.

For a 16 player event we would have:

(These are not balanced for color.)

1 vs 5
2 vs 6
3 vs 7
4 vs 8
9 vs 13
10 vs 14
11 vs 15
12 vs 16

Assume the higher rated players all win.

In round 2 we have:

1 vs 3 (1 point group)
2 vs 4
9 vs 11
10 vs 12

5 vs 7 (0 point group)
6 vs 8
13 vs 15
14 vs 16

In round 3 we would have 1, 2, 9, and 10 in the 2-point group; 3, 4, 11, 12, 5, 6, 13, and 14 in the 1-point group; and 7, 8, 15, and 16 in the 0-point group.

So in round 3 we have:

1 vs 2 (2 point group)
9 vs 10

3 vs 5 (1 point group)
4 vs 6
11 vs 13
12 vs 14

7 vs 8 (0 point group)
15 vs 16

Seems to me we’ve done a pretty good job of matching up players against similarly ranked opponents in all three rounds. Is this an improvement over traditional accelerated pairings?

Round 4 would have 1 vs 9 in the 3 point group, the first time an upper half player has faced a lower half player. But by then I would expect several draws to have occurred. And in larger tournaments, I don’t think we’d see any upper half vs lower half pairings until at least round 5.
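Here is a rough sketch of that quarter-pairing scheme for the 16-player example, under the same simplifying assumptions (the lower seed always wins, no color or rematch constraints, and score groups whose size is not a multiple of 4 just fall back to plain top-half-vs-bottom-half, since those cases have to be guesstimated anyway):

```python
def pair_group(group):
    """group: seeds in a score group, sorted best-rated first."""
    n = len(group)
    if n % 4 == 0:
        q = n // 4
        quarters = [group[i * q:(i + 1) * q] for i in range(4)]
        return list(zip(quarters[0], quarters[1])) + list(zip(quarters[2], quarters[3]))
    half = n // 2
    return list(zip(group[:half], group[half:]))   # fallback when 4 equal quarters won't fit

scores = {seed: 0 for seed in range(1, 17)}
for rnd in range(1, 4):
    print(f"Round {rnd}:")
    pairings = []
    for pts in sorted(set(scores.values()), reverse=True):
        group = sorted(s for s in scores if scores[s] == pts)
        for a, b in pair_group(group):
            print(f"  {a} vs {b}  ({pts}-point group)")
            pairings.append((a, b))
    for a, b in pairings:
        scores[min(a, b)] += 1   # assume the higher-rated (lower-seeded) player wins
print("Scores after round 3:", scores)
```

With these assumptions it reproduces the three rounds above; the irregular 2- and 6-player groups in round 4 depend on how you choose to split them.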

Absolutely! That’s what “maneuver detection” algorithms are for!

Seems to me like this thread needs to be split into two, one dealing with the theoretical math and one dealing with pairings issues.

About three pages too late :smiley: …at this point just rename the thread!

For what it is worth, I think you’re conflating two separate people. I was the one who suggested ranking by tiebreaks. I’m not at all worried about rating accuracy (that was someone else).

In round 4, you would have:

3 points
1 v 9

2 points
2 v 3
4 v 10
11 v 12

1 point
5 v 6
7 v 13
14 v 15

0 points
8 v 16

(Note that for the 2- and 1-point groups, it is impossible to break them up into four equal pieces, so I guesstimated what you might want.)

Assuming all the higher rated players won and the tournament was 4 rounds, the final standings would be:

1: 4
2, 4, 9, 11: 3
3, 5, 7, 10, 12, 14: 2
6, 8, 13, 15: 1
16: 0

This doesn’t make much sense to me – judging by fairness-in-pairings standards – as player 11 (for instance) was almost always paired down, just by virtue of the fact that player 11 was at an advantageous spot in the wall chart (i.e., in the third quartile three of four times).

Am I interpreting this correctly?

The top half vs bottom half pairing, in order from top to bottom, will maximize the sum and also the product of the expected scores of the top-half players (due to the concavity of the expected score function for positive rating differences, and of the log expected score).
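A quick numeric sanity check of that claim, using the standard logistic expected-score formula and made-up ratings (both are assumptions for illustration): compare the in-order matching against every other assignment of bottom-half opponents.

```python
from itertools import permutations

def expected(ra, rb):
    """Logistic (Elo-style) expected score for a player rated ra against rb."""
    return 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))

top    = [2200, 2100, 2000, 1900]   # hypothetical top half, in rating order
bottom = [1850, 1750, 1650, 1550]   # hypothetical bottom half, in rating order

def sum_and_product(bottom_order):
    exps = [expected(t, b) for t, b in zip(top, bottom_order)]
    prod = 1.0
    for e in exps:
        prod *= e
    return sum(exps), prod

best_sum  = max(permutations(bottom), key=lambda bs: sum_and_product(bs)[0])
best_prod = max(permutations(bottom), key=lambda bs: sum_and_product(bs)[1])
print("in-order:", sum_and_product(tuple(bottom)))
print("sum  maximized by bottom order:", best_sum)
print("prod maximized by bottom order:", best_prod)
# With these numbers both maxima occur at the in-order assignment
# (1850, 1750, 1650, 1550), consistent with the concavity argument above.
```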