You confirmed that statement in practice with your example, but it’s also true if you look at it in theory.
It’s easiest to see if you use the old 16 plus-or-minus 4% linear approximation, which corresponds roughly to a K-factor of 32. Under that approximation, a 1500 who draws a 1700 will gain 8 rating points, because 8 is 4% of the 200-point rating difference. (The “16 plus-or-minus” part doesn’t enter the picture if the game is drawn.)
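For anyone who wants to check the numbers, the approximation can be written as a short function. This is only a sketch: the name `linear_change` and the `score` convention (1 for a win, 0.5 for a draw, 0 for a loss) are my own choices for illustration, and it ignores any caps, floors, or bonus points that a real rating system might apply.

```python
def linear_change(own, opp, score):
    """Approximate rating change for one game under the '16 +/- 4%' rule.

    score: 1 for a win, 0.5 for a draw, 0 for a loss (illustrative convention).
    """
    base = 32 * (score - 0.5)          # +16 for a win, 0 for a draw, -16 for a loss
    adjustment = 0.04 * (opp - own)    # 4% of the rating difference
    return base + adjustment


# A 1500 playing a 1700: the draw is worth 8 points.
for result, score in [("win", 1), ("draw", 0.5), ("loss", 0)]:
    print(result, round(linear_change(1500, 1700, score), 1))
```

With a drawn game the base term is zero, so only the 4% adjustment is left.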
Suppose the 1500 enters a 25-round tournament and gets 25 draws, all against opponents rated 1700. He will gain 8 points per game, for a total gain of 8 times 25, which is 200 points. His post-tournament rating will be 1700.
Meanwhile, his performance rating (i.e. his rating based on just the one event) will also come out 1700. (What else? He has just drawn 25 players, all rated 1700.)
So, with a 25-round tournament, the player’s rating has caught up completely with his performance rating. The 200-point “error” has worked its way out within 25 games, just as you said above.
It follows that, with a 30-round tournament for example, the player’s new rating would actually overshoot his performance rating. The player will end up at 1740, forty points higher than he “deserves”. That’s one reason (I suppose) that the US Chess rating software limits the number of rounds per tournament. (I think the limit is 14 or 20 or something like that.)
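The 25-round and 30-round cases can be checked with a few lines of code. This is a rough sketch under the same linear approximation, assuming every game in the event is rated against the pre-event rating; the function name `rate_all_draws` is made up for this example and is not how any actual rating software is organized.

```python
def rate_all_draws(pre_rating, opp_rating, rounds, pct=0.04):
    """Post-event rating when every game is a draw against an opponent
    rated opp_rating, with all games rated against pre_rating."""
    gain_per_game = pct * (opp_rating - pre_rating)
    return pre_rating + rounds * gain_per_game


performance = 1700  # all draws against 1700s
for rounds in (25, 30):
    post = rate_all_draws(1500, 1700, rounds)
    print(rounds, "rounds:", round(post), "overshoot:", round(post) - performance)
```

At 25 rounds the overshoot is zero; at 30 rounds it is 40 points.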
Going back to the 25-round example, if the tournament is broken up into five 5-round events, then the correction will come more slowly, because the post-event rating for each 5-round event becomes the pre-event rating for the next event. The player will jump from 1500 to 1540 in the first event, but he will not reach 1580 in the second, because the rating difference (between the player and his five 1700 opponents) is now only 160 instead of 200. 4% of 160 is 6.4 (instead of 8.0), and 6.4 times 5 rounds is 32 (instead of 40), so after his second event he will be at 1572 instead of 1580.
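The event-by-event arithmetic can be carried forward with a small loop. Again this is only a sketch under the linear approximation, with all draws against 1700-rated opponents; `rate_in_chunks` is a hypothetical helper name, and the one-decimal rounding is just for display.

```python
def rate_in_chunks(start, opp_rating, events, rounds_per_event, pct=0.04):
    """Rate a series of all-draw events one after another.

    Each event uses the previous event's post-event rating as its
    pre-event rating. Returns the rating after each event.
    """
    rating = start
    history = []
    for _ in range(events):
        gap = opp_rating - rating            # shrinks after every event
        rating += rounds_per_event * pct * gap
        history.append(round(rating, 1))
    return history


# Five 5-round events instead of one 25-round event.
print(rate_in_chunks(1500, 1700, 5, 5))
```

The first two entries are the 1540 and 1572 worked out above; each later event gains less than the one before it, because the gap to 1700 keeps shrinking.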
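Because of this smoothing action, the number of games required to wipe out the error will be more than 25. Strictly speaking, under this approximation each 5-round event removes only 20% of the remaining gap, so the error shrinks geometrically and never quite reaches zero. After 35 games (seven 5-round events), about 42 of the original 200 points would still remain.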
Bill Smythe