The Wilcoxon test loses power because, even when there are no ties in any of the senses mentioned earlier, many distinct patterns of ranks still produce the same rank sum. For instance, an S of 6 might be 1+2+3, or 2+4, or 1+5, or a single rank of 6. Recall that p is the probability of finding a value of S smaller than or equal to the one observed. Because four different patterns in this example all yield an S of 6, all four are counted in the p computed when any one of them is observed. That raises the value of p and thus lowers the power of the test. An alternative test, based on normal scores, avoids this disadvantage.
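The counting argument can be checked by brute force. The sketch below is illustrative only: it assumes a sample of n = 12 untied ranks (a size chosen here for illustration, not stated in the text) and assumes that, under the null hypothesis, each rank enters S independently with probability 1/2. The function names are hypothetical.

```python
from itertools import combinations

def rank_subsets_with_sum(n, target):
    """All sets of distinct ranks from 1..n whose sum equals target."""
    return [
        combo
        for size in range(n + 1)
        for combo in combinations(range(1, n + 1), size)
        if sum(combo) == target
    ]

def exact_p(n, s_observed):
    """P(S <= s_observed), assuming each of the 2**n rank subsets is equally likely."""
    count = sum(
        1
        for size in range(n + 1)
        for combo in combinations(range(1, n + 1), size)
        if sum(combo) <= s_observed
    )
    return count / 2 ** n

# Assumed sample size of 12; any n >= 6 gives the same four patterns.
patterns = rank_subsets_with_sum(12, 6)
print(patterns)        # [(6,), (1, 5), (2, 4), (1, 2, 3)]
print(exact_p(12, 6))  # 14 of the 4096 subsets have a sum of 6 or less
```

Of the 14 subsets counted in p, four have a sum of exactly 6, so observing any one of them yields the same p as observing the others.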
When either the Wilcoxon test or the alternative normal-scores test is applied to gain scores, a significant result surprisingly does not allow the conclusion that the pretest and posttest scores differ. Consider, for instance, the following set of hypothetical data:
| Pretest scores | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| Posttest scores | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 1 |
| Gain scores | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -11 |
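The behavior of the Wilcoxon test on these data can be worked through numerically. The following is a minimal sketch, not a definitive implementation. It assumes that S is the sum of the ranks of the negative gains, that tied absolute gains receive their average rank, and that the one-sided p is found by enumerating all equally likely sign patterns. The helper `midranks` is a hypothetical name introduced for illustration.

```python
from itertools import product

pretest = list(range(1, 13))
posttest = list(range(2, 13)) + [1]
gains = [post - pre for pre, post in zip(pretest, posttest)]

def midranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    # First and last 1-based positions of v in the sorted list, averaged.
    return [
        (ordered.index(v) + 1 + ordered.index(v) + ordered.count(v)) / 2
        for v in values
    ]

ranks = midranks([abs(g) for g in gains])  # eleven ranks of 6.0, one of 12.0
s_neg = sum(r for r, g in zip(ranks, gains) if g < 0)
s_pos = sum(r for r, g in zip(ranks, gains) if g > 0)

# Exact one-sided p: the share of the 2**12 sign patterns whose
# negative-rank sum is no larger than the observed one.
count = sum(
    1
    for signs in product((0, 1), repeat=len(ranks))
    if sum(r for r, neg in zip(ranks, signs) if neg) <= s_neg
)
p_one_sided = count / 2 ** len(ranks)

print(sum(gains))                            # 0: the mean gain is zero
print(sorted(pretest) == sorted(posttest))   # True: both rows hold the same twelve values
print(s_neg, s_pos)                          # 12.0 66.0
print(count, p_one_sided)                    # 68 0.0166015625
```

Under these assumptions the one-sided p is about .017, which is significant at the .05 level, even though the twelve pretest scores and the twelve posttest scores are the same set of values and the mean gain is exactly zero.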