The logic behind the test

Copyright © Richard B. Darlington. All rights reserved.

For simplicity, assume no ties of any sort. In every sample the N ranks will range from 1 to N. The only thing in doubt is which ranks will be attached to scores below C and which to scores above. Think of scores below C as having a - sign attached to their ranks, while scores above C have a + sign attached. Since each of the N ranks can get either a - or a +, there are altogether 2^N ways that signs can be attached. If scores are distributed symmetrically around C, then all these ways are equally likely. Therefore, to calculate the significance level p for any one of these ways, we must rank the 2^N patterns by their values of S and find the proportion of patterns at or below the observed pattern.
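The counting argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name one_tailed_p is hypothetical, S is taken to be the sum of the ranks carrying a + sign, and "at or below" is read as "S no larger than the observed value". Brute-force enumeration is practical only for small N.

```python
from itertools import product

def one_tailed_p(n, s_plus_observed):
    """Proportion of all sign patterns whose S+ is at or below the observed S+.

    Assumes no ties, so the ranks are exactly 1..n, and assumes every
    sign pattern is equally likely (symmetry around C).
    """
    ranks = range(1, n + 1)
    total = 0       # number of sign patterns seen (ends at 2**n)
    at_or_below = 0  # patterns with S+ <= observed S+
    for signs in product((-1, +1), repeat=n):
        # S+ is the sum of the ranks that carry a + sign
        s_plus = sum(r for r, s in zip(ranks, signs) if s > 0)
        total += 1
        if s_plus <= s_plus_observed:
            at_or_below += 1
    return at_or_below / total

# With N = 5 there are 32 patterns; only three of them have S+ <= 2
# (no + signs, + on rank 1 only, + on rank 2 only).
print(one_tailed_p(5, 2))  # 0.09375
```

Ranking by the sum of the - ranks instead would give the same answer for the opposite tail, since the two sums always add up to the sum of all N ranks.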

For instance, if N = 3 then there are just 8 possible patterns:

Rank:  1  2  3     S+  S-
       -  -  -      0   6
       +  -  -      1   5
       -  +  -      2   4
       +  +  -      3   3
       -  -  +      3   3
       +  -  +      4   2
       -  +  +      5   1
       +  +  +      6   0

The second row in this table, with signs + - -, refers to the case in which the score closest to C (with a rank of 1) falls above C (and therefore gets a +), while the other two scores fall below C. Column S- shows the sum of the ranks with a - sign, while column S+ shows the sum of the ranks with a + sign. The 8 possible outcomes are arranged in decreasing order of S- and increasing order of S+. Under the null hypothesis of symmetry, all 8 possible outcomes are equally likely. Since two outcomes yield S- values of 5 or higher (or equivalently S+ values of 1 or lower), the one-tailed p associated with S- = 5 is 2/8 or .25.
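As a check on the N = 3 example, the following sketch rebuilds the eight rows and recomputes the one-tailed p for S- = 5. The variable names are illustrative only, and rows that tie on S+ may print in a different order than in the table above.

```python
from itertools import product

N = 3
ranks = range(1, N + 1)

rows = []
for signs in product("-+", repeat=N):
    # Sum the ranks separately for + signs and for - signs
    s_plus = sum(r for r, s in zip(ranks, signs) if s == "+")
    s_minus = sum(r for r, s in zip(ranks, signs) if s == "-")
    rows.append((signs, s_plus, s_minus))

# Increasing S+ is the same as decreasing S-, because S+ + S- = N(N+1)/2
rows.sort(key=lambda row: row[1])
for signs, s_plus, s_minus in rows:
    print(" ".join(signs), s_plus, s_minus)

# One-tailed p for S- = 5: share of patterns with S- of 5 or higher
p = sum(1 for _, _, s_minus in rows if s_minus >= 5) / len(rows)
print(p)  # 0.25
```

Because S+ and S- always sum to 6 when N = 3, counting patterns with S- of 5 or higher gives the same result as counting patterns with S+ of 1 or lower.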