The sign test from exact numbers of combinations

Copyright © Richard B. Darlington. All rights reserved.

Suppose you have observed 10 cases above a hypothesized median and 2 cases below it. You want to know how likely it would be to find a split this uneven, or more uneven, if the hypothesis were correct.

The number of ways that 12 cases can be divided into groups of size 10 and 2 is 12!/(10! 2!) = 66. However, a significance level p is defined as the probability of finding an equal or larger deviation from the null hypothesis than the observed deviation. Therefore you must also compute the number of ways that 12 cases can be divided into groups of 11 and 1, or into groups of 12 and 0. The former of these is 12!/(11! 1!) = 12 while the latter is 12!/(12! 0!) = 1. Thus the total number of ways that you might have found 10 or more cases above the hypothesized median is 66+12+1 = 79. The total number of ways you could have divided 12 cases into two groups, regardless of the sizes of the groups, is 2^12 = 4096. Under the null hypothesis, all these various ways are equally likely. Therefore p = 79/4096 = .0193 (one-tailed).
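
The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: it assumes Python 3.8 or later (for math.comb), and the variable names are chosen here for illustration.

```python
from math import comb

n = 12          # total cases above or below the hypothesized median
observed = 10   # the larger of the two observed frequencies

# Ways to split 12 cases into 10/2, 11/1, and 12/0.
tail_counts = [comb(n, k) for k in range(observed, n + 1)]

favorable = sum(tail_counts)   # splits at least as extreme as the one observed
total = 2 ** n                 # all possible splits of 12 cases into two groups

p_one_tailed = favorable / total

print(tail_counts, favorable, total, round(p_one_tailed, 4))
```

The three counts, their sum, and the total correspond to the 66, 12, 1, 79, and 4096 worked out above.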

The general rules for this calculation are:

  1. Compute N!/[X! (N-X)!], where N is the total number of cases above or below the median and X is the larger of the two observed cell frequencies. This is called the number of combinations of N things taken X at a time, and is sometimes denoted C(N,X).
  2. Recompute C(N,X) after increasing X by 1, and keep doing so up through X = N. Note that 0! = 1.
  3. Sum the various values of C(N,X).
  4. Divide this sum by 2^N to find the one-tailed p.
  5. If a two-tailed p is desired, double the result of step 4.
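
The five steps can be sketched as a small Python function. The function name sign_test_p and its parameters are hypothetical, introduced only for this illustration. Capping the doubled value at 1 is an assumption added here, since doubling can exceed 1 when the two frequencies are nearly equal; the rules above say only to double.

```python
from math import comb

def sign_test_p(above, below, two_tailed=False):
    """Sign-test p computed from exact counts of combinations."""
    n = above + below          # N: total cases above or below the median
    x = max(above, below)      # X: the larger of the two cell frequencies

    # Steps 1-3: sum C(N, X), C(N, X+1), ..., C(N, N).
    tail_sum = sum(comb(n, k) for k in range(x, n + 1))

    # Step 4: divide by 2^N for the one-tailed p.
    p = tail_sum / 2 ** n

    # Step 5: double for a two-tailed p (capped at 1; the cap is an
    # assumption of this sketch, not part of the rules above).
    if two_tailed:
        p = min(1.0, 2 * p)
    return p

print(sign_test_p(10, 2))                   # one-tailed
print(sign_test_p(10, 2, two_tailed=True))  # two-tailed
```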

If N is large it may simplify computations to note that the ratio between any two adjacent values in step 2 is always the ratio of two integers. For instance, [12!/(11! 1!)]/[12!/(10! 2!)] = 2/11 and [12!/(12! 0!)]/[12!/(11! 1!)] = 1/12. Therefore tables or formulas for C(N,X) are not really needed in step 2 even if they are needed in step 1.
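
In general the ratio of adjacent values is C(N,X+1)/C(N,X) = (N-X)/(X+1), which gives 2/11 and 1/12 in the example. A minimal sketch of using that ratio to generate the values for step 2 follows; the function name upper_tail_counts is hypothetical, and math.comb is used only for the starting value in step 1.

```python
from math import comb

def upper_tail_counts(n, x):
    """Return [C(n, x), C(n, x+1), ..., C(n, n)] using the adjacent-term ratio."""
    term = comb(n, x)          # step 1: the only value computed directly
    counts = [term]
    for k in range(x, n):
        # C(n, k+1) = C(n, k) * (n - k) / (k + 1); the division is always
        # exact, so integer arithmetic is safe here.
        term = term * (n - k) // (k + 1)
        counts.append(term)
    return counts

print(upper_tail_counts(12, 10))
```

Multiplying before dividing keeps every intermediate value an integer, so no rounding error accumulates even when N is large.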

If you are familiar with the general binomial formula you may have noticed how much simpler the procedure here is than that general formula, which includes powers of the null proportions P0 and (1 - P0). These powers can be omitted here because P0 = .5 = 1 - P0. Therefore every term in the numerator or denominator includes P0^N, so these values all cancel out.
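
As a numerical check on this simplification, the sketch below computes the upper tail both ways: from the general binomial formula with P0 = .5, and from the count of combinations divided by 2^N. Both function names are hypothetical and chosen only for this comparison.

```python
from math import comb, isclose

def binomial_upper_tail(n, x, p0):
    """General binomial formula: sum of C(n,k) * p0^k * (1-p0)^(n-k) for k = x..n."""
    return sum(comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
               for k in range(x, n + 1))

def sign_test_upper_tail(n, x):
    """Simplified form valid when p0 = .5: sum of C(n,k) for k = x..n, over 2^n."""
    return sum(comb(n, k) for k in range(x, n + 1)) / 2 ** n

general = binomial_upper_tail(12, 10, 0.5)
simplified = sign_test_upper_tail(12, 10)

print(general, simplified, isclose(general, simplified))
```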