Copyright © Richard B. Darlington. All rights reserved.
To understand lambda, imagine rational judges trying to guess the column membership of cases drawn from a sample, when each judge knows the cell frequencies for the sample. Each case is drawn exactly once, without replacement, but the order in which cases are drawn is random. An "informed" judge is told the row membership of each case before guessing, while a "blind" judge is not. Both judges try to maximize the number of correct guesses. Then lambda is the difference between the numbers of correct guesses of the two judges, expressed as a proportion of the difference between the numbers of correct guesses of the blind judge and of a hypothetical perfect judge.
The rational blind judge will always guess the column with the largest total frequency, since that gives him or her the highest chance of being right. For similar reasons, the rational informed judge will always guess the column with the largest cell frequency in the row in which he or she knows each case to fall. For instance, consider the frequency table
| | I | II | III | IV |
| A | 36* | 11 | 15 | 21 |
| B | 14 | 28 | 51* | 32 |
| C | 18 | 45* | 32 | 12 |
| Total | 68 | 84 | 98 | 65 |
The blind judge would note that column 3 has the largest total frequency, so that judge can maximize the number of correct guesses by guessing column 3 for all cases, thus making 98 correct guesses. The highest frequency in each row is starred; the informed judge would make those guesses, and would thus make 36+51+45 or 132 correct guesses. A hypothetical perfect judge would be correct for all 315 cases. Therefore lambda = (132 - 98)/(315 - 98) = .157.
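The arithmetic above can be reproduced with a short Python sketch. The function name `lambda_columns_from_rows` is illustrative only and does not come from the text; the table is assumed to be stored as a list of rows of cell frequencies.

```python
def lambda_columns_from_rows(table):
    """Lambda for guessing column membership, given a list of rows of frequencies."""
    n = sum(sum(row) for row in table)
    column_totals = [sum(col) for col in zip(*table)]
    blind = max(column_totals)                 # blind judge: always guess the largest column
    informed = sum(max(row) for row in table)  # informed judge: largest cell in each row
    perfect = n                                # perfect judge: every case correct
    return (informed - blind) / (perfect - blind)


table = [
    [36, 11, 15, 21],  # row A
    [14, 28, 51, 32],  # row B
    [18, 45, 32, 12],  # row C
]
lam = lambda_columns_from_rows(table)
print(round(lam, 3))  # 0.157, i.e. (132 - 98)/(315 - 98)
```

The function divides by (perfect - blind), so it is undefined for a table in which all cases fall in a single column.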
Lambda has a frequency interpretation. In our terminology, lambda defines the target set of cells to include the cell with the highest frequency within each row. Once the target set is defined, lambda fits the basic formula (observed - null)/(max - null) required for a frequency interpretation. Lambda has difference proportionality but lacks unique zero, as the following table illustrates:
| | I | II | III |
| A | 50 | 49 | 1 |
| B | 50 | 2 | 48 |
| Total | 100 | 51 | 49 |
This table shows substantial association in one sense, but lambda is zero because the same column (column I) has the highest frequency in each row. Thus the judge who knows row membership will make the very same guesses as the judge who does not, and lambda will be zero.
If you think of judges as getting points for the accuracy of their guesses, then lambda assumes that each judge receives one point for every correct guess. In MR we imagine that the judge doesn't merely guess one column, but rather ranks the columns in the order of their probability, with the least likely column ranked 1. Then the judge receives a number of points equal to the rank the judge assigned to the column the case was actually in. Otherwise MR is defined like lambda; MR equals the difference in points received by informed and blind judges, expressed as a proportion of the difference between perfect and blind judges. Thus MR has a weighted frequency interpretation.
For the 2 x 3 table shown above, the blind judge works with just the column totals, basing guesses on the fact that columns 1, 2, 3 have successively declining totals. The blind judge thus assigns a rank of 3 to column 1, 2 to column 2, and 1 to column 3. The column totals are respectively 100, 51, 49, so the number of points this judge receives is 3*100 + 2*51 + 1*49 = 451. The informed judge receives 3*50 + 2*49 + 1*1 or 249 points for guesses in row 1 plus 3*50 + 2*48 + 1*2 or 248 points for guesses in row 2, or 497 points total. The hypothetical perfect judge receives 3 points for each of the 200 cases, scoring 600. Thus MR = (497 - 451)/(600 - 451) = .3087. Recall that lambda for this table is 0. However, MR always equals lambda when there are only two columns.
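The rank-scoring calculation can be sketched in Python as follows. The names `rank_points` and `mr_stat` are hypothetical helpers introduced here for illustration. Tied frequencies are ranked in arbitrary order, which does not change the points because tied columns contribute equal frequencies.

```python
def rank_points(freqs):
    """Points won when columns are ranked by frequency, least frequent = rank 1."""
    return sum(rank * f for rank, f in enumerate(sorted(freqs), start=1))


def mr_stat(table):
    """MR for a table given as a list of rows of cell frequencies."""
    k = len(table[0])                                   # number of columns = highest rank
    n = sum(sum(row) for row in table)
    column_totals = [sum(col) for col in zip(*table)]
    blind = rank_points(column_totals)                  # ranks based on column totals only
    informed = sum(rank_points(row) for row in table)   # separate ranking within each row
    perfect = k * n                                     # top rank earned on every case
    return (informed - blind) / (perfect - blind)


mr_2x3 = mr_stat([[50, 49, 1], [50, 2, 48]])
print(round(mr_2x3, 4))  # 0.3087, i.e. (497 - 451)/(600 - 451)

# With only two columns MR reduces to lambda; for this table both are zero.
mr_2x2 = mr_stat([[99, 1], [51, 49]])
print(mr_2x2)  # 0.0
```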
Both lambda and MR lack the unique-zero property. Both are zero for the following table, even though its two rows have clearly different distributions:
| 99 | 1 |
| 51 | 49 |
Lambda is affected far more by cell frequencies well above their expected values than by those below, while MR is more symmetric in this respect. Consider for instance three 6 x 6 tables. Except for the cells in the upper-left-to-lower-right diagonal, suppose each table has a frequency of 100 in each cell. In that diagonal, all entries in table A are 0, all in table B are 100, and all in C are 200. Table B exhibits exact independence, and in that table both lambda and MR are 0. In table C, both measures are .143. But in table A, MR is higher still at .200, while lambda is only .04. Yet proportionally speaking, the A-B difference between 0 and 100 is far larger than the B-C difference between 100 and 200. This property of lambda seems odd and undesirable for many purposes.
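The figures quoted for tables A, B, and C can be checked with a small Python sketch. All function names here (`lambda_stat`, `mr_stat`, `diagonal_table`) are illustrative and not taken from the text.

```python
def lambda_stat(table):
    """Lambda: one point per correct guess."""
    n = sum(sum(row) for row in table)
    blind = max(sum(col) for col in zip(*table))
    informed = sum(max(row) for row in table)
    return (informed - blind) / (n - blind)


def rank_points(freqs):
    """Rank-based points, with the least frequent column ranked 1."""
    return sum(rank * f for rank, f in enumerate(sorted(freqs), start=1))


def mr_stat(table):
    """MR: points equal to the rank assigned to the correct column."""
    n = sum(sum(row) for row in table)
    blind = rank_points([sum(col) for col in zip(*table)])
    informed = sum(rank_points(row) for row in table)
    perfect = len(table[0]) * n
    return (informed - blind) / (perfect - blind)


def diagonal_table(diagonal, off_diagonal=100, size=6):
    """Square table with one value on the main diagonal and another elsewhere."""
    return [[diagonal if i == j else off_diagonal for j in range(size)]
            for i in range(size)]


results = {}
for name, diagonal in [("A", 0), ("B", 100), ("C", 200)]:
    t = diagonal_table(diagonal)
    results[name] = (lambda_stat(t), mr_stat(t))
    print(name, round(results[name][0], 3), round(results[name][1], 3))
# A 0.04 0.2
# B 0.0 0.0
# C 0.143 0.143
```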
In MP, each judge assigns a probability to every column, equal to the observed proportion of cases in that column, and wins points equal to the probability assigned to the column the case is actually in. We illustrate with the last 2 x 2 table above. The blind judge notes that 150 of the 200 cases are in column 1 and thus assigns probabilities of .75 and .25 to the two columns. The points won by this strategy are 150*.75 + 50*.25 or 125. The informed judge assigns probabilities of .99 and .01 to the two columns for row 1, winning 99*.99 + 1*.01 or 98.02 points for that row. For row 2 this judge wins 51*.51 + 49*.49 or 50.02 points, making 148.04 points altogether. The hypothetical perfect judge wins one point for each of the 200 cases. Thus MP = (148.04 - 125)/(200 - 125) = .3072. Both lambda and MR are zero for this table.
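The same calculation can be written as a brief Python sketch. It assumes, as in the worked example, that each judge states the observed proportions as probabilities and wins points equal to the probability assigned to the correct column. The names `probability_points` and `mp_stat` are hypothetical.

```python
def probability_points(freqs):
    """Points won when stated probabilities equal the observed proportions.

    Assumes the frequencies do not all equal zero.
    """
    total = sum(freqs)
    return sum(f * (f / total) for f in freqs)


def mp_stat(table):
    """MP for a table given as a list of rows of cell frequencies."""
    n = sum(sum(row) for row in table)
    column_totals = [sum(col) for col in zip(*table)]
    blind = probability_points(column_totals)                 # uses column totals only
    informed = sum(probability_points(row) for row in table)  # row-specific proportions
    perfect = n                                               # one point per case
    return (informed - blind) / (perfect - blind)


mp = mp_stat([[99, 1], [51, 49]])
print(round(mp, 4))  # 0.3072, i.e. (148.04 - 125)/(200 - 125)
```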
MP can be criticized on the ground that the imaginary judges we have described in this section are not following strategies rationally designed to maximize their expected winnings. For instance, if a judge thinks the probability of choice A is .75, but learns that he or she will win points proportional to the probability he or she assigns to the correct choice, then it can be shown that the judge's optimum strategy is actually to state a probability of 1 for choice A. That objection does not apply to the Uncertainty statistic described next, and we have found that MP and Uncertainty often have nearly the same numerical value.
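The point about the optimum strategy can be checked with one line of algebra. As a sketch, let q (a symbol introduced here, not in the original) be the probability the judge states for choice A, while the judge's actual belief is .75. The expected points per case are then:

```latex
\[
E(q) = .75\,q + .25\,(1 - q) = .25 + .5\,q, \qquad 0 \le q \le 1 .
\]
\[
E(.75) = .625 \;<\; E(1) = .75 .
\]
```

Because E(q) increases with q, the expected score is largest at q = 1. The honest statement q = .75 yields .625 points per case, which matches the blind judge's 125 points over 200 cases in the example above.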
MP has unique zero and square proportionality.
In the previous 2 x 2 table
| A | 99 | 1 |
| B | 51 | 49 |
| Total | 150 | 50 |
It is quite common for Uncertainty to nearly equal MP, as it does in this example (MP was .3072). The Uncertainty statistic is discussed extensively by Theil (1972). It has special meaning in communication theory. It also avoids the aforementioned technical objection to MP, since it can be shown that judges faced with the payoff structure described here should rationally state their true subjective probabilities without fudging them. However, as a general measure of association with a simple clear meaning, it seems inferior to lambda, MR, and MP.
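The definition of Uncertainty is not reproduced in this section, so the following Python sketch rests on an assumption: that the statistic is the usual entropy-based uncertainty coefficient, in which a judge's payoff is the logarithm of the probability assigned to the correct column. Under that assumption the statistic is the proportional reduction in the entropy of the column variable once row membership is known. The names `entropy` and `uncertainty_stat` are hypothetical.

```python
import math


def entropy(freqs):
    """Shannon entropy (natural log) of the proportions implied by the frequencies."""
    total = sum(freqs)
    return -sum((f / total) * math.log(f / total) for f in freqs if f > 0)


def uncertainty_stat(table):
    """Assumed form: (H(column) - H(column | row)) / H(column)."""
    n = sum(sum(row) for row in table)
    column_totals = [sum(col) for col in zip(*table)]
    h_blind = entropy(column_totals)                                 # blind judge's uncertainty
    h_informed = sum(sum(row) / n * entropy(row) for row in table)   # weighted by row size
    return (h_blind - h_informed) / h_blind


u = uncertainty_stat([[99, 1], [51, 49]])
print(round(u, 3))  # 0.334 under the assumed definition
```

The ratio does not depend on the base of the logarithm, since a change of base rescales numerator and denominator equally.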