The randomness of scientistry

Science is finally turning scientody on scientistry… and the results are not as self-flattering to professional science as most scientists expected.

The NIPS consistency experiment was an amazing, courageous move by the organizers this year to quantify the randomness in the review process. They split the program committee down the middle, effectively forming two independent program committees. Most submitted papers were assigned to a single side, but 10% of submissions (166) were reviewed by both halves of the committee. This let them observe how consistent the two committees were on which papers to accept.  (For fairness, they ultimately accepted any paper that was accepted by either committee.)

The results were revealed this week: the two committees disagreed on the fates of 42 of the 166 papers, or 25.3%. But this “25%” number is misleading, and most people I’ve talked to have misunderstood it: it actually means that the two committees disagreed more than they agreed on which papers to accept. Let me explain.

The two committees were each tasked with a 22.5% acceptance rate, which means choosing about 37 of the 166 papers to accept. Since they disagreed on 42 papers in total, and each accepted the same number, the disagreements split evenly: each committee accepted 21 papers that the other rejected, for 21 + 21 = 42 papers with different outcomes. Since each committee accepted 37 papers, they disagreed on 21/37 ≈ 56% of the list of accepted papers.

In particular, 56% of the papers accepted by the first committee were rejected by the second one, and vice versa. In other words, most papers accepted at NIPS would be rejected if one reran the conference review process (with a 95% confidence interval of 40-75%).
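For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name and its parameters are my own for illustration, and it assumes, as the argument above does, that both committees accepted the same number of papers, so the disagreements split evenly between them.

```python
def accepted_list_disagreement(n_papers, acceptance_rate, n_disagreements):
    """Return (papers accepted per committee, fraction of those the other rejected).

    Assumption: both committees accept the same number of papers, so the
    disagreements split evenly between the two of them.
    """
    accepted_per_committee = round(n_papers * acceptance_rate)
    rejected_by_other = n_disagreements / 2
    return accepted_per_committee, rejected_by_other / accepted_per_committee


# Figures quoted above: 166 doubly reviewed papers, 22.5% target, 42 disagreements.
accepted, fraction = accepted_list_disagreement(166, 0.225, 42)
print(accepted, round(fraction, 3))  # 37 0.568
```

The result, 21/37, is roughly the 56% quoted above, so the headline 25% and the "more than half" figure are the same data measured against different denominators: all 166 papers versus the 37 on each accepted list.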

What rightly concerns the writer is that a purely random process would have produced 77.5 percent disagreement, and the 56 percent observed is closer to that figure than to the 30 percent expected. It is, of course, nowhere near the 0 percent that the science fetishists would have us believe is always the case.
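The comparison can be sketched the same way. This assumes the random baseline means the second committee accepts each paper independently at the 22.5% rate, so a paper accepted by the first is rejected by the second with probability 1 - 0.225. The 30 percent figure is simply the expectation quoted above, and the variable names are mine.

```python
acceptance_rate = 0.225
observed = 21 / 37   # share of one committee's accepts that the other rejected
expected = 0.30      # expectation quoted in the text, taken as given

# Assumption: under pure chance the second committee ignores paper quality,
# so it rejects any given paper with probability 1 - acceptance_rate.
random_baseline = 1 - acceptance_rate

gap_to_random = abs(random_baseline - observed)
gap_to_expected = abs(observed - expected)

print(f"gap to random:   {gap_to_random:.3f}")
print(f"gap to expected: {gap_to_expected:.3f}")
print("closer to random" if gap_to_random < gap_to_expected else "closer to expected")
```

On these numbers the observed rate sits about 21 points from pure chance and about 27 points from the expectation.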

This is a very important experiment, because it highlights the huge gap between science the process (scientody) and science the profession (scientistry). Some may roll their eyes at my insistence on using different words for the different aspects of science, but the observable fact, the scientodically informed fact, is that using the same word for two aspects of science that differ so greatly in reliability is incredibly misleading.