Thursday, February 24, 2011

Call for balance in computer science reviewing: Evaluation and ideas papers are both important

I was sparked to create this post by reading a tweet that led me to this Letter to PC Chairs. In it, a group of eminent computer scientists express how they want the field "to become a better empirical science by promoting both observation and reproduction studies."

I applaud this initiative. Much computer science research needs better empirical validation.

However, I have some criticisms of the open letter as it stands. Firstly, the push to improve empirical validation has been around for my entire career, especially in software engineering. The cited call to action suggests that what they are seeking is something new.

My main criticism, however, is that I have seen the pendulum swing too far the other way. In some conferences it is now common to have papers with good ideas rejected because they do not have a rigorous empirical evaluation. This causes just as many problems.

From my perspective, we are rejecting far too many papers of all kinds in computer science conferences. If a paper has a rigorous evaluation, or is a well-executed replication of an existing study, it should be published. If a paper has good ideas with thoughtful analysis, it should also be published, even if it doesn't have much of an evaluation. And if a paper has a healthy mix of these, in other words, an incremental idea with a moderate evaluation, then it also deserves to be published.

We shouldn't be rejecting a paper purely because "it has insufficient evaluation" as long as it contains an interesting or novel idea. And we shouldn't be rejecting papers that are "pure empirical studies" or "mere replications". We need to be balanced.

Many of the good computer science conferences have very low acceptance rates (between 12 and 30 percent) and remain fairly fixed in their attendance numbers over the years. In fact, there is often stagnation: a high proportion of conference attendees are graduate students presenting papers. Their supervisors, and many other people in the field, don't go because they have nothing to present. This is not going to promote debate and development of the field.

I agree that papers should be rejected if they are badly written, have too much wrong reasoning, have bad statistical analysis, have ideas that are "half-baked", don't say anything new, etc. But more papers with decent ideas and/or empirical studies should be accepted, such that the conferences grow in size over time.

The authors of the "Letter to PC Chairs" point out that fields such as biology and medicine have a tradition of rigorous empirical evaluation. True. We can certainly learn from them. But there are also papers in these fields that are case studies, that express new ideas, or that simply describe a single sample of a new species, syndrome or medical procedure.

I have published many papers with empirical evaluation. I actually find that it is the "ideas" papers that are harder to get published. I would like to have discussion of the ideas at a conference before I embark on the years-long process of performing rigorous experiments.

I have probably been guilty of rejecting too many papers as a program committee member. My tendency is to want to be "fair and consistent" with how other papers will be treated, and if papers are being rejected for insufficient empirical evaluation, I tend to do the same. I have therefore contributed to some ideas papers being rejected that perhaps should have been accepted. I think they would have been accepted if there had been a carefully written set of criteria, sent to all reviewers, that encourages acceptance of a wide variety of types of paper.

A few years ago I co-authored the criteria for CSEE&T that describe the kinds of papers that should be acceptable in that conference. I actually think those criteria may have erred on the side of demanding too much rigorous empirical evaluation, at the expense of interesting, ideas-oriented papers.


  1. I have added a new post related to this. See

  2. While writing the "letter to PC chairs" we (primarily the organizers of the Evaluate 2010 workshop) had many discussions on what to include and what to leave out. In the end, we focused the letter on encouraging observational studies and reproduction studies, because we believe that we have a higher chance of improving the situation by focusing on a smaller set of issues.

    I personally think that there truly is a lack of observational studies and reproduction studies at least in my main area of interest -- even though I have to admit that I haven't (yet) done an empirical study that would show this.

    That said, I believe the authors of the letter to PC chairs completely agree on the importance of idea papers, that we all write idea papers, and that it is not at all our goal to reduce the number of idea papers.

    Finally, I'd like to attach a pointer to our upcoming Evaluate 2011 workshop (at PLDI'11), where we will work towards a paper with the tentative title "Evaluation Anti-Patterns: A Guide to Bad Experimental Computer Science", which will focus on methodological fallacies and pitfalls (especially those the paper's authors have made in their own prior work). The workshop's CFP can be found at