This post is dedicated to undeclared double submissions. It is common practice for PC chairs in related areas (e.g., NLP, ML and AI) to exchange submission abstracts to identify overlapping papers. I thought this was generally known to everyone who submits to ACL, but apparently it is not.
A week ago, we got a message from the ICML’17 chairs asking us to compare submissions. They ran a script to compare ACL and ICML papers and found 3 undeclared double submissions. Min verified by hand that the papers were indeed very close to each other, and it was clear we had to reject them.
But the trouble didn’t end with ICML. After the IJCAI deadline, we were alerted by IJCAI reviewers who also happened to be ACL reviewers about very close submissions. We ran the script and, lo and behold, 25 long submissions and 7 short submissions were flagged as similar. Now it was my turn to check these submissions. (Three hours of my Saturday were dedicated to this tedious task. But with multiple cups of black tea and Sia in the background, the initial mission was accomplished. We are still not done with the final validation, but have already exchanged 50+ emails between IJCAI and ML PC chairs, ACs, etc.) Indeed, 10 papers were evidently undeclared double submissions on the spot, and for another 10 I solicited second and third opinions.
While some papers were literally identical, other cases were less straightforward. For instance, some submissions varied primarily in the author list: the authors were permuted, or a few were simply removed from the ACL version when submitting to the later IJCAI deadline. (I am serious here, this is not a joke.) More commonly, authors focused on generating reworded versions of the content. This material makes excellent data for training paraphrasing models, or maybe it could even be added to the SNLI entailment corpus. But this is where the benefits of this effort end.
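For readers who wonder what a script like the one mentioned above might do, here is a minimal sketch. It is not the script the ICML or ACL chairs actually ran, which this post does not describe. It assumes each venue's abstracts are available as a dictionary from submission id to text, and it uses plain word-set Jaccard similarity with an arbitrary threshold. The function names, ids, abstracts and threshold are all made up for illustration.

```python
import re
from itertools import product


def tokens(text):
    """Lowercase a text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def flag_overlaps(venue_a, venue_b, threshold=0.6):
    """Return (id_a, id_b, score) for abstract pairs at or above threshold.

    venue_a and venue_b map a hypothetical submission id to abstract text.
    The default threshold is an illustrative assumption, not a tuned value.
    """
    toks_a = {k: tokens(v) for k, v in venue_a.items()}
    toks_b = {k: tokens(v) for k, v in venue_b.items()}
    flagged = []
    for (id_a, ta), (id_b, tb) in product(toks_a.items(), toks_b.items()):
        score = jaccard(ta, tb)
        if score >= threshold:
            flagged.append((id_a, id_b, score))
    # Highest-scoring pairs first, so they can be checked by hand in order.
    return sorted(flagged, key=lambda item: -item[2])


# Made-up abstracts, purely for illustration.
acl = {
    "acl-1": "We propose a neural model for paraphrase generation.",
    "acl-2": "A study of dependency parsing for low-resource languages.",
}
ijcai = {
    "ijcai-9": "We propose a neural model for generation of paraphrase.",
}
result = flag_overlaps(acl, ijcai)
```

A bag-of-words measure like this would only catch near-identical abstracts. Heavily reworded versions would likely need something fuzzier, and in any case every flagged pair still has to be checked by hand.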
Interestingly, none of the authors declared double submissions to IJCAI/ICML on their title page, but some noted their arXiv submissions.
Even if one manages to publish in both places without being caught, the damage to the authors is long-lasting. For instance, last week one of the community members brought to our attention a paper published in ACL’16 which has a very strong overlap with a paper published at another conference. Of course, we can’t do anything about it as PC chairs for ACL’17. But it definitely colors my own perception of these authors and their other papers.
Currently, it is the job of the PC chairs to weed out these double submissions and avoid cases like the ones described above. We brought up these issues with Joakim Nivre (the ACL president), and he informed us that the discussion about double submissions is ongoing in the ACL exec. As usual with complex questions, there is no clear answer. Currently, the worst that can happen to such authors is to have their papers quietly rejected. One option is to continue with this policy. It is particularly appropriate for authors who are not familiar with the ACL double submission requirements and for those who made an honest mistake. But this lenient policy makes the PC responsible for eliminating double submissions, taking time away from legitimate submissions. Thus, another option is to consider more severe penalties, in particular for repeat offenders. My personal recommendation is to adopt the principle of natural selection: let these papers be published. Authors who do this damage their own reputation, and it is not our job to babysit them. We should not be spending our time policing, but should instead focus on selecting exciting papers and providing quality feedback for all the authors.
If you have any thoughts on the subject, please do share them with us. (As I was writing this blog post, I got an email from Kristina Toutanova about checking ACL against TACL submissions. It never ends …)