Avoidable Trouble: undeclared double submissions

Dear readers,

This post is dedicated to undeclared double submissions. It is common practice for PC chairs in related areas (e.g., NLP, ML, and AI) to exchange submission abstracts to identify overlapping papers. I thought this fact was generally known to everyone who submits to ACL, but apparently it is not. A week ago, we got a message from the ICML’17 chairs asking us to compare submissions. They ran a script to compare ACL and ICML papers and found 3 undeclared double submissions. Min verified by hand that the papers are indeed very close to each other, and it was clear we had to reject them.

But the trouble didn’t end with ICML. After the IJCAI deadline, we were alerted by IJCAI reviewers who happened to be ACL reviewers about very close submissions. We ran the script and, lo and behold, 25 long submissions and 7 short submissions were flagged as similar. Now it was my turn to check these submissions. (Three hours of my Saturday were dedicated to this tedious task. But with multiple cups of black tea and Sia in the background, the initial mission was accomplished. We are still not done with the final validation, but have already exchanged 50+ emails with the IJCAI/ICML PC chairs, ACs, etc.) Indeed, 10 papers were undeclared double submissions on the spot, and for another 10 I solicited second and third opinions. While some papers were literally identical, other cases were less straightforward. For instance, some submissions varied primarily at the author level: the author list was permuted, or a few authors were simply dropped from the ACL version when it was submitted to the later IJCAI deadline (I am serious here, this is not a joke). More commonly, authors focused on generating reworded versions of the content. This material would make excellent training data for paraphrasing models, or maybe it could even be added to the SNLI entailment corpus. But this is where the benefits of this effort end. Interestingly, none of the authors declared their double submissions to IJCAI/ICML on the title page, but some did note their arXiv submissions.
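
For readers curious what such a comparison script might look like, here is a minimal sketch. To be clear, this is not the actual script the ICML chairs or we used; it only illustrates the general idea of vectorizing abstracts with TF-IDF, computing pairwise cosine similarity across two conferences, and flagging pairs above a threshold for manual inspection. The function name, threshold, and toy abstracts are made up for the example.

```python
# Illustrative sketch only, not the actual script used by the PC chairs:
# vectorize abstracts with TF-IDF, compute pairwise cosine similarity
# across two conferences, and flag suspiciously similar pairs for a
# manual check by the chairs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def flag_similar_pairs(acl_abstracts, other_abstracts, threshold=0.8):
    """Return (acl_index, other_index, score) for pairs above the threshold."""
    vectorizer = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
    # Fit a single vocabulary over both collections so the vectors are comparable.
    matrix = vectorizer.fit_transform(list(acl_abstracts) + list(other_abstracts))
    acl_vectors = matrix[: len(acl_abstracts)]
    other_vectors = matrix[len(acl_abstracts):]
    scores = cosine_similarity(acl_vectors, other_vectors)
    flagged = [
        (i, j, float(scores[i, j]))
        for i in range(scores.shape[0])
        for j in range(scores.shape[1])
        if scores[i, j] >= threshold
    ]
    return sorted(flagged, key=lambda pair: -pair[2])

# Toy example with made-up abstracts; real usage would load the submission files.
acl = ["We propose a neural model for semantic parsing with span attention."]
ijcai = ["We propose a neural semantic parsing model with span attention.",
         "A survey of reinforcement learning for dialogue systems."]
print(flag_similar_pairs(acl, ijcai, threshold=0.5))
```

Even with a high threshold, such a script only narrows the list; as described above, every flagged pair still needs a human check.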

Even if one manages to publish in both places without being caught, the damage to the authors is long-lasting. For instance, last week one of the community members brought to our attention a paper published in ACL’16 that has a very strong overlap with a paper published in another conference. Of course, we can’t do anything about it as PC chairs for ACL’17. But it definitely colors my own perception of these authors and their other papers.

Currently, it is the job of the PC chairs to weed out these double submissions and avoid cases like those described above. We brought up these issues with Joakim Nivre (the ACL president), and he informed us that a discussion about double submissions is ongoing in the ACL exec. As usual with complex questions, there is no clear answer. Currently, the worst that can happen to such authors is to have their papers quietly rejected. One option is to continue with this policy. It is particularly appropriate for authors who are not familiar with the ACL double submission requirements and for those who made an honest mistake. But this lenient policy makes the PC responsible for eliminating double submissions, taking time away from legitimate submissions. Thus, another option is to consider more severe penalties, in particular for repeat offenders. My personal recommendation is to adopt the principle of natural selection: let these papers be published. Authors who do this damage their own reputation, and it is not our job to babysit them. We should not be spending our time policing, but instead focus on selecting exciting papers and providing quality feedback to all authors.

If you have any thoughts on the subject, please do share them with us. (As I was writing this blog post, I got an email from Kristina Toutanova about checking ACL against TACL submissions. It never ends …)

9 thoughts on “Avoidable Trouble: undeclared double submissions”

  1. I believe that when detected, the double submission should be rejected, as it is in clear violation of the rules of submission. Just like formatting errors trigger a reject, this should as well, particularly since it can actually be perceived as a worse offense. The policy of “let them suffer the consequences” might instead damage the community as a whole, since the originality of the papers published in one or the other venue (assuming it is accepted and published in both, and not withdrawn from one by the authors) would come into question. The double submission/publication will often not be evident and thus will not really come to “haunt” the offender.

  2. How about the ACL workshop double submission policy, which states that papers presented at an ACL workshop do not count as previously published work if an extension is submitted to a main conference (workshop -> xCL)?
    Is it still endorsed? Are reviewers aware of this rule? If papers were rejected in contradiction with that rule, will they be re-evaluated?

    “Papers that have appeared at a workshop do not constitute previously published work, as long as the paper submitted to ACL is an extension of the workshop paper.”
    http://aclweb.org/adminwiki/index.php?title=Double_Submission_Policy_for_Conferences

    Best,
    Djamé
    ps:

    1. Hi! All the rejections were handled by both Min and me. We are aware of the rules related to workshops. Currently, most of the rejections result from double submissions to IJCAI and ICML.

  3. We got a false positive on the ACL-IJCAI comparison: same model, different applications. Since the model is unpublished, it had to be described in detail in both papers, hence the overlap.

    But this raises an issue I’ve struggled with, both as an author and as a reviewer: if you use a method that has been published but is not familiar to reviewers, then some reviewers will want it described in some detail (a one-page description can save them the need to read a full paper or two), whereas other reviewers will be unhappy that the paper repeats material that has been published previously. So in describing previous work, the author is left looking for the Goldilocks solution (not too hot, not too cold), and the right place is different for each reviewer.

    1. I only check whether it is reasonable for the authors to present published work in a few sentences or a whole page. If it is a baseline in the experiment section, a few sentences are enough. In some situations, if the authors use the model as an upstream module and theoretically assume that the upstream output is fully trusted, I would expect to see error examples triggered by upstream mistakes in the experiments. To be frank, I don’t think “wow, it saves my time” is a justifiable attitude. It is our basic job and responsibility to read more; every one of us is busy, and that is what we are mainly busy with. For example, I knew nothing about the Goldilocks solution, so I checked Wikipedia and acquired the knowledge about the original fairy tale and its modern applications. It expands my knowledge and it’s fun.

      Writing a page-long description is also acceptable. If the authors want to highlight how their work benefits from some detailed approaches in previously published models, they have to show us the whole pipeline. Simply being unhappy without delving into the details is definitely not justified.

      1. It’s not just saving time for a reviewer — it’s also helping the readers. If a reader needs to consult several other works just to understand a paper, they may just give up. So if a paper uses a somewhat obscure model, it makes sense to describe it in more detail than a well known model, even if the details have been published elsewhere. However, opinions vary regarding how much more detail is appropriate, and what counts as helpful exposition as opposed to repetition of prior art.

    2. Let’s adopt the principle of natural selection here again: if a reader gives up after reading a few papers, the opportunity for potentially creative work just sneaks away from that reader and heads for other, more patient readers.
      Of course, I agree that opinions vary, and my own stance leans toward a patient case-by-case analysis.

  4. In a comment on an earlier post I made a suggestion that would solve part of the double submission problem, at least among *ACL papers: eliminate all separate conference reviewing, and move all *ACL papers to TACL.

    This would also help somewhat with double submissions with non-ACL conferences: since TACL submission is year-round, authors won’t need to worry about concurrent deadlines and overlapping review periods between *ACL and other conferences.

    1. Most other conferences (at least in ML/AI) expressly forbid double submission. Have we considered doing this, instead of just asking authors to declare it on a cover page? Stronger wording here might help eliminate some of the multiple submission cases.

      Bridging into the previous blog post:

      I’m skeptical about the utility of declaring that the paper is on arXiv, and about being punitive towards people who have violated this policy. There are no concrete guidelines about how reviewers should react to this information, for instance: 1) should arXiv papers be treated as ‘real’ papers when presenting comparative results, or excluded from formal comparisons; and 2) knowing this information might incentivise some reviewers to look at the arXiv version, and perhaps compare the two versions, look at the revision history, etc. This of course flies in the face of the whole double-blind requirement. Personally I would avoid searching for the paper, although I doubt that all reviewers would do the same. Relatedly, searching for the paper is a critical tool for checking whether the paper, or a very similar one, is already published elsewhere.

      Jason and others’ idea of having anonymous arXiv papers sounds great at face value, but I doubt it would be practical. Given that there is no stamp of formal approval on arXiv papers (i.e., acceptance at a top-tier venue), the author names and affiliations are the main signals of quality. Without this information, I doubt many papers would end up being read at all.

      I personally find arXiv incredibly useful, and banning its use would annoy a lot of people (myself included), for little gain. Bear in mind that banning arXiv would exclude resubmitting failed ICLR papers, as well as slow down the pace of research in some ‘hot’ areas by a few months. Before making a rash decision, we could try to be more scientific about the problem by running some experiments on our own conference, to test the following:

      1) Does knowing that the paper is on arXiv affect the likelihood of the paper being accepted? In that case, do we observe effects (top author, top lab, etc.) like those shown to occur at WSDM? There’s a big gap between the WSDM setting, with explicit naming of authors, and our treatment of arXiv.
      2) We don’t have blind ACs or PCs, so shouldn’t we seek to test this aspect of our reviewing model? I suspect there’s a skew in terms of borderline decisions (or soliciting extra reviews) made by the chairs in favour of established names.

      Is there an appetite to do a controlled trial to test the above? Before banning anything, I think we need some more convincing evidence.
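
      Purely as an illustration of question 1) above, and with fabricated numbers rather than real conference data, the end-of-pipeline comparison could be as simple as a contingency-table test of acceptance outcomes for papers with and without a declared arXiv version (a proper controlled trial would of course also have to randomise what reviewers are shown):

```python
# Illustration only, with fabricated counts: test whether papers that
# declared an arXiv version were accepted at a different rate than papers
# that did not. A real controlled trial would also need to randomise what
# reviewers see; this is just the final comparison step.
from scipy.stats import chi2_contingency

def arxiv_acceptance_test(records):
    """records: iterable of (declared_arxiv: bool, accepted: bool) per paper."""
    table = [[0, 0], [0, 0]]  # rows: arXiv yes/no, columns: accepted yes/no
    for declared, accepted in records:
        table[0 if declared else 1][0 if accepted else 1] += 1
    chi2, p_value, _, _ = chi2_contingency(table)
    return table, chi2, p_value

# Made-up numbers, not real conference data.
fake_records = ([(True, True)] * 40 + [(True, False)] * 60 +
                [(False, True)] * 150 + [(False, False)] * 450)
table, chi2, p = arxiv_acceptance_test(fake_records)
print(table, chi2, p)
```

      The hard part is the trial design rather than the test itself, but even this simple comparison would tell us whether there is an effect worth worrying about.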

      More controversially (playing devil’s advocate here): does the fact that there is a difference in acceptance decisions based on whether the authors are known or not point to a problem in the system? Assuming reviews are done in a hurry (especially so if given only 2 weeks) and thus are noisy and imperfect in various ways, then it makes a lot of sense for reviewers, as Bayesian agents, to incorporate a prior (from authorship) when making their recommendations. Perhaps there’s more trust that the authors will really do the things promised in author feedback, or the benefit of the doubt is given regarding some confusions in the paper. To better answer this question, we could correlate TACL-style reviews / a much larger pool of blind reviews with ACL reviews (with/without knowledge of authors/arXiv). Maybe the effect seen in WSDM would be weakened enough to be acceptable. I don’t have a strong feeling on this one, but it’s probably better to test this than to rely on such a noisy signal.
