We received 1,419 papers (829 long and 590 short), of which we sent 1,318 (751 long and 567 short) out for review. We describe below how these submissions were processed and give basic statistics about them. While some of these details are mechanistic and inherently boring, they may be helpful to you when preparing your next submission. At the end of the post, we also add some cool numbers about the submission pool.
Area Assignment: First, all papers were assigned to areas based on the category you selected during submission. As you have probably noticed, the categories on the submission page and our areas have a 1:1 correspondence, so this step was straightforward. While most papers stayed within their original area, we still had to do significant reshuffling of the submissions. Some of this rerouting has to do with the inherent overlap in area definitions (e.g., semantics vs. information extraction): we wanted to make sure that papers addressing similar topics were assessed by the same reviewer pool. In other cases, authors clearly selected an incorrect category. Finding the right home for those papers was a must, to best ensure a good match between interested reviewers and the hard work represented in the submissions.
We implemented this process as follows. Area chairs checked all submissions in their areas, one by one, and flagged papers that they felt did not fit their area. We then double-checked these papers, reassigning them to new areas when needed. In most cases (but not always), we followed the recommendations of the area chairs. Overall, around 50 papers were moved from their original areas. This number may not sound like much, but it took many hours of work to complete the process (luckily, MIT was closed during a snowstorm on Thursday, so I had plenty of time to devote to the task).
Rejection without Review: In parallel, we had the really unpleasant task of rejecting papers that violated the submission instructions. Some of you may have seen the post by the EMNLP ’16 chairs, Reject without review. Even though I read the post and talked with Xavier and Kevin (the EMNLP chairs) in person, I did not realize how painful that process could be. Overall, we rejected close to 70 papers. Examples of such violations include submissions with author names, blatant disregard of the ACL templates, and length violations. A number of authors modified the ACL templates or did not use them at all. Depending on the degree of change, we exercised our editorial judgment. While we allowed some of these, is it really worth endangering your submission for a few extra lines? The area chairs made the first pass over all papers in their areas, and we reviewed all candidates for rejection to ensure a consistent policy.
On a personal note, it was a heartbreaking process. Some of these papers looked really good. And I do realize that people who forgot to remove their names from the submission did not do it on purpose. I was particularly concerned that around 30% of the papers with violations came from regions that are not typically represented at ACL; rejecting their papers without review does not help them join the community. I am not sure what could be done at future conferences to change this situation. If you have any thoughts, please share them with us or the ACL exec.
Yet another way to get a paper rejected without review was to submit it to other venues without declaring this, as specified in the submission instructions. Other venues here include journals, arXiv, and other conferences. We received some notifications about such cases from area chairs, and we continue to receive more during the bidding process.
Mis-channelled Creativity: We had a non-trivial group of submissions that tried to be original in the wrong way. Some of them channeled their creativity into unusual titles such as “dummy”, “delete me”, “wrong paper”, etc. I should note that these creative titles had little overlap with each other, so one can’t eliminate them with a simple script. The authors in this group also had creative names like Sherlock Holmes (when I saw this masterpiece after hours of moving papers around, my first thought was Moriarty). For some reason unknown to me, we also had several submissions of practically the same paper (repeated up to 3 times in some cases) with name permutations or slight rewrites of the text. In case you are one of the authors in this category, I want to emphasize that our reviewing process is not automated. Most likely, at least three people will have to independently spend time examining such submissions, writing emails about them, and processing the rejection. In hindsight, we will be recommending an operating procedure for dealing with such problematic submissions, which bloat the submission pool and cause undue work for the PC.
Last Steps Before Bidding: After all the cleaning and duplicate removal, we obtained 751 long and 567 short papers (roughly 92% of the originally submitted papers). In parallel, Min and all the ACs were finalizing reviewer assignments for the areas. While we had reasonable estimates for individual areas from previous conferences, the actual distribution of submissions among areas required us to readjust the reviewer assignments (a back-of-the-envelope version of this calculation is sketched below). Again, the process was semi-manual, as we had to arbitrate between reviewer preferences and the needs of each area. By 2 am on Friday we were ready to go, and we started the bidding process. There were some wrinkles during the bidding (Min and I are doing one-shot learning on Smart reviewing systems), but now everything seems to be running smoothly. Thank you for your patience and for alerting us to all the issues you experienced!
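For the curious, here is a minimal sketch of the kind of arithmetic behind that readjustment. The area names, counts, and the assumptions of 3 reviews per paper and at most 5 papers per reviewer are purely illustrative; the real process also had to respect reviewer preferences and expertise.

```python
import math

# Illustrative assumptions only: 3 reviews per paper, at most 5 papers per reviewer.
REVIEWS_PER_PAPER = 3
MAX_PAPERS_PER_REVIEWER = 5

# Hypothetical per-area submission counts.
submissions_per_area = {"machine translation": 120, "semantics": 95, "generation": 80}

# Reviewers needed per area once the actual submission counts are known.
reviewers_needed = {
    area: math.ceil(REVIEWS_PER_PAPER * count / MAX_PAPERS_PER_REVIEWER)
    for area, count in submissions_per_area.items()
}
print(reviewers_needed)  # {'machine translation': 72, 'semantics': 57, 'generation': 48}
```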
Now for some interesting statistics.
Procrastination Graph: The first one has to do with submission times. Many of us submit papers at the last moment. How pervasive is this phenomenon? The graphs below show the number of submissions as a function of time.
Long paper timestamps of first submissions and revisions: (left) from the opening of the long paper portal, and (right) in the last 24 hours before the official deadline.
Short summary: procrastination (or, to put it more positively, the desire for perfection) is in the blood of our community. Interestingly, about 24 hours before the deadline, Min was worried about the small number of submissions in the portal (only 342), and it seemed likely we had severely over-recruited reviewers! Not to worry: within 24 hours we were back on track.
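If you want to produce a similar plot for your own data, here is a minimal sketch; the CSV file name and the "timestamp" column are assumptions for illustration and not the actual export format of the submission system.

```python
# Minimal sketch: bin submission timestamps into hourly counts and plot the cumulative curve.
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical export with one row per submission and a "timestamp" column.
subs = pd.read_csv("submissions.csv", parse_dates=["timestamp"])

# Count submissions per hour, then accumulate to show the last-minute rush.
hourly = subs.set_index("timestamp").resample("1H").size()
hourly.cumsum().plot()

plt.xlabel("Time")
plt.ylabel("Cumulative number of submissions")
plt.tight_layout()
plt.show()
```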
What’s Hot and What’s Not: 2014 vs. 2017. Below we show the top 10 areas ranked by the number of submissions for ACL 2014 and ACL 2017. The chart is interesting: even though the two conferences are just three years apart, we can see a clear shift in popularity. For example, summarization and generation is now among the top 5 areas, while in 2014 it didn’t even make the top 10. Check the differences and draw your own conclusions!
Finally, we show statistics for each area, separating the counts for short and long papers. I don’t know how to explain the wide variation in ratios in the table below. (Edit: the table now also includes Min’s per-area submission projections, computed as 90% of the average of the ACL 2014 and ACL 2016 counts for that area; the 90% factor accounts for the joint short/long deadline. Negative, red figures thus indicate unanticipated growth in an area.)
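For concreteness, here is a minimal sketch of that projection formula; the counts in the example are made up for illustration and are not real per-area numbers.

```python
# Projection per area: 90% of the average of the ACL 2014 and ACL 2016 submission counts
# (the 90% factor reflects the joint short/long deadline).
def projected_submissions(count_2014: int, count_2016: int) -> float:
    return 0.90 * (count_2014 + count_2016) / 2

# Hypothetical area with 60 submissions in 2014 and 80 in 2016:
projection = projected_submissions(60, 80)   # 0.9 * (60 + 80) / 2 = 63.0
actual_2017 = 85                             # made-up actual count
difference = projection - actual_2017        # -22.0: negative means unanticipated growth
print(projection, actual_2017, difference)
```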
Let us know if you would like us to compute additional statistics!