Friday, August 1, 2014

Retrospective on SIGMOD 2014

I just finished serving as the PC Chair of the SIGMOD 2014 conference. We implemented a number of changes this year, so I thought it would be worthwhile to document our experiences and my thoughts. The good news is that SIGMOD continues to experiment with the structure of the conference, and the SIGMOD Executive continues to be very supportive of these efforts. Hopefully, the conference continues to improve as a result.
In addition to the information I provide here, the conference site has more details.

PC Organization

As in previous years, we had a number of area coordinators to assist with the evaluation of submitted papers. The 13 area coordinators (in alphabetical order) were:
  • Lei Chen (Hong Kong University of Science and Technology, China): Graph management, RDF, and social networks
  • Michael Franklin (University of California, Berkeley, USA): Storage, indexing, and physical database design
  • Alfons Kemper (Technical University of Munich, Germany): Query processing and optimization
  • Laks Lakshmanan (University of British Columbia, Canada): Knowledge discovery, clustering, data mining
  • Ioana Manolescu (Inria Saclay, France): Text databases, XML, keyword search
  • Tova Milo (Tel Aviv University, Israel): Database models, uncertainty, schema matching, data integration, crowd sourcing
  • Elke Rundensteiner (Worcester Polytechnic Institute, USA): Streams, sensor networks, complex event processing
  • Ken Salem (University of Waterloo, Canada): Systems, performance, transaction processing
  • Dennis Shasha (New York University, USA): Anything that does not fit these areas, and papers with which area coordinators have a conflict
  • Divesh Srivastava (AT&T Labs-Research, USA): Spatial, temporal, multimedia and scientific databases
  • Kian-Lee Tan (National University of Singapore, Singapore): Aggregation, data warehouses, OLAP, analytics
  • Patrick Valduriez (Inria Sophia-Antipolis Méditerranée, France): Cloud computing, MapReduce, parallel/distributed data management, P2P systems
  • Xiaokui Xiao (Nanyang Technological University, Singapore): Security, privacy, authenticated query processing
The program committee consisted of 126 people -- the list is too long to include here; it is on the conference web site.

Review Process

A major change that we introduced this year was to have two submission cycles: the paper submission deadline for the first cycle was September 16, 2013, and for the second one it was December 10, 2013. For each cycle, we allocated 7 weeks for paper reviews and 10 days for discussions. Within each cycle, some papers were classified as "revise and resubmit," giving the authors one month to address reviewer comments and submit a revised version. We then had about 4 weeks to review these revised papers, followed by a week of discussions. The following figure shows the entire process and the associated numbers (the first number in parentheses is for the first cycle, while the second number is for the second cycle).

[Figure: SIGMOD 2014 review process, with paper counts for the first/second cycles]
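To make the calendar concrete, here is a minimal Python sketch that reconstructs approximate milestone dates for each cycle from the durations stated above. The two submission deadlines are taken from the text; everything else is a back-of-the-envelope reconstruction (for example, "one month" is approximated as 30 days), not the official schedule.

```python
from datetime import date, timedelta

# Stated durations: 7 weeks of reviewing, 10 days of discussion, 1 month for
# revisions, ~4 weeks to review the revisions, 1 week of final discussion.
REVIEW = timedelta(weeks=7)
DISCUSSION = timedelta(days=10)
REVISION = timedelta(days=30)          # "one month", approximated as 30 days
RE_REVIEW = timedelta(weeks=4)
FINAL_DISCUSSION = timedelta(weeks=1)

def cycle_milestones(submission_deadline: date) -> dict:
    """Reconstruct approximate milestone dates for one submission cycle."""
    reviews_due = submission_deadline + REVIEW
    discussion_ends = reviews_due + DISCUSSION
    revisions_due = discussion_ends + REVISION
    re_reviews_due = revisions_due + RE_REVIEW
    final_decision = re_reviews_due + FINAL_DISCUSSION
    return {
        "reviews due": reviews_due,
        "discussion ends": discussion_ends,
        "revised papers due": revisions_due,
        "re-reviews due": re_reviews_due,
        "final decision": final_decision,
    }

for name, deadline in [("Cycle 1", date(2013, 9, 16)), ("Cycle 2", date(2013, 12, 10))]:
    print(name)
    for step, when in cycle_milestones(deadline).items():
        print(f"  {step}: {when.isoformat()}")
```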
Overall, we had 419 submissions, out of which we accepted 107 (more on this below). In the first cycle, each PC member was assigned 2-4 papers. When I allocated papers in the second cycle, I took into account the revise-and-resubmit papers that each PC member was still handling from the first cycle. I made sure no one was assigned more than 11 papers in this cycle (thanks to Natassa Ailamaki for suggesting this).
A number of important guidelines that we followed:
  • All directly accepted papers were considered conditional accepts. Authors were asked to address reviewer comments and submit a new version within about two weeks. The relevant area coordinator and I then quickly reviewed the final versions of the papers. The purpose of this was to ensure that authors did not ignore the usually very useful reviewer comments once the paper was accepted. It was my experience in recent years that a significant number of authors treated paper acceptance as final and totally ignored the reviewer comments that could have significantly improved the paper. Given that we treat conference papers as final, archival publications, I wanted to make sure that the papers would be in the best shape possible.
  • As the above figure shows, some papers went into a third round of minor revision. These were usually presentation edits and minor corrections that the reviewers felt were needed to ensure that the paper is of SIGMOD quality. Some reviewers even took the time to annotate the paper with presentation fixes, which we forwarded to the authors. This was a substantial load on the reviewers, beyond what would normally be expected. I very much appreciated the diligence of these reviewers.
  • I asked the reviewers to balance two functions: (1) put together an exciting and broad technical program, and (2) provide meaningful feedback to authors to assist them in getting their work published. My point was that, as Program Committee members, we do have a "gatekeeper" role, but that should be balanced against our responsibility to the community to ensure that worthwhile papers are improved to become publishable. My strong belief is that this community does enough good work that should see the light of day with proper guidance.
  • As always, I asked the reviewers to provide meaningful reviews. In particular, I asked them to refrain from comments like "This has been done before" or "There are not enough experiments". The first comment requires references to be meaningful, while the second one requires an explanation of what is needed and why. Almost any paper can have more experiments; the question is whether the paper is acceptable without these additional experiments. Another guideline I provided was to be very careful in declaring that a paper is not suitable for SIGMOD; while we need to make sure that the conference is internally consistent, we don't need to be patronizing to the authors who have chosen to submit their work to SIGMOD - we need a balance here. Finally, I asked them to avoid comments like "I am not excited by this paper" -- if we only accepted papers that reviewers are excited about, we might only have a SIGMOD conference once every couple of years. The bottom line: focus on the technical content of the paper and decide whether it advances our understanding.
  • It is inevitable that there were a number of sub-par reviews -- very short (some even a single line) and not very informative. I tracked the reviews as closely as I could and asked the owners of these sub-par reviews to improve them. To their credit, almost all of them did. However, I did delete a few reviews that were so poor that I decided it was better for the authors not to see them -- I just could not get the reviewers to update them.
  • For revise-and-resubmit papers, I asked the reviewers not to pre-judge what authors could do in the allocated time of one month; I told them to just list what needs to be done for the paper to be acceptable and leave it to the authors to decide whether or not they can do it in that time. I also asked the reviewers to be reasonable in what they asked for -- again, every paper can be improved, but we are not looking for perfect papers, we are looking for acceptable papers.
  • I instructed the reviewers that the evaluation of revised versions should be based only on what was explicitly stated as requirements to the authors. The point of this is that, normally, we should not be raising new issues to which the authors do not have a chance to respond. This is a fundamental aspect of journal reviewing, and I wanted us to follow the same principle. Of course, if we suddenly discovered a major flaw in the paper while reviewing the revised version, we could and should reject the paper, but this did not happen.
Basically, we followed a review process that is quite similar to journal reviewing. We could have improved things considerably (more on that below), but I was generally satisfied with the results.

Submission and Acceptance Statistics

Research Paper Track

As I noted above, we had 419 submissions to the research paper track of SIGMOD 2014. Paper submissions to SIGMOD had shown a slight decline in recent years; we managed to arrest that and bring the submission count close to its traditional range of 425-450. The following figure shows the submission numbers and the acceptance rates over recent years.

[Figure: SIGMOD submission counts and acceptance rates over recent years]

The submissions this year were healthy and manageable. The acceptance rate is at the upper end of what I consider to be the range we should be targeting: 20-25%. Incidentally, Program Committee members asked me repeatedly at the beginning of the process what our quota was and where we were with respect to that quota. My response was that I did not want them to worry about or focus on a quota, that they should simply focus on each paper and decide whether it was acceptable, and that we would find a way to fit the accepted papers into a program. Furthermore, since we used a multi-cycle submission process, it was not possible to do very detailed planning anyway. In the end, we accepted 30 papers more than last year and were able to accommodate all of them by reducing the presentation time to 25 minutes (including the Q&A session).
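As a quick check, the acceptance rate follows directly from the counts quoted above (the second line just applies the "30 papers more than last year" remark):

```python
# Counts quoted in the text above.
submitted, accepted = 419, 107

print(f"Acceptance rate: {accepted}/{submitted} = {accepted / submitted:.1%}")  # ~25.5%
print(f"Papers accepted last year (30 fewer): {accepted - 30}")                 # 77
```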
As always, the distribution of the papers across the 13 areas was not uniform. The following figure shows the distribution of submissions based on the first area that the authors indicated. We managed this skewed distribution by being flexible in assigning papers to area coordinators, each of whom is a researcher who could handle more than one area.

[Figure: Distribution of submissions across the 13 areas]

I did not analyze the geographic distribution of the submitted papers, but the distribution of accepted papers was as follows: USA 56; China 11; Switzerland and Singapore 7 each; Hong Kong 6; Germany 4; India and Japan 3 each; Israel and UK 2 each; Australia, Austria, Canada, France, Italy, and Korea 1 each.
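As a sanity check, the per-country counts listed above do add up to the 107 accepted papers; here is a small tally (the dictionary simply transcribes the sentence above):

```python
# Accepted papers by country, transcribed from the list above.
accepted_by_country = {
    "USA": 56, "China": 11, "Switzerland": 7, "Singapore": 7, "Hong Kong": 6,
    "Germany": 4, "India": 3, "Japan": 3, "Israel": 2, "UK": 2,
    "Australia": 1, "Austria": 1, "Canada": 1, "France": 1, "Italy": 1, "Korea": 1,
}

total = sum(accepted_by_country.values())
print(f"Total accepted papers: {total}")                                # 107
print(f"Share from the USA: {accepted_by_country['USA'] / total:.1%}")  # ~52%
```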
Finally, I looked at some paper-specific statistics. The following figure shows the distribution of the number of authors per paper, as well as the number of countries and institutions represented; these statistics are for accepted papers.

[Figure: Distribution of the number of authors, countries, and institutions per accepted paper]

It is not surprising that there were no single-authored papers -- there almost never are in SIGMOD. Most of the papers had 3-4 authors. The following table provides the mean and median numbers for these statistics.
[Table: Mean and median number of authors, countries, and institutions per accepted paper]

Industrial Paper Track

The Industrial Program track was chaired by Fatma Özcan (IBM Almaden) and Nesime Tatbul (Intel Labs & MIT). They were assisted by 13 PC members. This track received 44 submissions, out of which 15 were accepted, resulting in an acceptance rate of 34%. The accepted industrial papers were also treated as conditional accepts and were shepherded. The distribution of the papers across areas, as well as the final decisions, is shown in the following figure.

[Figure: Industrial track submissions by area and final decisions]

Rounding Out the Technical Program

The Technical Program consisted of 107 research and 15 industrial papers, two keynotes, two panels, and four tutorials. The research and industrial papers were presented in poster sessions over two evenings. This year, PODS also included its papers in the poster sessions.
  • Keynotes were selected by Gustavo Alonso. The two keynotes, How I Learned to Stop Worrying and Love Compilers by Eric Sedlar of Oracle Labs and Fun with Hardware Transactional Memory by Maurice Herlihy of Brown University, were outstanding, and I heard nothing but good comments.
  • Panel Chairs were Susan Davidson and Sunita Sarawagi. They organized a panel on Should we all be teaching “Intro to Data Science” instead of “Intro to Databases”? In addition, Fatma and Nesime organized an industrial panel on Are We Experiencing a Big Data Bubble?
  • Tutorials: Chris Jermaine and Yufei Tao, assisted by 8 PC members, selected four tutorials out of 12 submissions.
  • Demonstration Chairs were Bettina Kemme and Wolfgang Lehner. With the assistance of 53 PC members, they selected 29 demonstrations out of 75 submissions. Demonstrations were grouped into three sessions, and each session was repeated twice. They also organized the selection of the best demo in each group.
  • Undergraduate Research Program was chaired by Mario Nascimento and Anastasios Kementsietsidis. Out of 18 submissions, they selected 7 for poster presentation.
  • As usual, we had a New Researcher Symposium that was chaired by Alexandra Meliou and Anish Das Sarma.

What Worked and What I Would Do Differently

With all the experimentation, I think it is a good idea to document what worked and what I would do differently if I were to do it again. I think the following worked very well:
  • Double-blind is working very well and we should maintain it. There were only two cases where authors wondered how to position the paper without revealing their previous work, and we were able to handle these easily. Our community has now accepted and adjusted to double-blind, and it is working well.
  • Considering accepted papers as conditional accepts worked very well -- it added a bit more work for the authors (about two weeks), the area coordinators, and me, but the resulting papers were in much better shape.
  • Having two submission cycles was a great idea. It gave everyone a chance to get their papers into more reasonable shape for submission. I am convinced that it played a significant role in the increase in paper submissions. I would actually add a third cycle. However, we should recognize that the process is now spread over a longer period of time.
If I were to do this again, here are some changes I would make:
  • I would give PC members three options for paper decisions: Accept, Reject, and Revise-and-resubmit. In the end, we wish to classify the papers into these categories anyway, and having too many categories (Strong Accept-Accept-Weak Accept-Weak Reject-Reject-Strong Reject) is not helpful. PC members do not use the full spectrum anyway; a large majority of the papers are categorized as Weak Accept or Weak Reject, so these papers "in the middle" form a large equivalence set, and we spend a ton of time trying to sort them out. Here are some statistics from the first cycle that demonstrate the point (see also the small tally after this list):
    • Number of Strong Accept reviews (out of ~300 reviews): 1
    • Number of Accept reviews (out of ~300 reviews): 21
    • Number of papers with at least one Accept/Strong Accept: 20
    • PC members who rejected every paper in their batch: 8
  • I would reduce review time from 7 weeks to 5 and increase the discussion time to 4 weeks. I have two reasons for suggesting this:
    • PC members really fall into two categories: those who do their reviews very early in the process, and those who procrastinate forever. In the second cycle, only 60% of the reviews were submitted one week before the deadline. In case you think that people were doing their reviews and uploading them at the last minute, I would note that only 80% of the reviews were in when the deadline passed. It took over a week into the discussion period (and many emails) for us to get all the reviews. It appears to me that shortening the period would not have a major impact on the behaviour of either of these groups.
    • An extended discussion period is useful not only for the reviewers to have fuller discussions but, more importantly, because it gives the PC chair more time to go over the reviews and address deficiencies. I tried to read as many of the reviews as time permitted and asked colleagues to improve their reviews. I also participated in the discussions of some papers. However, time was an issue, and a longer discussion period would have allowed me to be more engaged.
  • One thing that is not working well is online discussions. Some PC members consider their job done when they submit their reviews and no amount of encouragement would get them to participate in the discussion. I am not sure what to do about this, but it is an issue that we need to address. Right now, online discussions are not doing the job. Perhaps it would be a good idea to have a face-to-face meeting of the area coordinators; that would be an improvement.
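To put the first point above in numbers, here is a small tally over the first-cycle review statistics quoted in that bullet (the total of ~300 reviews is the approximate figure given there):

```python
# First-cycle review statistics quoted above; the total is approximate.
total_reviews = 300
strong_accept = 1
accept = 21

positive = strong_accept + accept
print(f"Accept or Strong Accept reviews: {positive}/{total_reviews} = {positive / total_reviews:.1%}")  # ~7%
print(f"Reviews in the remaining categories: {1 - positive / total_reviews:.1%}")                       # ~93%
# With over 90% of reviews compressed into the middle and negative categories,
# collapsing the scale to Accept / Revise-and-resubmit / Reject loses little information.
```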

In the end...

it was a very enjoyable experience. I have now served as PC Chair of all three major database conferences (VLDB in 2004, ICDE in 2007, and SIGMOD in 2014), and each one is very different. I tried something new in each of these; some ideas were worthwhile, while others were later improved upon by others.

7 comments:

  1. "Double-blind is working very well and we should maintain it."

    As a leading researcher in a very data-driven domain, can you provide some data and statistics to back this claim up?

    1. I would be interested to hear what kind of statistics and data you would like to see. My comment is mainly in response to the often-heard claim that double blind causes undue difficulties -- that was not the case this year. The other issue is with respect to the distribution of papers across institutions and countries -- that is significantly improved with the introduction of double blind. A few years ago I had conducted an (albeit unscientific) study of the distribution for the five years immediately preceding the introduction of double blind and the five years immediately following it, and the results demonstrated the variability. There is an earlier study by Anthony Tung (http://dl.acm.org/citation.cfm?doid=1168092.1168093) that confirms the same.

    2. Given the well-documented effects of individual bias induced by a person's name alone, as well as bias in peer review (see: http://advance.cornell.edu/documents/nepotism_and_sexism.pdf, http://www.pnas.org/content/early/2012/09/14/1211286109, and http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2063742, to list but a few points in a very large body of research), you ought to need a really good reason for not using double blind.

    3. I agree, Sara. That probably is a contributing factor to the changes in the distribution of papers that we observe with double blind.

  2. One statistic I'd like to see is the quality of matching papers to reviewers. As a simple measure: how many reviews were done by reviewers who bid as "eager" to do the review, compared to the number who said they'd do it "in a pinch" or were even "unwilling to review"?

    1. Part of it is easy -- no one who said "unwilling to review" got the paper assigned to them, and almost everyone who indicated they were "eager" to review a paper got it. However, the challenge is that there are not too many bids in the "eager" category, so you struggle when you do assignments. For example, I tried a more even distribution of papers among reviewers by setting the minimum and maximum assignments close to each other, but CMT was not able to find a feasible allocation. Then you start increasing the maximum that can be assigned, and eventually you find a feasible solution. Another difficulty is that many reviewers simply indicate what they would be willing to review (in various categories) and what they would definitely not be willing to review, but leave the others at their default settings. The allocation algorithm treats these as "I don't care" rather than "I don't want to", and reviewers are sometimes surprised when they see these papers assigned to them. I did do some reallocation manually to deal with a few of these issues.
