NB — The numbers might not always work out here: some data are missing from the analyses due to conflicts of interest.
This blog post covers the outcomes of the first round of reviewing for the CHI 2024 Papers track. Last week, authors were invited to revise and resubmit when at least one associate chair (AC) recommended Revise and Resubmit or better. This short post provides analyses that might help to contextualise review recommendations across submissions. It focuses on outcomes; we will reflect further on the review process and the huge contributions made by ACs and reviewers in a future post.
Review scales
Before we get into the outcomes, a reminder of the scales used during the CHI 2024 review process. Reviewers and ACs provide a recommendation (one of the five categories below) and can further contextualise their recommendation in terms of originality, significance, and research quality (each a 5-point ordinal scale: Very Low, Low, Medium, High, Very High).
Short Name | On Review Form | Meets Threshold for Revise and Resubmit?
---|---|---
A | I recommend Accept with Minor Revisions | Yes
ARR | I can go with either Accept with Minor Revisions or Revise and Resubmit | Yes
RR | I recommend Revise and Resubmit | Yes
RRX | I can go with either Reject or Revise and Resubmit | No
X | I recommend Reject | No
Outcomes after the first round of reviews
CHI 2024 Papers received 4028 complete submissions. Four papers were withdrawn and 207 papers (5%) were desk rejected before going out to review. 1651 submissions (41%) received at least one ‘RR’ (or better) recommendation from their first or second AC (1AC or 2AC) and were put forward to round 2. The remaining 2166 papers (54%) did not receive at least one ‘RR’ (or better) recommendation from either AC and were rejected, as they did not meet the threshold for entering the next round.
Of the papers moving to the next stage of review, the decision process of 1586 submissions could be further analysed (the others were redacted from the raw data due to Analytics Chair conflicts). Of these submissions, 1095 (69%) proceeded to revise and resubmit because both ACs made a recommendation of RR or better (i.e., RR, ARR or A). A further 360 (23%) proceeded to revise and resubmit because the 1AC gave a recommendation of RR or better but the 2AC gave an RRX or X recommendation (i.e., the 1AC recommendation carried the submission to revise and resubmit). In 131 cases (8%), the 1AC recommended RRX or X, but the 2AC recommended RR or better (i.e., the 2AC recommendation carried the submission into revise and resubmit).
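For the avoidance of doubt, here is a minimal sketch of that threshold rule in code. This is an illustration only – the names and types are ours, not the actual PCS implementation.

```python
# Recommendations at or above the Revise and Resubmit threshold.
SUPPORTIVE = {"A", "ARR", "RR"}

def advances_to_round_2(ac1_rec: str, ac2_rec: str) -> bool:
    """A submission advances if at least one AC recommends RR or better."""
    return ac1_rec in SUPPORTIVE or ac2_rec in SUPPORTIVE

assert advances_to_round_2("RR", "X")       # carried by the 1AC
assert advances_to_round_2("RRX", "ARR")    # carried by the 2AC
assert not advances_to_round_2("RRX", "X")  # below threshold: rejected
```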
Looking at review recommendations more widely across the 3678 Reject and RR submissions (NB – Analytics Chair conflicts account for the ‘missing’ 139 submissions), we can identify 40 submissions (1%) proceeding to RR with three or four A (accept) recommendations from ACs and reviewers. Meanwhile, 250 submissions (6.8%) were rejected with four X recommendations.
Progression from RR to final decisions
It is not possible to directly relate RR/Reject decisions to “average” scores – only one AC needs to recommend RR for a submission to go to revise and resubmit (79 submissions went through to revise and resubmit with one RR AC recommendation and the three other recommendations at X or RRX). However, we can still observe that RRX (revise and resubmit/reject) was the review recommendation that authors were most likely to see amongst their reviews (2599 submissions received one or more RRX). As you might expect, authors were least likely to find an A (accept) recommendation amongst their reviews (only 416 submissions received one or more A).
Even with reviews in hand, it’s useful to know where a given submission sits relative to others. One way to look at this is to calculate the proportion of “supportive” reviews for a given submission. Support is inferred from the reviewer’s actual recommendation, not from the text of the review. Figure 1 shows the proportion of reviews recommending RR or better (i.e., RR, ARR, A) for each submission.
Figure 1: How many supportive (A, ARR, RR) reviews does each submission have? With four reviews per submission, it’s somewhere between zero and four.
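For the curious, this measure could be computed along the following lines, assuming each submission’s four recommendations are available as strings (the function name is ours):

```python
SUPPORTIVE = {"A", "ARR", "RR"}

def positivity(recommendations: list[str]) -> float:
    """Proportion of supportive reviews; with four reviews per submission
    this is one of 0, 0.25, 0.5, 0.75 or 1."""
    return sum(rec in SUPPORTIVE for rec in recommendations) / len(recommendations)

print(positivity(["A", "RR", "RRX", "X"]))  # 0.5
```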
We can also break the recommendations down in the same way, but using only A and ARR as indicators of support (rather than A, ARR and RR); the table below shows the result, and a short sketch after it shows how the percentages read. Last year, 49% of submissions were invited to submit revisions; of those progressing to RR, 59% were subsequently accepted and 41% rejected. Considering the proportion of revised submissions accepted at CHI 2023, and imagining that this measure of support translates directly into acceptance (caveats abound!), submissions with no A/ARR reviews seem less likely to be accepted.
‘Support’ (A+ARR) | Reject (#) | Reject (%) | Revise and Resubmit (#) | Revise and Resubmit (%)
---|---|---|---|---
Zero | 1986 | 91.7% | 588 | 35.6%
One | 174 | 8% | 533 | 33.5%
Two | 6 | 0.3% | 262 | 15.9%
Three | 0 | 0% | 133 | 8.1%
Four | 0 | 0% | 115 | 7.0%
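To make the table’s construction explicit, here is a small sketch showing how the percentages read. Counts are taken from the table; recomputed percentages may differ slightly from the published ones because of the redacted (conflicted) submissions noted earlier.

```python
support_counts = {
    "Reject":              {"Zero": 1986, "One": 174, "Two": 6, "Three": 0, "Four": 0},
    "Revise and Resubmit": {"Zero": 588, "One": 533, "Two": 262, "Three": 133, "Four": 115},
}
for decision, counts in support_counts.items():
    total = sum(counts.values())  # percentages are within each decision column
    for level, n in counts.items():
        print(f"{decision:>19} | {level:>5} A/ARR reviews: {n:4d} ({n / total:.1%})")
```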
Sticking with the CHI 2023 data can also help give an idea of how things pan out in the end. We don’t have the data from the first round, so this plot is based on the final review recommendations after the second round of review. Reviewers could still choose A, ARR, RR, RRX, or X in round 2, even though only two decisions were possible at that point: Accept or Reject. It still gives an idea of where reviews might need to end up after round 2 for a submission to be accepted: an overwhelming majority (92%) of the submissions accepted after round 2 had four recommendations of RR or better. Figure 2 shows this distribution.
Figure 2: At CHI 2023, the vast majority of submissions that were accepted ended the review process with four ‘supportive’ reviews of A, ARR or RR.
Whether you’re contemplating your revisions, or your manuscript has been rejected and you’re thinking about where to send your work next, spare a thought for the authors of the 89 submissions that received a different recommendation from each of their four reviewers (8 were rejected; 81 advanced to revise and resubmit).
Outcomes as a function of submission type
This year marks the resurrection of different categories of submission under the Papers track. In previous years, CHI solicited “Notes” – short submissions of up to four pages plus references – which were last solicited at CHI 2017. This year, the call for papers identified three categories of submission: Short, Standard, and Excessively Long. Short submissions, of 5000 words or fewer, were “strongly encouraged”. Although not a facsimile of the old Note format, the call for Short submissions was intended to create a similar space for shorter pieces that perhaps make smaller contributions.
The table below shows a breakdown of submissions by type: how many papers of each kind were submitted, and what the outcomes were for those papers. To keep things simple, we focus on papers that went to full review and received either a revise and resubmit or a reject outcome, ignoring desk rejects and withdrawn papers (a short sketch after the table shows how the rates are computed).
Submission type | Total submissions | # moving to RR | # rejected | % moving to RR |
---|---|---|---|---|
Excessively long | 58 | 27 | 31 | 46.6% |
Short | 490 | 110 | 380 | 22% |
Standard | 3269 | 1514 | 1755 | 46.3% |
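As promised, the rate calculation is nothing more exotic than a ratio of the two counts:

```python
# (# moving to RR) / (total submissions), per submission type.
by_type = {
    "Excessively long": (27, 58),
    "Short": (110, 490),
    "Standard": (1514, 3269),
}
for kind, (to_rr, total) in by_type.items():
    print(f"{kind}: {to_rr}/{total} = {to_rr / total:.1%}")
```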
You can see that Excessively Long and Standard submissions move to RR at a similar rate. Short submissions are about half as likely as other submissions to move to RR. This pattern is similar to what we have seen in the past with Notes – acceptance rates are much lower than for ‘full’ papers.
Outcomes as a function of subcommittee
Our CHI 2024 blog post on submissions received broke down submission rates by subcommittee. How has variation across subcommittees fed into submission outcomes? To keep things straightforward again, let’s just consider submissions with RR and Reject decisions. These make up 95% of the decisions made and make the plots more readable. Figure 3 shows that the User Experience subcommittee passed the lowest proportion of papers to RR (36%). The Critical and Sustainable Computing subcommittee passed the highest proportion of papers to RR (52%).
Figure 3: What proportion of submissions to a given subcommittee have progressed to revise and resubmit?
In our CHI 2024 blog post on submissions received, we also showed the relative size of subcommittees in terms of the number of submissions they received. Given the differences in RR rates across subcommittees, the relative sizes of the subcommittees, in terms of the number of papers still under review, have changed. Figure 4 shows that the User Experience subcommittee was the biggest by number of submissions, but is now only the sixth biggest based on the number of submissions that remain under consideration.
Figure 4: How many submissions in each subcommittee are still part of the CHI 2024 review process (and how many were submitted)?
Note that the final acceptance rate for each subcommittee will also vary; it is not necessarily the case that the rank order presented below will hold precisely for acceptance rates. These data from CHI 2023 show the difference between the RR rate (i.e., the proportion of non-desk-rejected submissions invited to revise and resubmit) and the Accept rate (i.e., the proportion of submissions accepted after RR). The gap between these two rates varies across subcommittees by as much as twelve percentage points. (Unsurprisingly, RR rate still correlates very strongly with Accept rate, r(16)=0.86, p<.001; higher RR rates go with higher Accept rates and vice versa.)
Subcommittee | RR rate | Accept rate | Difference between RR and Accept (percentage points)
---|---|---|---
CompInt | 54% | 28% | -28
Critical | 64% | 34% | -28
Devices | 66% | 42% | -24
PeopleQual | 56% | 34% | -24
Games | 48% | 28% | -22
Privacy | 54% | 32% | -22
Systems | 52% | 28% | -22
Access | 56% | 36% | -20
Design | 50% | 28% | -20
Ibti | 56% | 36% | -20
IntTech | 52% | 32% | -20
Learning | 44% | 24% | -20
PeopleMixed | 44% | 24% | -20
PeopleQuant | 50% | 30% | -20
Viz | 58% | 38% | -20
Health | 48% | 30% | -18
UX | 42% | 24% | -18
Apps | 46% | 32% | -16
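If you want to check the correlation yourself, it can be approximated from the (rounded) values in the table above, e.g. with scipy; rounding means the result will only approximate the reported statistic.

```python
from scipy.stats import pearsonr

# RR and Accept rates in table order (CompInt ... Apps), as percentages.
rr_rate = [54, 64, 66, 56, 48, 54, 52, 56, 50, 56, 52, 44, 44, 50, 58, 48, 42, 46]
accept_rate = [28, 34, 42, 34, 28, 32, 28, 36, 28, 36, 32, 24, 24, 30, 38, 30, 24, 32]

r, p = pearsonr(rr_rate, accept_rate)
print(f"r({len(rr_rate) - 2}) = {r:.2f}, p = {p:.2g}")  # close to the reported r(16)=0.86
```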
Bonus Chartjunk
In our CHI 2024 blog post on submissions received, we analysed new data that has only become available in PCS, the conference submission system, this cycle. We considered when authors created their submissions for the Papers track, and when they made their final edits before the deadline. The general theme of those analyses was that authors do things at the last minute. As reviewers are drawn largely from the same pool as authors, are reviews also submitted at the last minute? It looks like reviewers generally accept review invitations quickly (Figure 5)… and finish their reviews at the last moment (Figure 6) – the deadline was the 25th of October! (NB – we’ve taken the first complete submission of a review as the signal for completion, but many reviewers update their reviews several times before, during, and after the discussion period.)
Figure 5: How quickly do reviewers accept invitations to review? Quite quickly!
Figure 6: How quickly do reviewers return their reviews? At the last minute!
Datatables
It is not possible to produce useful datatables for all analyses, many of which require access to individual submission and review data that cannot be easily or safely shared. However, in the interests of transparency we are trying to make the data that support the analyses available where we can.
The data on the recommendations made by reviewers across individual reviews (not submissions!) are given below:
Recommendation | n |
---|---|
A | 556 |
ARR | 1481 |
RR | 3620 |
RRX | 5042 |
X | 4044 |
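If you prefer proportions to raw counts, they follow directly from the table:

```python
counts = {"A": 556, "ARR": 1481, "RR": 3620, "RRX": 5042, "X": 4044}
total = sum(counts.values())  # 14743 reviews in total
for recommendation, n in counts.items():
    print(f"{recommendation:>3}: {n / total:.1%}")
```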
Figure 1 was plotted using this summary data:
Positivity | Decision | n |
---|---|---|
0 | RR | 0 |
0 | Reject | 1459 |
0.25 | RR | 86 |
0.25 | Reject | 534 |
0.5 | RR | 259 |
0.5 | Reject | 70 |
0.75 | RR | 599 |
0.75 | Reject | 1 |
1 | RR | 626 |
1 | Reject | 0 |
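Anyone wanting to re-plot something like Figure 1 from this table could do so along these lines (a sketch using matplotlib; the styling is ours and will not match the published figure):

```python
import matplotlib.pyplot as plt

positivity = [0, 0.25, 0.5, 0.75, 1]
reject = [1459, 534, 70, 1, 0]
rr = [0, 86, 259, 599, 626]

x = range(len(positivity))
width = 0.4
plt.bar([i - width / 2 for i in x], reject, width, label="Reject")
plt.bar([i + width / 2 for i in x], rr, width, label="Revise and Resubmit")
plt.xticks(list(x), [f"{p:.0%}" for p in positivity])
plt.xlabel("Proportion of supportive (A, ARR, RR) reviews")
plt.ylabel("Number of submissions")
plt.legend()
plt.show()
```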
Figure 2 was plotted using this summary data:
Positivity | Decision | n |
---|---|---|
0 | Accept | 0 |
0 | Reject | 111 |
0.25 | Accept | 0 |
0.25 | Reject | 176 |
0.5 | Accept | 6 |
0.5 | Reject | 107 |
0.75 | Accept | 58 |
0.75 | Reject | 43 |
1 | Accept | 722 |
1 | Reject | 15 |
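This table also lets us recover the 92% figure quoted in the text:

```python
# Of CHI 2023 submissions accepted after round 2, the share whose reviews
# all ended at RR or better (positivity = 1).
accepted = {0: 0, 0.25: 0, 0.5: 6, 0.75: 58, 1: 722}
total_accepted = sum(accepted.values())  # 786
print(f"{accepted[1] / total_accepted:.0%}")  # 92%
```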
Figures 3 and 4 were plotted using this data:
Subcommittee | Submissions total | Submissions to RR | RR Rate |
---|---|---|---|
Access | 267 | 136 | 51% |
Apps | 200 | 90 | 45% |
CompInt | 236 | 90 | 38% |
Critical | 183 | 96 | 52% |
Design | 252 | 103 | 41% |
Devices | 122 | 57 | 47% |
Games | 138 | 53 | 38% |
Health | 264 | 107 | 41% |
Ibti | 161 | 73 | 45% |
IntTech | 271 | 107 | 39% |
Learning | 212 | 88 | 42% |
PeopleMixed | 198 | 80 | 40% |
PeopleQual | 233 | 95 | 41% |
PeopleStat | 216 | 92 | 43% |
Privacy | 185 | 94 | 51% |
Systems | 259 | 125 | 48% |
UX | 270 | 98 | 36% |
Viz | 150 | 67 | 45% |
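A quick sketch of the re-ranking behind Figure 4, using the counts above (as noted earlier, UX comes sixth by submissions still under consideration):

```python
data = {
    "Access": (267, 136), "Apps": (200, 90), "CompInt": (236, 90),
    "Critical": (183, 96), "Design": (252, 103), "Devices": (122, 57),
    "Games": (138, 53), "Health": (264, 107), "Ibti": (161, 73),
    "IntTech": (271, 107), "Learning": (212, 88), "PeopleMixed": (198, 80),
    "PeopleQual": (233, 95), "PeopleStat": (216, 92), "Privacy": (185, 94),
    "Systems": (259, 125), "UX": (270, 98), "Viz": (150, 67),
}
# Sort subcommittees by the number of submissions moving to RR.
ranked = sorted(data.items(), key=lambda item: item[1][1], reverse=True)
for rank, (name, (total, to_rr)) in enumerate(ranked, start=1):
    print(f"{rank:2d}. {name:<12} {to_rr:3d} of {total} to RR ({to_rr / total:.0%})")
```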
The ‘bonus chartjunk’ plots are histograms produced from PCS logs, rather than a summary table, and so it is not possible to share this data.