CHI 2024 – Papers track, post-round one outcomes report

NB: The numbers might not always add up here, as some data are missing from the analyses due to conflicts.

This blog post covers the outcomes of the first round of reviewing for the CHI 2024 Papers track. Last week, authors were invited to revise and resubmit when at least one associate chair (AC) recommended Revise and Resubmit or better. This short post provides analyses that might help to contextualise review recommendations across submissions. This post focuses on outcomes; we will be reflecting further on the review process and the huge contributions made by ACs and reviewers in a future post.

Review scales

Before we go into the outcomes, a reminder of the scales that have been used during the CHI 2024 review process. Reviewers and ACs provide a recommendation (one of five categories) and can further contextualise their recommendation based on originality, significance, and research quality (each on a five-point ordinal scale: Very Low, Low, Medium, High and Very High).

Short Name	On Review Form	Threshold for Revise and Resubmit
A	I recommend Accept with Minor Revisions	Yes
ARR	I can go with either Accept with Minor Revisions or Revise and Resubmit	Yes
RR	I recommend Revise and Resubmit	Yes
RRX	I can go with either Reject or Revise and Resubmit	No
X	I recommend Reject	No

Outcomes after the first round of reviews

CHI 2024 Papers received 4028 complete submissions. Four papers were withdrawn and 207 papers (5%) were desk rejected before going out to review. 1651 (41%) received at least one ‘RR’ (or better) recommendation from 1AC or 2AC and were put forward to round 2. 2166 papers (54%) did not receive at least one ‘RR’ (or better) recommendation from 1AC or 2AC, and were rejected as they did not meet the threshold for entering the next round.
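
As a quick sanity check on the arithmetic, here is a minimal Python sketch (not the analysis code behind this post) that recomputes the shares from the counts quoted above. The dictionary keys are labels chosen for this example.

```python
# Counts reported in the paragraph above for the CHI 2024 Papers track.
total = 4028
outcomes = {
    "withdrawn": 4,
    "desk rejected": 207,
    "revise and resubmit": 1651,
    "rejected after review": 2166,
}

# The four outcomes should partition the complete submissions.
assert sum(outcomes.values()) == total

# Share of all complete submissions, rounded to whole percentages.
shares = {name: round(100 * n / total) for name, n in outcomes.items()}
for name, pct in shares.items():
    print(f"{name}: {outcomes[name]} ({pct}%)")
```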

Of the papers moving to the next stage of review, the decision process of 1586 submissions could be further analysed (the others were redacted from the raw data due to Analytics Chair conflicts). Of these submissions, 1095 (69%) proceeded to revise and resubmit because both ACs made a recommendation of RR or better (i.e., RR, ARR or A). Three hundred and sixty (23%) proceeded to revise and resubmit because the 1AC gave a recommendation of RR or better, but the 2AC gave an RRX or X recommendation (i.e., the 1AC recommendation carried the submission to revise and resubmit). In 131 cases (8%), the 1AC recommended RRX or X, but the 2AC recommended RR or better (i.e., the 2AC recommendation carried the submission into revise and resubmit).
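
The threshold rule can be written down in a few lines. This is a sketch under the assumption that the only inputs are the two AC recommendations, using the short names from the scale table; the function names and the outcome labels are made up for illustration.

```python
# Recommendations ordered from least to most supportive, as in the scale table.
SCALE = ["X", "RRX", "RR", "ARR", "A"]


def meets_threshold(rec: str) -> bool:
    """True if a recommendation is RR or better (RR, ARR or A)."""
    return SCALE.index(rec) >= SCALE.index("RR")


def round_one_outcome(rec_1ac: str, rec_2ac: str) -> str:
    """Describe how a submission fares given the two AC recommendations."""
    first, second = meets_threshold(rec_1ac), meets_threshold(rec_2ac)
    if first and second:
        return "RR: both ACs"
    if first:
        return "RR: carried by 1AC"
    if second:
        return "RR: carried by 2AC"
    return "Reject"


print(round_one_outcome("ARR", "RRX"))
print(round_one_outcome("X", "RRX"))
```

Reviewer recommendations are deliberately absent from the function: as described above, only the 1AC and 2AC recommendations decide whether a submission enters round 2.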

Looking at the review recommendations more widely across 3,678 Reject and RR submissions (NB – Analytics Chair conflicts account for the ‘missing’ 139 submissions), we can identify forty submissions (1%) proceeding to RR with three or four A (accept) recommendations from ACs and reviewers. Two hundred and fifty submissions (6.8%) were rejected with four X recommendations.

Progression from RR to final decisions

It is not possible to directly relate RR/Reject decisions to “average” scores – only one AC needs to recommend RR and submissions go to revise and resubmit (79 submissions have gone through to revise and resubmit with one RR AC recommendation and three others at X or RRX). However, we can still observe that RRX (revise and resubmit/reject) was the review recommendation that authors were most likely to see amongst their reviews (2,599 submissions received one or more RRX). As you might expect, authors were least likely to find an A (accept) recommendation amongst their reviews (only 416 submissions received one or more A).

Even with reviews in hand, it’s useful to know where a given submission sits compared to others. One way to look at this is by calculating the proportion of “supportive” reviews for a given submission. This support is inferred from the actual recommendation of the reviewer, not the text of the review. Figure 1 shows the proportion of reviews recommending RR or better (i.e., RR, ARR, A) for each submission.

A bar chart showing the proportion of supportive (i.e., RR, ARR, A) recommendations across submissions. The y-axis is a count of submissions, the x-axis is 0-4, depending on how many supportive reviews a given submission has had.

Figure 1: How many supportive (A, ARR, RR) reviews does each submission have? With four reviews per submission, it’s somewhere between zero and four.
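
To make the measure concrete, here is a small Python sketch of how a "proportion of supportive reviews" could be computed for a single submission. The function name and the example submission are hypothetical; only the definition of support (RR, ARR or A) comes from the text.

```python
SUPPORTIVE = {"A", "ARR", "RR"}


def support_proportion(recommendations, supportive=SUPPORTIVE):
    """Fraction of a submission's reviews that fall in the supportive set."""
    if not recommendations:
        raise ValueError("a submission needs at least one review")
    hits = sum(1 for rec in recommendations if rec in supportive)
    return hits / len(recommendations)


# Hypothetical submission: one RR, one ARR and two RRX recommendations.
example = ["RR", "ARR", "RRX", "RRX"]
print(support_proportion(example))                # 0.5
print(support_proportion(example, {"A", "ARR"}))  # 0.25
```

The second call uses the stricter definition of support (A and ARR only) that the next part of this section turns to.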

We can also break the recommendations down in the same way, but using A and ARR as indicators of support (rather than A, ARR and RR). Last year, 49% of submissions were invited to submit revisions. Of those submissions progressing to RR, 59% were subsequently accepted and 41% rejected. Considering the proportion of revised submissions accepted at CHI 2023, and imagining this measure of support translates directly into acceptance (caveats abound!), then it seems less likely for submissions that have no A/ARR reviews to be accepted.

‘Support’ (A+ARR)	Reject		Revise and Resubmit
	#	%	#	%
Zero	1986	91.7%	588	35.6%
One	174	8%	553	33.5%
Two	6	0.3%	262	15.9%
Three	0	0%	133	8.1%
Four	0	0%	115	7.0%

Sticking with the CHI 2023 data can also help give an idea of how things pan out in the end. We don’t have the data from the first round, so this plot is based on the final review recommendations after a second round of review. Reviewers could still choose A, ARR, RR, RRX, or X in round 2, even though only two decisions were meaningful at that point: Reject and Accept. But it still gives an idea of where reviews might need to end up after round 2 for a submission to be accepted; an overwhelming majority of the submissions accepted after round 2 (92%) had four recommendations of RR or better. Figure 2 shows this distribution.

A bar chart showing the proportion of supportive (i.e., RR, ARR, A) recommendations across submissions in relation to final recommendations and decisions for the CHI 2023 Papers track. The y-axis is a count of submissions, the x-axis is 0-4, depending on how many supportive reviews a given submission has had.

Figure 2: At CHI 2023, the vast majority of submissions that were accepted ended the Round 2 review process with four ‘supportive’ reviews of A, ARR or RR.

Whether you’re contemplating your revisions, or your manuscript has been rejected and you’re thinking about where to send your work next, spare a thought for the authors of 89 submissions who received a different recommendation from each of their four reviewers (8 were rejected, 81 advanced to revise and resubmit).

Outcomes as a function of submission type

This year marks the resurrection of different categories of submissions under the Papers track. CHI in previous years solicited “Notes”: short submissions of up to four pages plus references. Notes were last solicited at CHI 2017. This year, the call for papers identified three categories of submission: Short, Standard, and Excessively Long. Short submissions, of 5000 words or fewer, were “strongly encouraged”. Although not a facsimile of the old Note format, the call for Short submissions was intended to create a similar space for shorter pieces that perhaps make smaller contributions.

The table below shows a breakdown of submissions by type. It shows how many papers of each kind were submitted, and what the outcomes were for those papers. To keep things simple, we just focus on papers that went to full review and had either a revise and resubmit outcome or a reject outcome. We will ignore desk rejects and withdrawn papers.

Submission type	Total submissions	# moving to RR	# rejected	% moving to RR
Excessively long	58	27	31	46.6%
Short	490	110	380	22%
Standard	3269	1514	1755	46.3%

You can see that Excessively Long and Standard submissions move to RR at a similar rate. Short submissions are about half as likely as other submissions to move to RR. This pattern is similar to what we have seen in the past with Notes – acceptance rates are much lower than for ‘full’ papers.
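
The "about half as likely" comparison can be reproduced from the table. The sketch below is illustrative only; pooling Excessively Long and Standard submissions into a single "other" group is an assumption made for this example.

```python
# (moved to RR, rejected) per submission type, from the table above.
by_type = {
    "Excessively long": (27, 31),
    "Short": (110, 380),
    "Standard": (1514, 1755),
}


def rr_rate(rr, rejected):
    """Proportion of reviewed submissions that moved to revise and resubmit."""
    return rr / (rr + rejected)


rates = {name: rr_rate(*counts) for name, counts in by_type.items()}

# Pool the two longer categories to compare them against Short submissions.
other_rr = by_type["Excessively long"][0] + by_type["Standard"][0]
other_rej = by_type["Excessively long"][1] + by_type["Standard"][1]
ratio = rates["Short"] / rr_rate(other_rr, other_rej)

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
print(f"Short vs. other submissions: {ratio:.2f}x")
```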

Outcomes as a function of subcommittee

Our CHI 2024 blog post on submissions received broke down submission rates by subcommittee. How has variation across subcommittees fed into submission outcomes? To keep things straightforward again, let’s just consider submissions with RR and Reject decisions. These make up 95% of the decisions made and make plots more readable. Figure 3 shows that the User Experience subcommittee passed the lowest proportion of papers to RR (36%). The Critical and Sustainable Computing subcommittee passed the highest proportion of papers to RR (52%).

A bar chart showing the proportion of submissions moving to revise and resubmit for each of the CHI 2024 subcommittees.

Figure 3: What proportion of submissions to a given subcommittee have progressed to revise and resubmit?

In our CHI 2024 blog post on submissions received, we also showed the relative size of subcommittees in terms of the number of submissions that they received. Given the differences in the RR rates of the different subcommittees, the relative size of the subcommittees in terms of the number of papers under review has changed. Figure 4 shows that the User Experience subcommittee was second biggest by number of submissions, but is now sixth biggest based on the number of submissions that remain under consideration.

A bar chart showing the count of submissions left in the CHI 2024 review process for each of the CHI 2024 subcommittees.

Figure 4: How many submissions in each subcommittee are still part of the CHI 2024 review process (and how many were submitted)?

Note that the final acceptance rate for each subcommittee will also vary. It is not necessarily the case that the rank order of RR rates will carry over precisely to acceptance rates. These data from CHI 2023 show the difference between the RR rate (i.e., the proportion of non-desk rejected submissions invited to revise and resubmit) and the Accept rate (i.e., after RR). The gap between the two rates ranges from 16 to 28 percentage points, so it varies across subcommittees by as much as twelve percentage points. (Unsurprisingly, RR still very strongly correlates with Accept, r(16)=0.86, p<.001; subcommittees with a higher RR rate tend to have a higher Accept rate and vice versa.)

Subcommittee	RR rate	Accept rate	Percentage points difference between RR and Accept
CompInt	54%	28%	-28%
Critical	64%	34%	-28%
Devices	66%	42%	-24%
PeopleQual	56%	34%	-24%
Games	48%	28%	-22%
Privacy	54%	32%	-22%
Systems	52%	28%	-22%
Access	56%	36%	-20%
Design	50%	28%	-20%
Ibti	56%	36%	-20%
IntTech	52%	32%	-20%
Learning	44%	24%	-20%
PeopleMixed	44%	24%	-20%
PeopleQuant	50%	30%	-20%
Viz	58%	38%	-20%
Health	48%	30%	-18%
UX	42%	24%	-18%
Apps	46%	32%	-16%
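
The correlation quoted above can be approximated from the table. This is a minimal sketch in plain Python, not the original analysis: it uses the whole-percentage rates as printed, so the result is only as precise as the table allows, and `pearson_r` is a helper written for this example.

```python
from math import sqrt

# (RR rate %, Accept rate %) per CHI 2023 subcommittee, as printed in the table.
rates = {
    "CompInt": (54, 28), "Critical": (64, 34), "Devices": (66, 42),
    "PeopleQual": (56, 34), "Games": (48, 28), "Privacy": (54, 32),
    "Systems": (52, 28), "Access": (56, 36), "Design": (50, 28),
    "Ibti": (56, 36), "IntTech": (52, 32), "Learning": (44, 24),
    "PeopleMixed": (44, 24), "PeopleQuant": (50, 30), "Viz": (58, 38),
    "Health": (48, 30), "UX": (42, 24), "Apps": (46, 32),
}


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rr, accept = zip(*rates.values())
r = pearson_r(rr, accept)
# Degrees of freedom for a correlation are n - 2.
print(f"r({len(rr) - 2}) = {r:.2f}")
```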

Bonus Chartjunk

In our CHI 2024 blog post on submissions received, we analysed new data that have only become available in PCS, the conference submission system, in this cycle. We considered when authors created their submissions for the Papers track, and when they made their final edits before the deadline. The general theme of the analyses was that authors do things at the last minute. As reviewers are drawn from largely the same pool as authors, are reviews also submitted at the last minute? It looks like reviewers generally accept review invitations quickly (Figure 5)… and finish their reviews at the last moment (Figure 6); the deadline was the 25th October! (NB: we’ve taken the first complete submission of a review as the signal for completion, but many reviewers update their reviews several times before, during and after the discussion period.)

A histogram showing when reviewers accept invitations to review papers over the course of the review period. Reviews are accepted quite quickly.

Figure 5: How quickly do reviewers accept invitations to review? Quite quickly!

A histogram showing when reviewers return reviews over the course of the review period. Final reviews are returned at the last minute, for the most part.

Figure 6: How quickly do reviewers return their reviews? At the last minute!

Datatables

It is not possible to produce useful datatables for all analyses, many of which require access to individual submission and review data that cannot be easily or safely shared. However, in the interests of transparency we are trying to make the data that support the analyses available where we can.

The number of each recommendation made by reviewers across individual reviews (not submissions!) is given by:

Recommendation	n
A	556
ARR	1481
RR	3620
RRX	5042
X	4044
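
For readers who would rather see shares than raw counts, this short Python sketch converts the table above into proportions. The variable names are chosen for this example.

```python
# Recommendation counts across individual reviews, from the table above.
counts = {"A": 556, "ARR": 1481, "RR": 3620, "RRX": 5042, "X": 4044}

total_reviews = sum(counts.values())
proportions = {rec: n / total_reviews for rec, n in counts.items()}

for rec, share in proportions.items():
    print(f"{rec}: {share:.1%}")

# Share of individual reviews at RR or better.
supportive_share = sum(proportions[rec] for rec in ("A", "ARR", "RR"))
print(f"RR or better: {supportive_share:.1%}")
```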

Figure 1 was plotted using this summary data:

Positivity	Decision	n
0	RR	0
0	Reject	1459
0.25	RR	86
0.25	Reject	534
0.5	RR	259
0.5	Reject	70
0.75	RR	599
0.75	Reject	1
1	RR	626
1	Reject	0
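
The same summary data can be turned into the share of submissions that moved to RR at each level of support. This is a sketch for illustration; "Positivity" is read here as the proportion of a submission's four reviews at RR or better.

```python
# Figure 1 summary data: counts of RR and Reject decisions per positivity level.
figure1 = {
    0.0: {"RR": 0, "Reject": 1459},
    0.25: {"RR": 86, "Reject": 534},
    0.5: {"RR": 259, "Reject": 70},
    0.75: {"RR": 599, "Reject": 1},
    1.0: {"RR": 626, "Reject": 0},
}

# Share of submissions at each level that were invited to revise and resubmit.
rr_share = {
    level: row["RR"] / (row["RR"] + row["Reject"])
    for level, row in figure1.items()
}

for level, share in rr_share.items():
    print(f"positivity {level:.2f}: {share:.1%} moved to RR")
```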

Figure 2 was plotted using this summary data:

Positivity	Decision	n
0	Accept	0
0	Reject	111
0.25	Accept	0
0.25	Reject	176
0.5	Accept	6
0.5	Reject	107
0.75	Accept	58
0.75	Reject	43
1	Accept	722
1	Reject	15
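
The 92% figure quoted for CHI 2023 can be recovered from this table, along with an acceptance rate per level of support. Again, this is an illustrative sketch, not the code used to draw the figure.

```python
# Figure 2 summary data: (accepted, rejected) counts per positivity level.
figure2 = {
    0.0: (0, 111),
    0.25: (0, 176),
    0.5: (6, 107),
    0.75: (58, 43),
    1.0: (722, 15),
}

accepted_total = sum(accepted for accepted, _ in figure2.values())

# Of all accepted submissions, the share that had four supportive reviews.
share_full_support = figure2[1.0][0] / accepted_total

# Acceptance rate within each positivity level.
accept_rate = {
    level: accepted / (accepted + rejected)
    for level, (accepted, rejected) in figure2.items()
}

print(f"accepted with full support: {share_full_support:.0%}")
for level, rate in accept_rate.items():
    print(f"positivity {level:.2f}: {rate:.1%} accepted")
```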

Figures 3 and 4 were plotted using these data:

Subcommittee	Submissions total	Submissions to RR	RR Rate
Access	267	136	51%
Apps	200	90	45%
CompInt	236	90	38%
Critical	183	96	52%
Design	252	103	41%
Devices	122	57	47%
Games	138	53	38%
Health	264	107	41%
Ibti	161	73	45%
IntTech	271	107	39%
Learning	212	88	42%
PeopleMixed	198	80	40%
PeopleQual	233	95	41%
PeopleStat	216	92	43%
Privacy	185	94	51%
Systems	259	125	48%
UX	270	98	36%
Viz	150	67	45%
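
To see how subcommittee sizes shift between submission and round 2, the table can be ranked two ways. This is a small sketch with a hypothetical helper, `rank_by`; ties are broken by table order, which is an arbitrary choice.

```python
# (submissions total, submissions to RR) per subcommittee, from the table above.
subcommittees = {
    "Access": (267, 136), "Apps": (200, 90), "CompInt": (236, 90),
    "Critical": (183, 96), "Design": (252, 103), "Devices": (122, 57),
    "Games": (138, 53), "Health": (264, 107), "Ibti": (161, 73),
    "IntTech": (271, 107), "Learning": (212, 88), "PeopleMixed": (198, 80),
    "PeopleQual": (233, 95), "PeopleStat": (216, 92), "Privacy": (185, 94),
    "Systems": (259, 125), "UX": (270, 98), "Viz": (150, 67),
}


def rank_by(index):
    """Rank subcommittees (1 = largest) by the chosen column of the table."""
    ordered = sorted(
        subcommittees,
        key=lambda name: subcommittees[name][index],
        reverse=True,
    )
    return {name: position for position, name in enumerate(ordered, start=1)}


by_total = rank_by(0)  # rank by submissions received
by_rr = rank_by(1)     # rank by submissions still under consideration

for name in ("UX", "IntTech", "Access"):
    print(f"{name}: rank {by_total[name]} by submissions, {by_rr[name]} by RR")
```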

The ‘bonus chartjunk’ plots are histograms produced from PCS logs, rather than a summary table, and so it is not possible to share this data.