Evidence Standards Improve Reliability in Scholarly Peer Review

June 20, 2023, 2:30 p.m. (CEST)

Paul Ralph, Ph.D. (British Columbia), is visiting GS IMTR as a guest professor next week and will give a research talk. Everybody is welcome to join!

Time: June 20, 2023, 2:30 p.m. – 4:00 p.m.
Lecturer: Paul Ralph, Ph.D. (British Columbia)
Meeting mode: in person
Venue: 2.013
Universitätsstr. 38

Background. Scholarly peer review is “the lynchpin about which the whole business of science is pivoted” (Ziman 1968). Most researchers believe peer review is effective (Ware 2008), but empirical research consistently shows that reviewers cannot reliably distinguish methodologically sound from fundamentally flawed studies (Cole 1981; Peters & Ceci 1982; Lock 1991; Rothwell & Martyn 2000; Price 2014; Ralph 2016). Consequently, Ralph et al. created comprehensive evidence standards and tools to improve peer review in software engineering and related fields.

Objective. The objective of this study is to investigate the impact of evidence standards on scholarly peer review.

Method. A randomized controlled experiment was conducted at an A-ranked software engineering conference. The program committee was randomly divided into two groups: one using a typical conference review process; the other using a standardized process based on the ACM SIGSOFT Empirical Standards for Software Engineering Research (https://acmsigsoft.github.io/EmpiricalStandards/).

Results. Evidence standards significantly improve inter-reviewer reliability without harming authors’ or reviewers’ attitudes toward the review process.

Discussion. Asking reviewers to write free-text comments about a paper and score it on a 6-point scale from strong reject to strong accept produces data statistically indistinguishable from random noise. This means that decisions are determined entirely by reviewer selection, not the merits of the research. Conventional review processes are therefore scientifically and morally indefensible. While evidence standards are not a silver bullet, standards-based review significantly improves reliability, and the data collected in this study facilitates further refinement of the standards and tooling toward still greater reliability.

Bio. Paul Ralph, Ph.D. (British Columbia), is an award-winning scientist, author, consultant, and Professor of Computer Science at Dalhousie University in Halifax, Canada. His cutting-edge research at the intersection of software engineering, human-computer interaction, and project management explores the relationship between software teams’ social dynamics and performance. Prof. Ralph’s research has been used by many leading technology companies including Adobe, Amazon, AT&T, Canon, BEA Systems, IBM, Google, HP, Microsoft, Netflix, PayPal, Samsung, Salesforce, VMware, Yahoo!, and Walmart. Dr. Ralph has published over 80 peer-reviewed articles and book chapters in prestigious venues including the International Conference on Software Engineering and IEEE Transactions on Software Engineering. Dr. Ralph is editor-in-chief of The Software Engineering Empirical Standards, the comprehensive reporting and reviewing guidelines for software engineering research. He was ranked the #2 software engineering researcher in Canada, and #18 in the world, by CSrankings.org (2014–2022).
