Poster Session Presentations, September 18

RELATIONSHIP OF EDITORIAL RATINGS OF REVIEWERS TO REVIEWER PERFORMANCE ON A STANDARDIZED TEST INSTRUMENT

Michael L Callaham,1,2 Joseph Waeckerle,1,3 and William Baxt1,4
1Annals of Emergency Medicine; 2University of California, San Francisco, PO Box 0208, San Francisco, CA 94143-0208, USA; 3University of Missouri at Kansas City, School of Medicine, Kansas City, KS, USA; 4Department of Emergency Medicine, University of Pennsylvania, Philadelphia, PA, USA

Objective: To determine whether editorial ratings of peer reviewers accurately reflect a reviewer's ability to detect manuscript flaws.

Design: Thirty editors at Annals of Emergency Medicine routinely rate reviews of submitted manuscripts on a subjective 1 to 5 ordinal scale. In the fall of 1994, all active reviewers were separately sent a fictitious test manuscript containing 23 deliberate flaws and asked to review it; they were blinded to its true purpose. Editorial ratings of reviewers, reviewer performance calculations (individual acceptance rate and congruence with editorial decisions), and measures of reviewer experience were compared with each reviewer's ability to detect the flaws in the test manuscript.

Results: Seventy-eight percent of the reviewers available in the fall of 1994 evaluated the fictitious manuscript; 127 reviewers detected a mean of 3.4 (SD 1.6) of the 10 major flaws and 3.1 (SD 1.7) of the 13 minor flaws. These same reviewers each reviewed a mean of 7.5 genuine submitted manuscripts (SD 4.2) between January 1994 and August 1996 and were rated by 30 editors. The mean editorial rating for each reviewer was modestly correlated with the number of major and minor flaws the reviewer detected (R=.44 and .43, respectively). Each additional rating point corresponded to 1 more major flaw detected. Individual reviewer acceptance rate and congruence with editorial decisions were not associated with detection of flaws (R=.22 and .16, respectively). Years of experience reviewing, volume of reviews for the journal, number of authored manuscripts, and combinations thereof did not predict performance on the fictitious manuscript or reviewer ratings.
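
To illustrate the two kinds of figures reported above (a correlation coefficient and a "flaws per rating point" slope), the following is a minimal sketch in Python. The abstract does not state which correlation coefficient or regression method was used, so Pearson's r and an ordinary least-squares slope are assumptions here. The function names and the six data points are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ols_slope(xs, ys):
    """Least-squares slope of y on x: change in y per one-unit change in x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical reviewers: mean editorial rating (1 to 5 scale) and the
# number of major flaws each detected in a test manuscript.
# These values are invented for illustration only.
mean_rating = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
major_flaws = [2, 4, 2, 5, 3, 5]

r = pearson_r(mean_rating, major_flaws)
slope = ols_slope(mean_rating, major_flaws)
print(f"correlation r = {r:.2f}")
print(f"flaws detected per rating point = {slope:.2f}")
```

Read this way, a slope near 1 would mean that a reviewer rated one point higher detected about one more major flaw on average, while r describes how tightly the individual reviewers cluster around that trend.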

Conclusions: A subjective ordinal rating scale applied by editors to reviews of submitted manuscripts was only modestly correlated with the ability of a blinded reviewer to detect deliberate flaws in a test manuscript. Measures of individual reviewer acceptance rate, reviewer congruence with editorial decisions, and reviewer experience were not correlated with error detection.
