The Scientific Community's Response to Evidence of
Fraudulent Publication
The Robert Slutsky Case
(JAMA. 1994;272:170-173)
William P. Whitely, MBA; Drummond Rennie, MD; Arthur W. Hafner,
PhD
Objective.--To determine whether scientists can detect
fraudulent results in published research articles and to identify
corrective measures that are most effective in purging fraudulent
results from the literature.
Design.--Retrospective case-control study comparing articles
by an author known to have published fraudulent articles, Robert A.
Slutsky, MD, to a set of control articles. The number of
non-self-citations received by each article during each calendar year
(1979 through 1990) was counted. The citation numbers were transformed
into scores. Each Slutsky article was assigned a score between 1 and 3
based on the number of citations received by the Slutsky article and
each of its assigned control articles. Average citation numbers and
scores were tracked for each year during the 11-year study period.
Results.--Before Slutsky's work was publicly questioned
(1979 to 1985), scientists cited his articles as frequently as they
cited control articles. After Slutsky's work was questioned and
reports were published in the news media (1985), scientists cited his
articles less frequently than they cited control articles. Citations
decreased further after the University of California-San Diego
published a review of the validity of Slutsky's work in 1987.
Citations did not decrease after the appearance of retractions in print
or in MEDLINE.
Conclusion.--Scientists do not, and probably cannot,
identify published articles that are fraudulent. However, when alerted
to the presence of fraudulent results in the literature, the scientific
community responds by reducing the number of citations of the tainted
articles. In the Slutsky case, general news articles and the three
reviews published by the University of California-San Diego were most
effective and retractions were least effective in purging fraudulent
results from the literature.
SCIENTISTS recognize that it is impossible to check
everything their colleagues publish, and so the scientific enterprise
depends on substantial, mutual trust among researchers. Scientists must
assume that their colleagues select, analyze, and describe results
honestly and that, although errors may occur, no investigator
intentionally fabricates data. However, a few investigators are known
to have breached this trust by publishing results that have later been
shown to be fraudulent. How does the scientific community react to news
that a colleague has fabricated data?
Previous studies have shown that when an author is accused of
misconduct, the number of citations of his or her articles declines.
Pfeifer and Snodgrass[1] studied a group of articles that
were published and then later retracted for reasons ranging from
innocent error to intentional fabrication of data and reported that
these articles received fewer citations than a relevant control group.
Garfield and Welljams-Dorof[2] analyzed
articles published by Stephen E. Breuning, who was accused of scientific
fraud in 1988. They found that the number of citations to Breuning's
articles declined.
Our study extends previous research in two principal areas. First, we
asked whether scientists are able to detect fraudulent articles before
they are exposed. Second, we wanted to determine what types of
information (eg, general news, published retractions, MEDLINE
retractions, or scientific review articles) most effectively help
scientists to identify and censure fraudulent reports.
Our study focuses on reports authored or coauthored by Robert A.
Slutsky, MD, a cardiologist, who resigned on April 30, 1985, from his
post at the University of California-San Diego (UCSD). Slutsky's
resignation was precipitated when a fellow faculty member raised
questions about apparent duplications in two of Slutsky's published
articles. Shortly afterward, an investigating committee requested
retraction of two fraudulent articles, and Slutsky's lawyer asked that
15 articles, published in eight journals, be retracted.[3]
Slutsky's case is unusual in that he published 137 articles in 7
years, a rate of one every 13 working days. It is unique in that a
faculty committee investigated the integrity of the data in every
article in which Slutsky's name appeared. Hence, we have a clear idea
of the actual scientific integrity, the trustworthiness, of his various
publications. The faculty committee concluded that of the 137 articles,
77 were valid, 48 were questionable, and 12 were
fraudulent.[3] Because it is difficult to understand how
scientists could cite an article they know to be "questionable," we
refer to the questionable and fraudulent articles collectively as
"nonvalid" articles.
While news of Slutsky's fabrications may have begun to leak out as
early as April 1985, his misconduct became generally known on September
12, 1985, when the Los Angeles Times broke the story. The
newspaper published five articles during the period from September 12
through September 25 detailing Slutsky's misconduct, his resignation, his
effort to join a group medical practice in New York, and his release
from that practice on September 20 (Los Angeles Times, Home
Edition. September 12, 1985:1; Los Angeles Times, San Diego
County Edition. September 12, 1985:1; Los Angeles Times, San
Diego County Edition. September 17, 1985:1; Los Angeles Times,
San Diego County Edition. September 21, 1985:3; Los Angeles
Times, San Diego County Edition. September 25, 1985:2). Eleven
months later, on October 9, 1986, a second UCSD committee investigating
Slutsky released its findings, which were widely reported in newspapers
and scientific periodicals throughout the month of October (Los
Angeles Times, San Diego County Edition. October 9, 1986:1;
The Associated Press. October 9, 1986; The New York
Times, Late City Final Edition. October 10, 1986:6; Chicago
Tribune, National Edition. October 10,
1986:6).[4-6]
On November 26, 1987, the UCSD committee published its full results in the
New England Journal of Medicine.[3]
METHODS
Test Group and Control Groups
We began our study by reviewing a bibliography of Slutsky's
publications.[3] The bibliography showed that Slutsky
published 137 articles in 31 different journals. To facilitate data
collection and to reduce interjournal differences, we concentrated on
the five journals that published the greatest number of Slutsky
articles. These five journals were the American Heart Journal,
the American Journal of Cardiology, Circulation,
Investigative Radiology, and Radiology. Each of these
five journals published at least 10 articles by Slutsky, and the five
journals together accounted for 86 (63%) of Slutsky's 137 articles.
We assigned two unique control articles to each Slutsky article. We
chose the two control articles at random from the set of all
non-Slutsky articles that appeared in the same section of the same
issue of the same journal as the Slutsky article. To select control
articles, we assigned consecutive numbers to each candidate article and
then selected two of the articles using a random-number generator.
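The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the function name `pick_controls`, the `seed` parameter, and the use of Python's `random` module are all assumptions introduced here.

```python
import random


def pick_controls(candidates, k=2, seed=None):
    """Pick k distinct control articles at random from the candidates.

    `candidates` is assumed to be the list of non-Slutsky articles from
    the same section of the same issue as the Slutsky article.
    """
    if len(candidates) < k:
        raise ValueError("not enough candidate articles to choose from")
    rng = random.Random(seed)
    # Assign consecutive numbers (1, 2, 3, ...) to the candidate articles.
    numbered = dict(enumerate(candidates, start=1))
    # Draw k distinct numbers, so no control article is chosen twice.
    chosen = rng.sample(sorted(numbered), k)
    return [numbered[i] for i in chosen]
```

Sampling without replacement keeps the two controls distinct from each other, and a fixed `seed` makes a given draw reproducible.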
Data Retrieval and Verification
We collected citation data for 86 Slutsky articles and 172 control
articles (a total of 258 cited reference articles) using a database of
citation information, the SCISEARCH database, which is produced by the
Institute for Scientific Information (ISI), Philadelphia, Pa. We
excluded self-citations (citing references that were written by one of
the authors of the cited reference) and articles and editorials on the
topic of fraud. The remaining citations reflected the use of the
article by objective researchers in ongoing medical research.
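The self-citation rule above can be sketched as a simple author-overlap check. This is an illustration only: the function names and the name normalization (trimming and lowercasing) are assumptions, and the sketch does not cover the separate exclusion of articles and editorials on the topic of fraud.

```python
def is_self_citation(cited_authors, citing_authors):
    """True if the citing reference shares an author with the cited one."""
    # Hypothetical normalization: ignore case and surrounding spaces.
    cited = {name.strip().lower() for name in cited_authors}
    citing = {name.strip().lower() for name in citing_authors}
    return bool(cited & citing)


def count_non_self_citations(cited_authors, citing_records):
    """Count citing references written by none of the cited authors.

    `citing_records` is assumed to be a list of author lists, one per
    citing reference.
    """
    return sum(
        1
        for authors in citing_records
        if not is_self_citation(cited_authors, authors)
    )
```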
Statistical Methods and Analysis
We compared Slutsky articles with control articles by plotting the
average number of citations to both types of articles during the
11-year study period. To measure the statistical significance of the
difference in the number of citations received by Slutsky and control
articles, we transformed the citation data into scores. We assigned
each Slutsky article a score between 1 and 3 based on the number of
citations received by the Slutsky article and each of its two control
articles. If the Slutsky article had the most citations, it received a
score of 3. If the Slutsky article had more citations than one control
article but fewer than the other, it received a score of 2. If the
Slutsky article had the fewest citations, it received a score of 1. In
the event of ties, we assigned scores of 1.5 or 2.5. If all three
articles received zero citations, we dropped the data point from the
analysis.
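The scoring rule can be sketched as a mid-rank calculation. The function name `slutsky_score` is hypothetical, and one case is an assumption: the text does not say what score applies when all three articles tie at a nonzero count, and this sketch returns 2 in that case.

```python
def slutsky_score(slutsky, control_a, control_b):
    """Score a Slutsky article against its two controls for one year.

    Returns 3 if the Slutsky article has the most citations, 2 if it
    falls between the controls, 1 if it has the fewest, 1.5 or 2.5 for
    ties, and None if all three counts are zero (data point dropped).
    """
    if slutsky == 0 and control_a == 0 and control_b == 0:
        return None
    score = 1.0
    for control in (control_a, control_b):
        if slutsky > control:
            score += 1.0   # Slutsky article beats this control
        elif slutsky == control:
            score += 0.5   # a tie counts as half a rank
    return score
```

For example, `slutsky_score(5, 5, 2)` gives 2.5 (tied for the most citations) and `slutsky_score(2, 2, 5)` gives 1.5 (tied for the fewest).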
To test statistical significance, we relied on the expectation that if,
on average, Slutsky articles received the same number of citations as
control articles, then the distribution of scores (1, 2, 3) for Slutsky
articles would be uniform and the expected average score would be 2.
Furthermore, the normal approximation to the distribution of the
average score would have a mean of 2 and an SD of the square root of
[0.6667/(n-1)], where 0.6667 is the variance of a single score under
the uniform distribution and n would be the number of
Slutsky articles included in the average score. Using the normal
approximation, we calculated the probability that the actual score
received by Slutsky articles was significantly different from the
expected score.
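The test described above can be sketched as a z test against the expected score of 2. The function name `score_test` is hypothetical, and the two-sided P value is an assumption, since the text does not state whether a one-sided or two-sided test was used. The sketch uses the exact variance 2/3, which the text rounds to 0.6667.

```python
import math


def score_test(scores):
    """Compare an average score with the expected score of 2.

    Returns (mean, z, p), where p is a two-sided P value from the
    normal approximation with SD = sqrt((2/3) / (n - 1)).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(scores) / n
    # Variance of a single score on {1, 2, 3} under uniformity is 2/3.
    sd = math.sqrt((2.0 / 3.0) / (n - 1))
    z = (mean - 2.0) / sd
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return mean, z, p
```

With seven articles that all score 1, the mean is 1, the SD is one third, and z is -3.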
Differences Between Valid and Nonvalid Slutsky Articles
By transforming citation data into scores, we were also able to detect
differences between the performance of Slutsky's valid and nonvalid
articles. Because the expected score for both valid and nonvalid
articles is 2, the two sets of articles can be plotted on the same
graph and compared. Thus, we first plotted valid articles against
nonvalid articles for every year during the 11-year study period.
Twenty-eight of Slutsky's 37 nonvalid articles were retracted in
print. To determine whether the appearance of a printed retraction
helped to drive down citation rates, we plotted average score for these
28 articles against average score for the nine articles that were not
retracted in print.
Nine of Slutsky's 37 nonvalid articles were retracted in the National
Library of Medicine's MEDLINE database.[7] To determine
whether the appearance of a retraction notice in MEDLINE helped to
drive down citation rates, we plotted average score for these nine
articles against the average score of the 28 nonvalid articles that
were not retracted in MEDLINE.
RESULTS
Figure 1 summarizes the average citation performance of
Slutsky articles relative to control articles during the entire 11-year
study period. Before Slutsky's fraudulent activities were exposed
(1979 to 1985), Slutsky articles and control articles received
approximately the same number of citations. However, after Slutsky's
misconduct was exposed (1986 to 1990), Slutsky articles received
approximately one fewer citation per article per year.
Figure 2 shows the data in Figure 1 transformed into
a score. The scores received by Slutsky articles declined below the
expected score of 2 at about the same time news of Slutsky's
misconduct began to appear in the lay and scientific press.
The Table confirms that before 1985, the scores
received by Slutsky's articles were not significantly different from
the expected score of 2. After 1985, the scores received by Slutsky
articles were significantly lower than the expected scores.
Figure 3 divides Slutsky's articles into those that were
judged by the UCSD committee to be valid and those that were judged to
be nonvalid. It shows that after 1985, Slutsky's nonvalid articles
received consistently lower scores than valid articles and that the
widest divergence in scores occurred after 1987.
Figure 3 also shows that in 1990, the score of nonvalid articles
increased. This increase does not mean that the number of citations of
Slutsky articles increased. In fact, the number of citations to Slutsky
articles decreased. The increase in the score of nonvalid articles
occurred because the number of citations to control articles declined
more broadly and steeply than the number of citations to Slutsky
articles. In 1990, 42 (57%) of 74 control articles received fewer
citations than in 1989. In contrast, only six (16%) of 37 Slutsky
articles received fewer citations in 1990 compared with 1989. Nine
(12%) of 74 control articles received more citations in 1990 than in
1989, and three (8%) of 37 Slutsky articles received more citations in
1990 than in 1989. For raw citations, the number of citations of
control articles decreased by 0.91 citations per article from 2.11 to
1.20 citations. The number of citations to Slutsky articles decreased
by only 0.14 citations per article, from 0.30 to 0.16 citations. If we
had collected additional years of citations, they would probably have shown
a continued rise in the score as the number of citations to control
articles continued to drop toward zero.
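A toy example may help show how a score can rise while citations fall. The citation counts below are hypothetical and chosen only for illustration; they are not data from the study. The helper `rank_score` is likewise an assumed name that applies the scoring rule from the "Methods" section.

```python
def rank_score(slutsky, controls):
    """Score of 1 to 3 for a Slutsky article against its two controls."""
    if slutsky == 0 and all(c == 0 for c in controls):
        return None  # all three articles uncited: data point dropped
    return 1.0 + sum(
        1.0 if slutsky > c else 0.5 if slutsky == c else 0.0
        for c in controls
    )


# Hypothetical counts: the Slutsky article drops from 1 citation to 0,
# while its controls drop from (3, 2) to (0, 1).
score_1989 = rank_score(1, (3, 2))  # fewest citations of the three
score_1990 = rank_score(0, (0, 1))  # now tied with one control

print(score_1989, score_1990)
```

Because the score is relative, a steep fall in control citations creates ties at or near zero, which lifts the Slutsky article's score even though its own citation count went down.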
As described in the "Methods" section, we also plotted retracted
articles (printed and MEDLINE retractions) against unretracted
articles. We detected no significant difference in the scores received
by retracted and unretracted articles.
COMMENT
The first step in our statistical analysis was to transform raw
citation data into scores. The advantages of the scoring method are
that it is simple to calculate, is easy to interpret, and reduces
variability in the underlying raw data. The limits of the scoring
method are that it does not report actual citation counts and it
reduces the power of the statistical tests compared with alternatives
such as logistic regression. Because we were able to achieve sufficient
levels of statistical significance using the scoring method, we decided
that it was unnecessary to use a more powerful but more complicated
statistical approach.
It is our experience that readers with no more than the published
article to go on cannot recognize fraudulent work that is done by a
sophisticated researcher. Sloppy fabrications may be screened out by
editorial and peer review, but clever fabrications cannot be detected,
except in the institution where the research is performed or when
attempts to replicate the results fail or when laboratory notes,
patient charts, and calculations are thoroughly reviewed by a critical
and objective third party.
Indeed, in the case of Slutsky, we show that scientists did not suspect
that some of Slutsky's articles were fraudulent until articles about
Slutsky's misconduct appeared in the lay press in 1985. Before 1985,
Slutsky articles received approximately the same number of citations as
did control articles. After 1985, Slutsky articles received fewer
citations than did control articles and their scores were significantly
below the expected score of 2.
We also show that the UCSD report may have been especially effective in
helping scientists to identify and purge fraudulent articles. From 1985
to 1987, the scores of Slutsky's valid and nonvalid articles declined
in tandem. However, after the UCSD report was released (1986) and
published (1987), the scores of nonvalid articles declined relative to
the scores of valid articles. The spread between the scores of nonvalid
and valid articles was at its widest points in 1988 and 1989.
Finally, we could detect no correlation between the publication of
retractions and citation rates.
CONCLUSIONS
By studying the pattern of citations of Slutsky's articles, we found
that scientists could not detect fraudulent articles that were
sufficiently cleverly fabricated to pass editorial peer review.
However, once Slutsky's misconduct was exposed, scientists reacted by
citing his work less than they cited matched control articles. In the
Slutsky case, general news articles and the review published by the
UCSD committee most effectively decreased the number of citations of
Slutsky articles.
Scientists clearly heeded the evaluations and warnings of the UCSD
committee. The fact that the committee had been convened appeared in
the lay press in 1985 and this immediately drove down the number of
citations of all of Slutsky's articles. In 1986, the UCSD committee
released its report, which set off a second round of reduction in
citations, especially of articles that were nonvalid. Our results
suggest that academic institutions can play a key role in purging
fraudulent research from the scientific literature.
From the Department of Information Analysis, American Medical
Association (Mr Whitely and Dr Hafner), and
JAMA (Dr Rennie),
Chicago, Ill. Mr Whitely is now with Price Waterhouse, San Francisco,
Calif, and Dr Hafner is a consultant in Glenview, Ill.
Presented in part at the Second International Congress on Peer Review
in Biomedical Publication, Chicago, Ill, September 11, 1993.
We express our appreciation to Edgar J. Asebey for his
participation in an earlier study that led to this research.
Reprint requests to Price Waterhouse, 555 California St, San Francisco,
CA 94104 (Mr Whitely).
References
1. Pfeifer MP, Snodgrass GL. The continued use of
retracted, invalid scientific literature. JAMA.
1990;263:1420-1423.
2. Garfield E, Welljams-Dorof A. The impact of fraudulent
research on the scientific literature: the Stephen E. Breuning Case.
JAMA. 1990;263:1424-1426.
3. Engler RL, Covell JW, Friedman PJ, Kitcher PS, Peters
RM. Misrepresentation and responsibility in medical research. N
Engl J Med. 1987;317:1383-1389.
4. Locke R. Another damned by publications.
Nature. 1986;324:401.
5. Greenberg DS. Massive research fraud uncovered at UC San
Diego. Sci Government Rep. 1986;16:4-5.
6. Marshall E. San Diego's tough stand on research fraud.
Science. 1986;234:534-535.
7. Friedman PJ. Correcting the literature following
fraudulent publication. JAMA. 1990;263:1416-1419.