Scientific Misconduct and Peer Review

The Scientific Community's Response to Evidence of Fraudulent Publication

The Robert Slutsky Case

(JAMA. 1994;272:170-173)

William P. Whitely, MBA; Drummond Rennie, MD; Arthur W. Hafner, PhD

Objective.--To determine whether scientists can detect fraudulent results in published research articles and to identify corrective measures that are most effective in purging fraudulent results from the literature.

Design.--Retrospective case-control study comparing articles by an author known to have published fraudulent articles, Robert A. Slutsky, MD, to a set of control articles. The number of non-self-citations received by each article during each calendar year (1979 through 1990) was counted. The citation numbers were transformed into scores. Each Slutsky article was assigned a score between 1 and 3 based on the number of citations received by the Slutsky article and each of its assigned control articles. Average citation numbers and scores were tracked for each year during the 11-year study period.

Results.--Before Slutsky's work was publicly questioned (1979 to 1985), scientists cited his articles as frequently as they cited control articles. After Slutsky's work was questioned and reports were published in the news media (1985), scientists cited his articles less frequently than they cited control articles. Citations decreased further after the University of California-San Diego published a review of the validity of Slutsky's work in 1987. Citations did not decrease after the appearance of retractions in print or in MEDLINE.

Conclusion.--Scientists do not, and probably cannot, identify published articles that are fraudulent. However, when alerted to the presence of fraudulent results in the literature, the scientific community responds by reducing the number of citations of the tainted articles. In the Slutsky case, general news articles and the review published by the University of California-San Diego were most effective and retractions were least effective in purging fraudulent results from the literature.



SCIENTISTS recognize that it is impossible to check everything their colleagues publish, and so the scientific enterprise depends on substantial, mutual trust among researchers. Scientists must assume that their colleagues select, analyze, and describe results honestly and that, although errors may occur, no investigator intentionally fabricates data. However, a few investigators are known to have breached this trust by publishing results that have later been shown to be fraudulent. How does the scientific community react to news that a colleague has fabricated data?

Previous studies have shown that when an author is accused of misconduct, the number of citations of his or her articles declines. Pfeifer and Snodgrass[1] studied a group of articles that were published and then later retracted for reasons ranging from innocent error to intentional fabrication of data and reported that these articles received fewer citations than a relevant control group. Garfield and Welljams-Dorof[2] analyzed articles published by Stephen E. Breuning, who was accused of scientific fraud in 1988. They found that the number of citations to Breuning's articles declined.

Our study extends previous research in two principal areas. First, we asked whether scientists are able to detect fraudulent articles before they are exposed. Second, we wanted to determine what types of information (eg, general news, published retractions, MEDLINE retractions, or scientific review articles) most effectively help scientists to identify and censure fraudulent reports.

Our study focuses on reports authored or coauthored by Robert A. Slutsky, MD, a cardiologist, who resigned on April 30, 1985, from his post at the University of California-San Diego (UCSD). Slutsky's resignation was precipitated when a fellow faculty member raised questions about apparent duplications in two of Slutsky's published articles. Shortly afterward, an investigating committee requested retraction of two fraudulent articles, and Slutsky's lawyer asked that 15 articles, published in eight journals, be retracted.[3] Slutsky's case is unusual in that he published 137 articles in 7 years, a rate of one every 13 working days. It is unique in that a faculty committee investigated the integrity of the data in every article in which Slutsky's name appeared. Hence, we have a clear idea of the actual scientific integrity, the trustworthiness, of his various publications. The faculty committee concluded that of the 137 articles, 77 were valid, 48 were questionable, and 12 were fraudulent.[3] Because it is difficult to understand how scientists could cite an article they know to be "questionable," we refer to the questionable and fraudulent articles collectively as "nonvalid" articles.

While news of Slutsky's fabrications may have begun to leak out as early as April 1985, his misconduct became generally known on September 12, 1985, when the Los Angeles Times broke the story. The newspaper published five articles during the period from September 12 through 25 detailing Slutsky's misconduct, his resignation, his effort to join a group medical practice in New York, and his release from that practice on September 20 (Los Angeles Times, Home Edition. September 12, 1985:1; Los Angeles Times, San Diego County Edition. September 12, 1985:1; Los Angeles Times, San Diego County Edition. September 17, 1985:1; Los Angeles Times, San Diego County Edition. September 21, 1985:3; Los Angeles Times, San Diego County Edition. September 25, 1985:2). About 13 months later, on October 9, 1986, a second UCSD committee investigating Slutsky released its findings, which were widely reported in newspapers and scientific periodicals throughout the month of October (Los Angeles Times, San Diego County Edition. October 9, 1986:1; The Associated Press. October 9, 1986; The New York Times, Late City Final Edition. October 10, 1986:6; Chicago Tribune, National Edition. October 10, 1986:6).[4][5][6] On November 26, 1987, the UCSD committee published its full results in the New England Journal of Medicine.[3]

METHODS

Test Group and Control Groups

We began our study by reviewing a bibliography of Slutsky's publications.[3] The bibliography showed that Slutsky published 137 articles in 31 different journals. To facilitate data collection and to reduce interjournal differences, we concentrated on the five journals that published the greatest number of Slutsky articles. These five journals were the American Heart Journal, the American Journal of Cardiology, Circulation, Investigative Radiology, and Radiology. Each of these five journals published at least 10 articles by Slutsky, and the five journals together accounted for 86 (63%) of Slutsky's 137 articles.

We assigned two unique control articles to each Slutsky article. We chose the two control articles at random from the set of all non-Slutsky articles that appeared in the same section of the same issue of the same journal as the Slutsky article. To select control articles, we assigned consecutive numbers to each candidate article and then selected two of the articles using a random-number generator.
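The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pick_controls`, the article identifiers, and the `seed` parameter are hypothetical, and the original study does not say which random-number generator was used.

```python
import random


def pick_controls(candidates, k=2, seed=None):
    """Pick k distinct control articles from one section of one journal issue.

    `candidates` is assumed to already contain only the non-Slutsky
    articles from the same section of the same issue (hypothetical input).
    """
    if len(candidates) < k:
        raise ValueError("not enough candidate articles in this section")
    rng = random.Random(seed)
    # Assign consecutive numbers (1, 2, 3, ...) to the candidate articles.
    numbered = dict(enumerate(candidates, start=1))
    # Draw k distinct numbers without replacement, then map back to articles.
    drawn = rng.sample(range(1, len(candidates) + 1), k)
    return [numbered[n] for n in drawn]
```

Passing a fixed `seed` makes the draw reproducible, which is a convenience of this sketch rather than something the article reports.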

Data Retrieval and Verification

We collected citation data for 86 Slutsky articles and 172 control articles (a total of 258 cited reference articles) using a database of citation information, the SCISEARCH database, which is produced by the Institute for Scientific Information (ISI), Philadelphia, Pa. We excluded self-citations (citing references that were written by one of the authors of the cited reference) and articles and editorials on the topic of fraud. The remaining citations reflected the use of the article by objective researchers in ongoing medical research.
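As a rough illustration of the exclusion rule, the sketch below filters a list of citing records. The record layout (an `authors` list and an `about_fraud` flag) is a hypothetical stand-in, not the SCISEARCH format, and matching authors by normalized name string is a simplifying assumption.

```python
def non_self_citations(cited_authors, citing_records):
    """Keep citing records that are neither self-citations nor fraud commentary.

    Each record is assumed to be a dict with an "authors" list and an
    optional boolean "about_fraud" flag (hypothetical layout).
    """
    cited = {name.strip().lower() for name in cited_authors}
    kept = []
    for record in citing_records:
        citing = {name.strip().lower() for name in record["authors"]}
        # A self-citation shares at least one author with the cited article.
        is_self_citation = not cited.isdisjoint(citing)
        if is_self_citation or record.get("about_fraud", False):
            continue
        kept.append(record)
    return kept
```

In practice author names would need more careful disambiguation than a lowercase string comparison; the sketch only shows the shape of the rule.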

Statistical Methods and Analysis

We compared Slutsky articles with control articles by plotting the average number of citations to both types of articles during the 11-year study period. To measure the statistical significance of the difference in the number of citations received by Slutsky and control articles, we transformed the citation data into scores. We assigned each Slutsky article a score between 1 and 3 based on the number of citations received by the Slutsky article and each of its two control articles. If the Slutsky article had the most citations, it received a score of 3. If the Slutsky article had more citations than one control article but fewer than the other, it received a score of 2. If the Slutsky article had the fewest citations, it received a score of 1. In the event of ties, we assigned scores of 1.5 or 2.5. If all three articles received zero citations, we dropped the data point from the analysis.
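The scoring rule can be expressed as a short function. This is a minimal sketch, not the authors' code: the name `slutsky_score` is hypothetical, and the handling of a three-way tie at a nonzero count (scored 2 here, the midrank) is an assumption, since the text specifies only the 1.5 and 2.5 tie scores.

```python
def slutsky_score(slutsky, control_a, control_b):
    """Score one Slutsky article against its two controls for one year.

    Arguments are citation counts. Returns a score from 1 to 3, or None
    when all three counts are zero (the data point is dropped).
    """
    if slutsky == 0 and control_a == 0 and control_b == 0:
        return None
    score = 1.0
    for control in (control_a, control_b):
        if slutsky > control:
            score += 1.0   # Slutsky article outranks this control
        elif slutsky == control:
            score += 0.5   # tie: split the rank (gives 1.5 or 2.5)
    # Assumption: a three-way nonzero tie yields 2.0, a case the text
    # does not describe.
    return score
```

For example, counts of 3, 3, and 1 give a score of 2.5, and counts of 3, 3, and 5 give 1.5.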

To test statistical significance, we relied on the expectation that if, on average, Slutsky articles received the same number of citations as control articles, then the distribution of scores (1, 2, 3) for Slutsky articles would be uniform and the expected average score would be 2. Furthermore, the normal approximation of this uniform distribution would be a mean of 2 and an SD of the square root of [0.6667/(n - 1)], where n would be the number of Slutsky articles included in the average score. Using the normal approximation, we calculated the probability that the actual score received by Slutsky articles was significantly different from the expected score.
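The test described above might be computed as follows. This is a sketch under assumptions: the function name `score_test` is hypothetical, the constant 0.6667 is taken to be the variance 2/3 of a uniform distribution on the scores 1, 2, and 3, and a two-sided P value is used because the text does not state whether the test was one-sided or two-sided.

```python
import math


def score_test(scores):
    """Compare the mean score for one year with the expected score of 2.

    Returns (mean, z, p), where p is a two-sided P value from the normal
    approximation (two-sidedness is an assumption of this sketch).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(scores) / n
    # SD of the mean score as given in the text: sqrt(0.6667 / (n - 1)),
    # with 0.6667 read as the variance 2/3 of a uniform score on {1, 2, 3}.
    sd = math.sqrt((2.0 / 3.0) / (n - 1))
    z = (mean - 2.0) / sd
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return mean, z, p
```

A mean score well below 2 with a small P value would correspond to the pattern the article reports for the years after 1985.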

Differences Between Valid and Nonvalid Slutsky Articles

By transforming citation data into scores, we were also able to detect differences between the performance of Slutsky's valid and nonvalid articles. Because the expected score for both valid and nonvalid articles is 2, the two sets of articles can be plotted on the same graph and compared. Thus, we first plotted valid articles against nonvalid articles for every year during the 11-year study period.

Twenty-eight of Slutsky's 37 nonvalid articles were retracted in print. To determine whether the appearance of a printed retraction helped to drive down citation rates, we plotted average score for these 28 articles against average score for the nine articles that were not retracted in print.

Nine of Slutsky's 37 nonvalid articles were retracted in the National Library of Medicine's MEDLINE database.[7] To determine whether the appearance of a retraction notice in MEDLINE helped to drive down citation rates, we plotted average score for these nine articles against the average score of the 28 nonvalid articles that were not retracted in MEDLINE.

RESULTS

Figure 1 summarizes the average citation performance of Slutsky articles relative to control articles during the entire 11-year study period. Before Slutsky's fraudulent activities were exposed (1979 to 1985), Slutsky articles and control articles received approximately the same number of citations. However, after Slutsky's misconduct was exposed (1986 to 1990), Slutsky articles received approximately one fewer citation per article per year than control articles.

Figure 2 shows the data in Figure 1 transformed into a score. The scores received by Slutsky articles declined below the expected score of 2 at about the same time news of Slutsky's misconduct began to appear in the lay and scientific press.

The Table confirms that before 1985, the scores received by Slutsky's articles were not significantly different from the expected score of 2. After 1985, the scores received by Slutsky articles were significantly lower than the expected scores.

Figure 3 divides Slutsky's articles into those that were judged by the UCSD committee to be valid and those that were judged to be nonvalid. It shows that after 1985, Slutsky's nonvalid articles received consistently lower scores than valid articles and that the widest divergence in scores occurred after 1987.

Figure 3 also shows that in 1990, the score of nonvalid articles increased. This increase does not mean that the number of citations of Slutsky articles increased. In fact, the number of citations to Slutsky articles decreased. The increase in the score of nonvalid articles occurred because the number of citations to control articles declined more broadly and steeply than the number of citations to Slutsky articles. In 1990, 42 (57%) of 74 control articles received fewer citations than in 1989. In contrast, only six (16%) of 37 Slutsky articles received fewer citations in 1990 compared with 1989. Nine (12%) of 74 control articles received more citations in 1990 than in 1989, and three (8%) of 37 Slutsky articles received more citations in 1990 than in 1989. For raw citations, the number of citations of control articles decreased by 0.91 citations per article, from 2.11 to 1.20 citations. The number of citations to Slutsky articles decreased by only 0.14 citations per article, from 0.30 to 0.16 citations. If we had collected additional years of citations, they would probably have shown a continued rise in the score as the number of citations to control articles continued to drop toward zero.

As described in the "Methods" section, we also plotted retracted articles (printed and MEDLINE retractions) against unretracted articles. We detected no significant difference in the scores received by retracted and unretracted articles.

COMMENT

The first step in our statistical analysis was to transform raw citation data into scores. The advantages of the scoring method are that it is simple to calculate, is easy to interpret, and reduces variability in the underlying raw data. The limits of the scoring method are that it does not report actual citation counts and it reduces the power of the statistical tests compared with alternatives such as logistic regression. Because we were able to achieve sufficient levels of statistical significance using the scoring method, we decided that it was unnecessary to use a more powerful but more complicated statistical approach.

It is our experience that readers with no more than the published article to go on cannot recognize fraudulent work that is done by a sophisticated researcher. Sloppy fabrications may be screened out by editorial and peer review, but clever fabrications cannot be detected, except in the institution where the research is performed or when attempts to replicate the results fail or when laboratory notes, patient charts, and calculations are thoroughly reviewed by a critical and objective third party.

Indeed, in the case of Slutsky, we show that scientists did not suspect that some of Slutsky's articles were fraudulent until articles about Slutsky's misconduct appeared in the lay press in 1985. Before 1985, Slutsky articles received approximately the same number of citations as did control articles. After 1985, Slutsky articles received fewer citations than did control articles and their scores were significantly below the expected score of 2.

We also show that the UCSD report may have been especially effective in helping scientists to identify and purge fraudulent articles. From 1985 to 1987, the scores of Slutsky's valid and nonvalid articles declined in tandem. However, after the UCSD report was released (1986) and published (1987), the scores of nonvalid articles declined relative to the scores of valid articles. The spread between the scores of nonvalid and valid articles was at its widest points in 1988 and 1989.

Finally, we could detect no correlation between the publication of retractions and citation rates.

CONCLUSIONS

By studying the pattern of citations of Slutsky's articles, we found that scientists could not detect fraudulent articles that were fabricated cleverly enough to pass editorial peer review. However, once Slutsky's misconduct was exposed, scientists reacted by citing his work less than they cited matched control articles. In the Slutsky case, general news articles and the review published by the UCSD committee most effectively decreased the number of citations of Slutsky articles.

Scientists clearly heeded the evaluations and warnings of the UCSD committee. The fact that the committee had been convened appeared in the lay press in 1985 and this immediately drove down the number of citations of all of Slutsky's articles. In 1986, the UCSD committee released its report, which set off a second round of reduction in citations, especially of articles that were nonvalid. Our results suggest that academic institutions can play a key role in purging fraudulent research from the scientific literature.


From the Department of Information Analysis, American Medical Association (Mr Whitely and Dr Hafner), and JAMA (Dr Rennie), Chicago, Ill. Mr Whitely is now with Price Waterhouse, San Francisco, Calif, and Dr Hafner is a consultant in Glenview, Ill.

Presented in part at the Second International Congress on Peer Review in Biomedical Publication, Chicago, Ill, September 11, 1993.

We express our appreciation to Edgar J. Asebey for his participation in an earlier study that led to this research.

Reprint requests to Price Waterhouse, 555 California St, San Francisco, CA 94104 (Mr Whitely).


References

1. Pfeifer MP, Snodgrass GL. The continued use of retracted, invalid scientific literature. JAMA. 1990;263:1420-1423.

2. Garfield E, Welljams-Dorof A. The impact of fraudulent research on the scientific literature: the Stephen E. Breuning case. JAMA. 1990;263:1424-1426.

3. Engler RL, Covell JW, Friedman PJ, Kitcher PS, Peters RM. Misrepresentation and responsibility in medical research. N Engl J Med. 1987;317:1383-1389.

4. Locke R. Another damned by publications. Nature. 1986;324:401.

5. Greenberg DS. Massive research fraud uncovered at UC San Diego. Sci Government Rep. 1986;16:4-5.

6. Marshall E. San Diego's tough stand on research fraud. Science. 1986;234:534-535.

7. Friedman PJ. Correcting the literature following fraudulent publication. JAMA. 1990;263:1416-1419.
