Traditional scientific communication directly threatens the quality of scientific research. Today’s system is unreliable — or worse! Our system of scholarly publishing reliably gives the highest status to research that is most likely to be wrong.
This system determines the trajectory of scientific careers. The longer we stick with it, the worse it is likely to become.
These claims and the problems described below are grounded in research recently presented by Björn Brembs and Marcus Munafò in Deep Impact: Unintended consequences of journal rank.
Retraction is one possible response to discovering that something is wrong with a published scientific article. When it works well, journals publish a retraction statement identifying the reason for the retraction.
Retraction rates have increased tenfold in the past decade, after many years of stability, and a new paper in the Proceedings of the National Academy of Sciences demonstrates that two-thirds of all retractions follow from scientific misconduct: fraud, duplicate publication and plagiarism (Ferric C. Fang, R. Grant Steen & Arturo Casadevall: Misconduct accounts for the majority of retracted scientific publications).
Even more disturbing is the finding that the most prestigious journals have the highest rates of retraction, and that fraud and misconduct are greater sources of retraction in these journals than in less prestigious ones.
Among articles that are not retracted, there is evidence that the most visible journals publish less reliable (i.e., less replicable) research results than lower-ranking journals. This may be due to a preference among prestigious journals for more spectacular or novel findings, a phenomenon known as publication bias (e.g. P.J. Easterbrook, R. Gopalan, J.A. Berlin and D.R. Matthews, Publication bias in clinical research, The Lancet). Publication bias, in turn, is a direct cause of the decline effect.
The decline effect
One cornerstone of the quality control system in science is replicability; research results should be so carefully described that they can be obtained by others who follow the same procedure. Yet journals are generally not interested in publishing mere replications. This gives replication low status as a quality control measure, regardless of how important it is, for example in studying potential new medicines.
When studies are reproduced, the resulting evidence is often weaker than in the original study. Indeed, Brembs and Munafò review research leading them to claim that “the strength of evidence for a particular finding often declines over time.”
In a fascinating piece entitled The truth wears off, the New Yorker offers the following interpretation of the decline effect:
“The most likely explanation for the decline is an obvious one: regression to the mean. As the experiment is repeated, that is, an early statistical fluke gets cancelled out.”
Yet it is exactly the spectacular nature of statistical flukes that increases the odds of getting published in a high-prestige journal.
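For readers who like to see the mechanism at work, here is a minimal simulation sketch of how selecting for spectacular results produces a decline on replication. Everything in it is an assumption made for illustration: the true effect size, the noise level, the publication threshold, and the function name `simulate_decline` are invented here and do not come from Brembs and Munafò or from the New Yorker piece.

```python
import random
import statistics


def simulate_decline(n_studies=20_000, true_effect=0.2, noise_sd=0.5,
                     publish_threshold=1.0, seed=0):
    """Toy model of publication bias followed by replication.

    Every study measures the same true effect plus random noise.
    Only studies whose observed effect clears `publish_threshold`
    are "published"; each published study is then replicated once.
    All parameter values are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    published, replications = [], []
    for _ in range(n_studies):
        observed = rng.gauss(true_effect, noise_sd)
        if observed >= publish_threshold:
            # The journal only accepts the spectacular result.
            published.append(observed)
            # The replication is a fresh, unselected measurement.
            replications.append(rng.gauss(true_effect, noise_sd))
    return statistics.mean(published), statistics.mean(replications)


if __name__ == "__main__":
    original_mean, replication_mean = simulate_decline()
    print(f"mean published effect:   {original_mean:.2f}")
    print(f"mean replication effect: {replication_mean:.2f}")
```

In this toy model nothing about the underlying effect changes between the original study and the replication. The published effects are large only because they were selected for being large, and the replications drift back toward the true value.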
The politics of prestige
One approach to measuring the importance of a journal is to count how many times scientists cite its articles; this strategy has been formalized as the impact factor. Publishing in journals with high impact factors feeds job offers, grants, awards, and promotions. A high impact factor also enhances the popularity (and profitability) of a journal, so journal editors and publishers work hard to increase it, primarily by trying to publish what they believe will be the most important papers.
However, impact factor can also be illegitimately manipulated. For example, the calculation divides the number of citations a journal receives in a given year to items it published in the two preceding years by the number of articles it published in those two years. But what is an article? Do editorials count? What about reviews, replies or comments?
By negotiating to exclude some pieces from the denominator in this calculation, publishers can increase the impact factor of their journals. In The impact factor game, the editors of PLoS Medicine describe the negotiations determining their impact factor. An impact factor in the 30s is extremely high, while most journals are under 1. The PLoS Medicine negotiations considered candidate impact factors ranging from 4 to 11. This process led the editors to “conclude that science is currently rated by a process that is itself unscientific, subjective, and secretive.”
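To make the arithmetic concrete, here is a small sketch of how the choice of denominator moves the result. The counts are invented for illustration and are not PLoS Medicine's actual figures; the names `JournalCounts` and `impact_factor` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class JournalCounts:
    """Hypothetical counts for one journal over the counting window."""
    citations: int          # citations received to items from the window
    research_articles: int  # items everyone agrees are "articles"
    other_items: int        # editorials, replies, comments, and the like


def impact_factor(counts: JournalCounts, count_other_items: bool) -> float:
    """Citations divided by the number of items deemed to be articles."""
    denominator = counts.research_articles
    if count_other_items:
        denominator += counts.other_items
    if denominator <= 0:
        raise ValueError("at least one item must be counted as an article")
    return counts.citations / denominator


if __name__ == "__main__":
    # Invented numbers: same journal, same citations, two denominators.
    journal = JournalCounts(citations=1000, research_articles=100,
                            other_items=150)
    print(impact_factor(journal, count_other_items=True))   # 1000 / 250
    print(impact_factor(journal, count_other_items=False))  # 1000 / 100
```

The citations in the numerator stay the same in both calls. Only the decision about what counts as an article changes, and under these made-up counts that decision alone more than doubles the result.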
A more cynical strategy for raising impact factor is for editors to ask authors to cite more articles from their journals, as I described in How journals manipulate the importance of research and one way to fix it.
A crisis for science
The problems discussed here are a crisis for science and the institutions that fund and carry out research. We have a system for communicating results in which the need for retraction is exploding, the replicability of research is diminishing, and the most standard measure of journal quality is becoming a farce.
Ranking journals is at the heart of all three of these problems. For this reason, Brembs and Munafò conclude that the system is so broken it should be abandoned.
Getting past this crisis will require both systemic and cultural changes. Citations of individual articles can be a good indicator of quality, but the excellence of individual articles does not correlate with the impact factor of the journals in which they are published. Once we have convinced ourselves of that, we must face the consequences for the evaluation processes that are essential to building careers in science, and we must push forward nascent alternatives such as Google Scholar and others.
Politicians have a legitimate need to impose accountability, and while the ease of counting — something, anything — makes it tempting for them to infer quality from quantity, it doesn’t take much reflection to realize that this is a stillborn strategy.
As long as we believe that research represents one of the few true hopes for moving society forward, then we have to face this crisis. It will be challenging, but there is no other choice.
What’s your take on publishing in your field? Are these issues relevant? Are you concerned? If so, I invite you to leave a comment below, or to help keep the discussion going by posting this on Facebook, Twitter, or your favorite social medium.
For a little more on Brembs and Munafò’s article, see a brief note by Brembs on the London School of Economics’ Impact of Social Sciences blog and the discussion at Physics Today’s blog. And don’t forget to follow Retraction Watch.
Earlier posts on this blog discussing the need to change our publishing model include:
- Open evaluation: 11 sure steps — and 2 maybes — towards a new approach to peer review
- How whale hunting can improve scientific publishing
- New approaches to quality control in publishing
Finally, a disclosure: The New Yorker article cited above was written by Jonah Lehrer, one of the subjects of my piece: Whaddaya mean plagiarism? I wrote it myself! How open access can eliminate self-plagiarism.
This post subsequently appeared at The Guardian as Science research: 3 problems that point to a communications problem.