Open Evaluation will improve science. Researchers constantly evaluate each other — when we submit our results for publication, when we apply for grants, and when we apply for new jobs or promotions. Peer evaluation is our quality assurance strategy. And it needs to be better.
Open access provides a context to radically reform scientific publishing. The way we evaluate scientific papers must be part of this. Our current system lacks sufficient quality and it lacks transparency.
One creative approach to reconceptualizing evaluation comes from Nikolaus Kriegeskorte, Alexander Walther and Diana Deca. These three scholars invited eighteen papers by authors ready to look beyond open access, and they summarized the nascent suggestions in their recent paper, An emerging consensus for open evaluation: 18 visions for the future of scientific publishing, appearing in Frontiers in Computational Neuroscience.
The importance of the Open Evaluation (OE) project is characterized in that article as follows.
Evaluation is at the heart of the entire endeavor of science. As the number of scientific publications explodes, evaluation, and selection will only gain importance. A grand challenge of our time is to design the future system, by which we evaluate papers and decide which ones deserve broad attention and deep reading.
The eighteen papers they solicited are boiled down to thirteen suggested features for OE. Eleven of the thirteen suggestions were “overwhelmingly endorsed” by the authors of the independent papers. Two of the suggestions were supported by some but doubted by others. Together, these give us a glimpse of a better system that is just around the corner.
13 measures for changing evaluation
The boldface phrases below are the words used by Kriegeskorte, Walther and Deca to describe the thirteen measures.
Traditionally, scientific articles are published but reviews are not. When reviews are published alongside articles, we move towards a situation in which the evaluation process is totally transparent. It is increasingly difficult for scientists to quickly identify the most important papers for their work; journals can assist them when evaluations are used to produce paper priority scores. Indeed, the needs of scientists differ and it should therefore be possible that anyone can define a formula for prioritizing papers.
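To make the idea of a user-defined prioritizing formula concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Paper` fields, the 1 to 5 rating scale, and the two example formulas are hypothetical and do not come from Kriegeskorte, Walther and Deca's paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Paper:
    # Hypothetical record of the open evaluations attached to one paper.
    title: str
    ratings: List[float]  # assumed numerical ratings on a 1-5 scale
    citations: int        # assumed citation count, updated over time


def mean_rating(paper: Paper) -> float:
    """Average reviewer rating, or 0.0 if the paper has no ratings yet."""
    if not paper.ratings:
        return 0.0
    return sum(paper.ratings) / len(paper.ratings)


def rank_papers(papers: List[Paper], formula: Callable[[Paper], float]) -> List[Paper]:
    """Sort papers from highest to lowest priority under a reader's own formula."""
    return sorted(papers, key=formula, reverse=True)


# Two illustrative formulas that different readers might define.
def ratings_only(paper: Paper) -> float:
    return mean_rating(paper)


def citations_first(paper: Paper) -> float:
    return paper.citations + 0.1 * mean_rating(paper)


papers = [
    Paper("A", ratings=[5.0, 4.0], citations=2),
    Paper("B", ratings=[3.0, 3.0], citations=40),
]

by_ratings = [p.title for p in rank_papers(papers, ratings_only)]
by_citations = [p.title for p in rank_papers(papers, citations_first)]
```

The same set of public evaluations produces a different reading list depending on the formula: a reader who trusts reviewer ratings sees paper A first, while a reader who trusts citations sees paper B first.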
The two measures lacking consensus reflect holdovers from the traditional approach to scholarly publishing. One reveals itself when we ask should evaluation begin with a closed, pre-publication stage. Without this, the websites of journals may be flooded with weak work. On the other hand, the sorting and prioritizing schemes could easily move us past this concern, and the loss represented by errors in the initial screening process could be significant. A milder way to address the same issue asks should the open evaluation begin with a distinct stage, in which the paper is not yet considered approved.
Connecting information about articles to various metrics on the web will supplement traditional approaches to evaluation. In a rich OE system, evaluations include written reviews, numerical ratings, usage statistics, social web information and citations. Some of these components are immediately available while others require subsequent updating, e.g., as citations appear.
The anonymity of reviews may reduce hesitation and thereby keep participation levels in the reviewing process high. On the other hand, signed reviews can help build a career when scientists make the effort to do high-quality work in this domain as well. When both approaches are found, the system utilizes signed (along with unsigned) evaluations.
Both anonymous and signed reviews can be done in a way such that evaluators’ identities are authenticated. Authentication provides a strategy for relating multiple reviews or for limiting the reviewing process to people with certain qualifications, e.g., research experience or affiliation with a research institution.
As we work towards improving the quality of evaluation, it may become possible to review individual reviewers and the reviews they write. If reviews and ratings are meta-evaluated, the quality of papers can be partially determined by the quality of the reviews. Hence, we imagine a system in which participating scientists are evaluated in terms of scientific or reviewing performance in order to weight paper evaluations. This could build on the meta-evaluation system — bringing together multiple reviews written by the same scientist — but it could also be built on their independently established status in their field.
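One way to picture how reviewer standing could weight paper evaluations is a simple weighted average. This is only a sketch under stated assumptions: the function name, the rating scale, and the idea of a single numeric weight per reviewer are hypothetical, and the source paper does not prescribe any particular formula.

```python
from typing import List, Tuple


def weighted_paper_score(evaluations: List[Tuple[float, float]]) -> float:
    """Combine ratings into one score, weighting each by its reviewer's standing.

    Each evaluation is a (rating, reviewer_weight) pair. The weight is an
    assumed number derived from meta-evaluation of the reviewer's past reviews
    or from their independently established status in the field.
    """
    total_weight = sum(weight for _, weight in evaluations)
    if total_weight == 0:
        raise ValueError("at least one evaluation needs a non-zero weight")
    return sum(rating * weight for rating, weight in evaluations) / total_weight


# A high rating from a low-weight reviewer and a low rating from a
# high-weight reviewer (hypothetical numbers).
evaluations = [(5.0, 1.0), (2.0, 3.0)]

unweighted = sum(rating for rating, _ in evaluations) / len(evaluations)
weighted = weighted_paper_score(evaluations)
```

With these numbers the plain average is 3.5, while the weighted score is 2.75, because the more trusted reviewer's lower rating counts three times as much.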
The fluidity of digital publishing and the ease of adding content at any time make it possible to imagine a system in which evaluation is perpetually ongoing. Our understanding of the results in a paper can change over time, either enhancing or downplaying its importance. Open Evaluation will give these changes a role, independent of when they emerge.
Many of the changes proposed here can be made more precise when formal statistical inference is a key component of evaluation. A major advantage of all of these proposals is that the new system can evolve from the present one, requiring no sudden revolutionary change. Journals are free to begin implementing new approaches to reviewing scientific papers, although overall confidence in the system of scientific publication will undoubtedly be enhanced if a generalized approach to Open Evaluation can be agreed upon.
What do you think about Open Evaluation? Do the 11 sure measures sound right to you? What about the two in which there was less agreement? Do you think OE in general is a good idea? How is your vision like or unlike the one presented here? How do you see the connection between Open Access and Open Evaluation? Leave a comment or share this article and we can find the foundation for a better system!
Regular readers of this blog know that there are many entries on open access here. For those of you who are new, let me help you choose by pointing out that the three most popular ones are: