Open Access

How the internet can make knowledge disappear and 2 ways to stop it

Knowledge is vulnerable. It’s hard to come by, and far too often it disappears. Every researcher on the planet surely has some result lying in a drawer, unshared and largely unknown. Maybe it’s not up to their standards yet, or perhaps it’s a negative result. More likely, they tried to publish it, but the process became too arduous.

The results are there. They are knowledge. And when they never get shared, they eventually disappear.

The internet offers some help. It makes it easier to preserve knowledge. There are new ways to communicate and document research results. It has also become easier to store and share data.

Risking the loss of knowledge

Some changes brought about by the internet initially appear superficial. For example, we are in the midst of a transition away from publishing articles on paper towards publishing online.

Ironically, one important consequence of the shift to digital publishing is that it leads to a potential loss of knowledge. This arises from an ambiguity in the way digital journals are sold and preserved. What exactly is a library buying from a publisher? Is it a product, as with paper journals or books? Or, is it a service, such as access to a database?

These questions become especially important when a library discontinues a subscription. If a digital journal is a product, the purchased issues should still be available. If, on the other hand, the library has purchased access to a database for a particular amount of time, then it likely won’t have access to anything when it stops paying, not even to the articles its users could read while it was paying.

In Norway, some of the purchasing from publishers is coordinated at the national level. Because I head the board of the coordinating organization, I know that various publishers view this issue differently. We’re far from having achieved a standard solution.

The influential Finch Report in the United Kingdom put it like this last year:

The role of research libraries in ensuring the long-term preservation of print does not readily transfer to digital content. We are still some way from robust arrangements for the long term preservation of digital journals so that they remain accessible for future generations.

The role of libraries

We might think that ongoing access to research articles could be preserved if libraries simply archived all the articles and journals that they subscribe to digitally, just like they archive earlier issues of paper journals.

Publishers generally don’t allow this, but even if they did, the quantity is unmanageable. The largest publishers no longer sell subscriptions to specific journals; instead, they make very large packages that the libraries must purchase, even if they only want a few of the journals in that package.

A consequence of buying in bundles, however, is that libraries end up with access to tens of thousands of articles, many of which are of no interest to their researchers. So even if they could, they may not want to use resources archiving that material.

Combining the various legal restrictions — which have different details at different publishers — with the technical and practical challenges, the preservation of scientific articles and the knowledge they contain has become a precarious enterprise.

Responsibility for archiving has been subtly transferred from libraries to publishers. This makes knowledge vulnerable. What happens when a publisher goes bankrupt? What happens if their systems break down? What happens if human error leads to large-scale deletion? What if they can’t keep up with technical advancements? What if they decide to raise their fees?

Open access is part — but only part — of the solution

It is easy to imagine scenarios whereby an archive of knowledge that is primarily maintained in one place gets damaged or lost or somehow becomes inaccessible. When that happens, knowledge is lost.

Fortunately, there are at least two different steps that can radically reduce the chances of such problems.

First of all, there’s open access. When publishers implement open access strategies — and when researchers make use of them — articles become freely available and they can be archived locally.

But this isn’t enough. Even with open access, there are practical challenges. University archives must decide on a strategy and find technical solutions. Will they build an archive only of work produced by their own employees, or will they build an archive of all open access articles they subscribe to?

Another challenge with archiving open access articles comes from different approaches to open access. The so-called gold open access model is one in which the published versions of articles are made freely available to everyone by the publisher.

Green open access

The green open access model is both more common and more complicated. In this approach, publishers allow authors to place in open archives a non-final version of their article. One problem with this approach is that two different versions of the article are then being used: the published version and the archived pre-publication version.

Another problem is variation in restrictions on archiving. Sometimes, the regulations that publishers impose become absurd. Consider, for example, what I like to call Elsevier’s anti-policy policy. Elsevier allows individual researchers to place a non-final version of their article in an institutional archive, e.g. at their home university.

They allow this, that is, unless the researcher’s home university has a policy requiring research results to be placed in an archive — which many do, to encourage increased public access to the results of publicly financed research. If you’re publishing in one of Elsevier’s journals, you can post a non-final version of your paper in an archive … unless you have to. In that case, you can’t.

The green open access model is not a good long-term approach. We need gold open access, and we need big publishers to make the switch.

Massive independent archives

The second way to counter a potential loss of knowledge with digital publishing comes from creative new approaches to large-scale archiving. Libraries, publishers, and an independent archiving organization form a coalition to create archives of the publishers’ journals. Two prominent organizations doing this work are Portico and LOCKSS. They have different strategies, but both work to reduce the vulnerability of digital materials.

It’s hard to know how much of the material being published today is covered by Portico and LOCKSS. A few years ago, Rutgers University estimated that less than half the material they subscribed to was covered by one or the other. LOCKSS has about 500 publishers participating, while Portico has a little over 200. That probably covers a lot of what is produced, but it’s hard to know — or even guess — how many of the more than 50 million scientific articles that have been published are included here.

Archiving, too, has its challenges. Publishers have different ideas about the conditions under which they should join. And changing technologies introduce vulnerabilities, too; for LOCKSS and Portico to succeed, they have to have advanced technical skills and solutions.

Knowledge is vulnerable

The mere existence of the internet is not enough to prevent knowledge from disappearing. In fact, the internet introduces some new challenges for the preservation of research results. There are solutions, but if they’re to work, we must engage with them and invest in them.

That’s worth doing, I believe, because research can lead us to a better future. But only if we hang onto it.

My interest in moving universities towards balance encompasses gender equality, the communication of scientific results, promoting research-based education and leadership development more generally.


I encourage you to republish this article online and in print, under the following conditions.

  • You have to credit the author.
  • If you’re republishing online, you must use our page view counter and link to its appearance here (included in the bottom of the HTML code), and include links from the story. In short, this means you should grab the html code below the post and use all of it.
  • Unless otherwise noted, all my pieces here have a Creative Commons Attribution license (CC BY 4.0), and you must follow the (extremely minimal) conditions of that license.
  • Keeping all this in mind, please take this work and spread it wherever it suits you to do so!