The big buzz in the genomics twittersphere today is the release of over 30 publications on the human ENCODE project. This is a heroic achievement, both in terms of science and publishing, with many groundbreaking discoveries in biology and pioneering developments in publishing to be found in this set of papers. It is a triumph that all of these papers are freely available to read, and much is being said elsewhere in the blogosphere about the virtues of this project and the lessons learned from the publication of these data. I’d like to pick up here on an important point made by Daniel MacArthur in his post about the delays in the publication of these landmark papers that have arisen from the common practice of embargoing papers in genomics. To be clear, I am not talking about embargoing the use of data (which is also problematic), but embargoing the release of manuscripts that have been accepted for publication after peer review.
"Many of us in the genomics community were aware of the progress the [ENCODE] project had been making via conference presentations and hallway conversations with participants. However, many other researchers who might have benefited from early access to the ENCODE data simply weren’t aware of its existence until today’s dramatic announcement – and as a result, these people are 6-12 months behind in their analyses."
It is important to emphasize that these publication delays are by design, and are driven primarily by the journals that set the publication schedules for major genomics papers. As part of the Drosophila 12 Genomes Project, I saw first-hand how Nature sets the agenda for major genomics papers and their associated companion papers. This insider’s view left a distinctly bad taste in my mouth about how much control a single journal has over some of the most important community resource papers that are published in biology. To give more people insight into this process, I am posting the agenda set by Nature for publication of the main Drosophila 12 Genomes paper, which (in reverse chronological order) went something like this:
7 Nov 2007: papers are published, embargo lifted on main/companion papers
28 Sept 2007: papers must be in production
21 Sept 2007: revised versions of papers received
17 Aug 2007: reviews are returned to authors
27 Jul 2007: papers are submitted
Not only was acceptance of the manuscript essentially assumed by the Nature editorial staff, but the entire timeline was also spelled out in advance, with an embargo built into the process from the outset. Seeing this process unfold first-hand was shocking to me, and has made me very skeptical of the power that the major journals have to dictate terms about how we, and other journals, publish our work.
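For a rough sense of scale, the gaps between the milestones in the schedule above can be tallied with a few lines of Python. This is only a sketch: the dates are the ones listed above, while the variable and function names are my own and purely illustrative.

```python
from datetime import date

# Milestones from the schedule above, in chronological order.
milestones = [
    ("papers are submitted", date(2007, 7, 27)),
    ("reviews are returned to authors", date(2007, 8, 17)),
    ("revised versions of papers received", date(2007, 9, 21)),
    ("papers must be in production", date(2007, 9, 28)),
    ("papers are published, embargo lifted", date(2007, 11, 7)),
]

def intervals(steps):
    """Return (from_label, to_label, days) for each consecutive pair of milestones."""
    return [
        (a_label, b_label, (b_date - a_date).days)
        for (a_label, a_date), (b_label, b_date) in zip(steps, steps[1:])
    ]

for start, end, days in intervals(milestones):
    print(f"{days:3d} days: {start} -> {end}")

# Total elapsed time from submission to publication.
total_days = (milestones[-1][1] - milestones[0][1]).days
print(f"{total_days:3d} days: submission -> publication")
```

On these dates, the final step from the production deadline to publication accounts for 40 of the 103 days between submission and publication.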
Personally, I cannot see how this embargo system serves anyone in science other than the major journals. There is no valid scientific reason that major genome papers and their companions cannot be made available as online accepted preprints, as is now standard practice in the publishing industry. As scientists, we have a duty to ensure that the science we produce is released to the general public and community of scientists as rapidly and openly as possible. We do not have a duty to serve the agenda of a journal to increase their cachet or revenue stream. I am aware that we need to accept delays due to quality control via the peer review and publication process. But the delays due to the normal peer review process are bad enough, as ably discussed recently by Leslie Vosshall. Why on earth would we accept that journals build further unnecessary delays into the publication process?
This of course leads to the pertinent question: how harmful is this system of embargoes? Well, we can put an upper estimate on* this pretty easily from the submission/acceptance dates of the main and companion ENCODE papers (see table below). In general, most ENCODE papers were embargoed for a minimum of 2 months, but some were embargoed for nearly 7 months. Ignoring (unfairly) the direct impact that these delays may have on the careers of the PhD students and post-docs involved, something on the order of 112 months of access to these important papers has been lost to all scientists by this single embargo. Put another way, up to* 10 years of access time to these papers has been collectively lost to science because of the ENCODE embargo. To the extent that these papers are crucial for understanding the human genome, and the consequences this knowledge has for human health, this decade lost to humanity is clearly unacceptable. Let us hope that the ENCODE project puts an end to the era of journal-mandated embargoes in genomics.
| DOI | Date Received | Date Accepted | Date published | Months in review | Months in embargo |
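The estimate above is simple date arithmetic: for each paper, the time in review is the gap between the received and accepted dates, the time in embargo is the gap between the accepted and published dates, and the total is the sum of the per-paper embargo times. Here is a minimal sketch of that calculation in Python. The DOIs and dates in it are made-up placeholders, not the actual ENCODE values, and the conversion from days to months (an average month of 365.25 / 12 days) is my own assumption.

```python
from datetime import date

# Assumption: convert days to months using the average month length.
DAYS_PER_MONTH = 365.25 / 12

def months_between(start: date, end: date) -> float:
    """Elapsed time between two dates, in (average-length) months."""
    return (end - start).days / DAYS_PER_MONTH

# Hypothetical placeholder records; these are NOT the real ENCODE papers or dates.
papers = [
    {"doi": "10.0000/example.1", "received": date(2012, 1, 10),
     "accepted": date(2012, 3, 1), "published": date(2012, 9, 1)},
    {"doi": "10.0000/example.2", "received": date(2012, 2, 15),
     "accepted": date(2012, 6, 20), "published": date(2012, 9, 1)},
]

def embargo_summary(records):
    """Return per-paper review/embargo times and the summed embargo time in months."""
    rows = [
        {
            "doi": r["doi"],
            "months_in_review": months_between(r["received"], r["accepted"]),
            "months_in_embargo": months_between(r["accepted"], r["published"]),
        }
        for r in records
    ]
    total = sum(row["months_in_embargo"] for row in rows)
    return rows, total

rows, total_months = embargo_summary(papers)
for row in rows:
    print(f'{row["doi"]}: review {row["months_in_review"]:.1f} mo, '
          f'embargo {row["months_in_embargo"]:.1f} mo')
print(f"total embargo: {total_months:.1f} months ({total_months / 12:.1f} years)")
```

Applying the same conversion to the 112 months quoted above gives 112 / 12 ≈ 9.3 years, which is where the figure of up to 10 years comes from. Treating the whole accepted-to-published gap as embargo is what makes this an upper bound, since some of that time would presumably be needed for production anyway.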
* Based on a conversation on Twitter with Chris Cole, I’ve revised this estimate to reflect an upper bound, rather than a point estimate, of the time lost to science.