Friday, 21 December 2007

The science and the say-so

Hmmm. In a new blog posting pleading for a more scientific approach to studies on science publishing (subtext: Open Access proponents cook the books), Joe Esposito takes issue with a number of studies on Open Access on the grounds that they are not what he terms ‘scientific’. He also suggests that I am ‘behind’ these things, thus according me far more credit than I warrant, but since I am mentioned (twice, indeed) and a lot of people seem to be somewhat confused about the issues Mr Esposito raises, I thought it might help to clarify the situation.

The first contention is that OA proponents imply that librarians are stupid. The reason we must think them stupid, apparently, is that we say they won’t necessarily cancel subscriptions to journals whose contents can be obtained for free in Open Access form on the Web. But that misses the point: we say that cancellations won’t necessarily occur because that is what we observe, in real life. It is true that it is somewhat perplexing and seems to fly in the face of logic. Why would you, in these days of straitened circumstances for libraries, continue to pay for a journal whose articles are available for nothing?

To try to understand the way such buy/renew/cancel decisions are made, we have consulted a number of physics librarians on the matter in depth. Why physics librarians? Because they have a particular story to tell: one about how a whole host of journals in certain fields of physics have had their contents duplicated for free on the physics arXiv database for 15 years now, and yet the librarians haven’t cancelled their subscriptions to those journals. This is the one experiment that has actually been conducted so far – by the community itself, by way of everyday practice – on the effect of Open Access on journal cancellations. (Oh, and just to forestall the “Ah, but arXiv only contains preprints” chorus, the data show that more than half of the articles in arXiv are postprints, i.e. the peer-reviewed version.) So, if all the articles are freely available elsewhere, why don’t physics librarians cancel their subscriptions? It certainly does seem very odd if put in those terms.

But, of course, those terms belie the complexity of the situation. There are a number of straightforward reasons, listed here and here, for preferring to continue to subscribe to journals – journals are more than just articles and contain other types of content that people want to read; they contain the final polished-up versions of articles whereas OA versions are simply the author’s final product; there is no guarantee that every article from a journal will be made OA by its author. Some of these may not hold up forever. We will start to see which journals have true added value – that is, something that customers will pay for – and which are just a collection of articles: the marketplace will reveal that. There are also other reasons, ones not so straightforward and certainly not so easy to describe. They are to do with allegiances to certain publishers, particularly specific society publishers who are viewed as ‘the good guys’ and thus worthy of loyalty; they are to do, partly, with the sorts of deals that publishers are prepared to offer in every individual case; and then they are very much to do with the views of faculty, without which no librarian makes a final decision on what to cut and what to reprieve. And faculty have very strong views on these things, not all of them based on logic or evidence. Even high-energy physicists have feelings.

The second point at issue is about a statistic. I was responsible for collecting the data behind this statistic and as it has generated quite a bit of comment over the years it seems appropriate to tell its story. We were commissioned by the JISC in 2003 to carry out a study on the attitudes and experiences of authors who had published in Open Access journals. We developed a list of questions for a survey, asking for comments from the project sponsor as usual but also, in our nice ecumenical way, going to key stakeholders and asking them for input too. One of them came up with a question that was not the kind of thing we normally ask, since it was in the form of ‘What would you do if…?’ Now, we usually avoid this kind of question in our surveys because we prefer to stick with questions about actual behaviour or experience, or actual preference or opinion. So, ‘Please indicate which of the following you have done’ is fine; so is ‘Please say how important each of the following things are to you?’ But that is usually as far as we go with the attitudinal stuff, since anything more complex needs to be supported by very cleverly-designed additional questions to ensure that you can interpret accurately what an apparent attitude means. I can’t remember ever asking any other questions of the form ‘What would you do if…?’ so this instance was almost certainly the only one in our history of client-commissioned surveys. Nonetheless, this stakeholder wanted it there and so we included it. The stakeholder was a publisher.

The question was ‘What would be your response if your employer or research funder required you to make your work Open Access?’, and respondents were offered three options:
· I would comply willingly
· I would comply reluctantly
· I would not comply

The result was that 81% of respondents agreed with the first statement – they would comply willingly. 14% agreed with the second, and 5% agreed with the third.

Back to the present, and Mr Esposito’s argument. Rule number one for scientists: if you are going to consult the work of others and use it in your discussions, be accurate about what it shows. Unfortunately, and most unscientifically for someone arguing for a more scientific approach to studies on publishing, he reports my finding incorrectly, thus founding his argument about the significance of the finding on a bit of say-so. He says: “it was found that 81% of researchers say that they would comply with mandates. Now, what does this prove exactly? More than 81% of Americans comply for the most part with the U.S. Tax Code, but that is hardly indicative of support for the current administration or the way tax monies are spent. What it does reveal is a healthy respect for the punitive powers of The Man. In OA circles, however, a forecast compliance with a mandate is viewed as the equivalent of democratic support”.

Bong! What was actually found was that 81% of researchers say they would comply willingly with mandates. And that a further 14% would comply reluctantly. By my reckoning, that’s 95% who would comply with mandates, not 81%. And 81% would do it willingly. Willingly. Probably 95% of citizens do fill in their tax forms but I would doubt that 81% of them do it willingly.

But he’s right about the “So what?” This is just a datum point. Where’s the hypothesis; where’s the testing? Well, the hypothesis that derives from that datum point is, of course, ‘Where there are mandates, 95% of researchers would comply with them (81% willingly)’. And the testing? Carried out by Arthur Sale, who measured the amount of material being deposited in various Australian university repositories under different conditions of policy. And guess what it shows? Yes, that researchers under a mandate comply in the very ways predicted. I understand that another, larger-scale, exercise to measure the same thing over university repositories around the world is now underway, so we await the results to see what further there is to learn about this.

Now to the third point. Here it is: “A more complicated item, and one that is more susceptible to reasoned argument, is what is called the Open Access Advantage. No, this is not a frequent flier program but the notion that authors who work in OA formats are more likely to be cited than authors who work in proprietary or “toll-access” media. Superficially, this may appear to make sense; after all, if everyone can read an OA article, surely it has a better chance of getting cited than an article that has more limited distribution by virtue of the constraints imposed by subscription barriers. On the other hand, an article in the toll-access Lancet is much more likely to be cited than an article deposited in a no-name repository, with only Google keyword searching enabling the poor, already overburdened reader. Once again we find Alma Swan behind this [sic - AS]. The problem with the alleged Open Access Advantage is, first, it entirely ignores the overall marketing context of any particular work. The fact is that some OA venues are brilliantly marketed; I would point to the Public Library of Science in particular. But marketing is not a constant; it varies journal by journal, issue by issue, and article by article. Swan’s analysis does not take these variables into account.”

Oh dear. What a mix-up. Rule number two for scientists: make sure you understand the methodology before jumping to conclusions. My own PhD supervisor’s words ring in my ears again now as I think about this. “If you can’t replicate X’s experiment, try again, and then again. If it really can’t be replicated, then that fact should be reported, but it is most likely that the devil is in the detail. Check every aspect of their methodology and make sure your experimental conditions are exactly the same.” Every science student has the same mantra dinned into them. Is there a difference that could be material to my study? If so, what does it explain? From such contemplations, indeed, ‘eureka moments’ may arise. Hence the heavy focus on the Materials and Methods sections of scientific papers. Without the most careful examination of how an experiment was conducted, no scientific judgment can be arrived at as to the validity of the conclusions and the contribution to the field of a piece of work.

Unfortunately, Mr Esposito comes to his own conclusions about the Open Access Advantage without seemingly having read the studies that demonstrate it. He also appears misinformed about the authorship of studies in this area, by the way. I am flattered by the attention and attribution, but none of the studies were my work. Anyway, his thesis seems to be that the OA Advantage – the increase in citations that OA articles in general enjoy over those that are not Open Access – is all to do with which journal they are published in, and the marketing success thereof.

Another bit of say-so, I’m afraid. I am not aware of any studies that have been guilty of such sloppy design, and would be very surprised if anyone could point me at one that is. There have been several studies that have used good methodologies, including those by Kristin Antelman and Michael Kurtz and co-workers. But the one I normally use to support the statement that OA enhances citations is that done by Stevan Harnad and his groups in Montreal and Southampton, whose methodology is utterly sound. It is here for those who wish to make a proper critical appraisal of the work.

The way this study was carried out was this: a web robot crawled the web looking for scholarly articles that are available in full-text on an Open Access basis. Once one was located, the robot looked for another article – from the same issue of the same journal – with which to compare it. Two articles from the same issue of the same journal are as near-identical in characteristics as it is possible to be, so this is a highly controlled experiment. The citations to such paired articles were compared and measured. The aggregated results for different scholarly disciplines showed that in every discipline there is an increase in citations for OA articles compared with citations for non-OA articles. The graph that illustrates the findings is in this article. They have to be explained by Open Access. There is no ‘marketing’ issue involved at all; and no comparing different journals, different fields (which have different citation patterns, yes); or different publishers. No comparing apples and oranges: just sticking with the good old Cox’s Orange Pippins (the finest apple in the world) and doing a properly controlled study.
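For readers who like to see the logic laid out, here is a minimal sketch of the same-issue comparison described above. It is not the study's own code: the record layout, the function name within_issue_ratios and every number in it are invented purely for illustration, and using a ratio of mean citation counts is my assumption about one reasonable way to summarise each issue.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (journal, issue, is_open_access, citation_count).
# These values are invented for illustration; they are not study data.
ARTICLES = [
    ("Journal A", "12(3)", True, 9),
    ("Journal A", "12(3)", False, 6),
    ("Journal A", "12(3)", False, 4),
    ("Journal B", "7(1)", True, 3),
    ("Journal B", "7(1)", False, 2),
    ("Journal C", "2(4)", True, 5),  # no non-OA counterpart in this issue
]


def within_issue_ratios(articles):
    """Return, per issue, mean OA citations divided by mean non-OA citations."""
    groups = defaultdict(lambda: {True: [], False: []})
    for journal, issue, is_oa, cites in articles:
        groups[(journal, issue)][is_oa].append(cites)

    ratios = {}
    for key, by_access in groups.items():
        oa, non_oa = by_access[True], by_access[False]
        # Compare only inside an issue that holds both kinds of article,
        # and skip issues whose non-OA mean is zero (ratio undefined).
        if oa and non_oa and mean(non_oa) > 0:
            ratios[key] = mean(oa) / mean(non_oa)
    return ratios


ratios = within_issue_ratios(ARTICLES)
for (journal, issue), ratio in sorted(ratios.items()):
    print(f"{journal} {issue}: OA/non-OA citation ratio = {ratio:.2f}")
```

The point of grouping by journal and issue before comparing anything is that journal, field, publisher and publication date are held constant within each group, so none of them can account for a difference in citations. An issue with no non-OA article to compare against simply drops out.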

Interestingly, studies of the OA advantage are moving on now to help us understand further the nature of citing behaviour. All along, we have acknowledged that the OA advantage would cease to be there once everything is OA. That’s just common sense – unless there is more to it than that, and so there seems to be. We have begun to disaggregate the OA advantage into its constituent elements, identifying at least 5 contributors. Not all of these will persist in a fully Open Access world. Nonetheless, one of the elements is the ‘early advantage’ – a citation advantage gained simply by making work available (to as many people as possible, obviously) as early as can be. Michael Kurtz has shown, working on the articles in the Astrophysics Data System, that Early Advantage is both important and persistent – persistent in an Open Access corpus, that is. So while we predict that the OA advantage will subside as the volume of OA literature increases, now that we are starting to understand better what contributes to this we do not expect that it will entirely disappear: it will continue to accrue to articles that are disseminated early in the publication process.

Does all this matter? Yes, it does. There may be people fighting for Open Access on purely philosophical grounds. There may be some fighting for it on a point of principle, or pure pragmatism in the face of journal prices. Others may be like G8 Summit protestors, against the might of globalised companies. In my experience, though, most of us are putting time and effort into the struggle because there is promise and reward for us all, in societal terms, from opened-up research. Better science could always have accrued from a more effective communication system, but now that the tools are available for doing science – in its broadest sense – in new ways on the Web, an open research basis is the essential foundation to put in place.
