University of Warwick

The reproducibility of Science. A meeting report.

Published April 14, 2015

There is a widespread belief that science is going through a crisis of reproducibility. A meeting was held to discuss the problem. It was organised by the Academy of Medical Sciences, the Wellcome Trust, MRC and BBSRC, and it was chaired by Dorothy Bishop (of whose blog I’m a huge fan). It’s good to see that the scientific establishment is beginning to take notice. Up to now it’s been bloggers who’ve been making the running. I hadn’t intended to write a whole post about it, but some sufficiently interesting points arose that I’ll have a go.

The first point to make is that, as far as I know, the “crisis” is limited to, or at least concentrated in, quite restricted areas of science. In particular, it doesn’t apply to the harder end of sciences. Nobody in physics, maths or chemistry talks about a crisis of reproducibility. I’ve heard very little about irreproducibility in electrophysiology (unless you include EEG work). I’ve spent most of my life working on single-molecule biophysics and I’ve never encountered serious problems with irreproducibility. It’s a small and specialist field so I think I would have noticed if it were there. I’ve always posted on the web our analysis programs, and if anyone wants to spend a year re-analysing it they are very welcome to do so (though I have been asked only once).

The areas that seem to have suffered most from irreproducibility are experimental psychology, some areas of cell biology, imaging studies (fMRI) and genome studies. Clinical medicine and epidemiology have been bad too. Imaging and genome studies seem to be in a slightly different category from the others. They are largely statistical problems that arise from the huge number of comparisons that need to be done. Epidemiology problems stem largely from a casual approach to causality. The rest have no such excuses.

The meeting was biased towards psychology, perhaps because that’s an area that has had many problems. The solutions that were suggested were also biased towards that area. It’s hard to see how some of them could be applied to electrophysiology, for example.

There were, it has to be said, a lot more good intentions than hard suggestions. Pre-registration of experiments might help a bit in a few areas. I’m all for open access and open data, but doubt they will solve the problem either, though I hope they’ll become the norm (they always have been for me).

All the tweets from the meeting have been collected as a Storify. The most retweeted comment was from Liz Wager:

@SideviewLiz: Researchers are incentivised to publish, get grants, get promoted but NOT incentivised to be right! #reprosymp

This, I think, cuts to the heart of the problem. Perverse incentives, if sufficiently harsh, will inevitably lead to bad behaviour. Occasionally it will lead to fraud. It’s even led to (at least) two suicides. If you threaten people in their forties and fifties with being fired, and losing their house, because they don’t meet some silly metric, then of course people will cut corners. Curing that is very much more important than pre-registration, data-sharing and concordats, though the latter occupied far more of the time at the meeting.

The primary source of the problem is that there is not enough money for the number of people who want to do research (a matter that was barely mentioned). That leads to the unpalatable conclusion that the only way to cure the problem is to have fewer people competing for the money. That’s part of the reason that I suggested recently a two-stage university system. That’s unlikely to happen soon. So what else can be done in the meantime?

The responsibility for perverse incentives has to rest squarely on the shoulders of the senior academics and administrators who impose them. It is at this level that the solutions must be found. That was said, but not firmly enough. The problems are mostly created by the older generation. It’s our fault.

Incidentally, I was not impressed by the fact that the Academy of Medical Sciences listed attendees with initials after people’s names. There were eight FRSs but I find it a bit embarrassing to be identified as one, as though it made any difference to the value of what I said.

It was suggested that courses in research ethics for young scientists would help. I disagree. In my experience, young scientists are honest and idealistic. The problems arise when their idealism is shattered by the bad example set by their elders. I’ve had a stream of young people in my office who want advice and support because they feel they are being pressured by their elders into behaviour which worries them. More than one of them have burst into tears because they feel that they have been bullied by PIs.

One talk that I found impressive was by Ottoline Leyser, who chaired the recent report on The Culture of Scientific Research in the UK, from the Nuffield Council on Bioethics. But I found that report to be bland and its recommendations, though well-meaning, unlikely to result in much change. The report was based on a relatively small, self-selected sample of 970 responses to a web survey, and on 15 discussion events. Relatively few people seem to have spent time filling in the text boxes. For example:

“Of the survey respondents who provided a negative comment on the effects of competition in science, 24 out of 179 respondents (13 per cent) believe that high levels of competition between individuals discourage research collaboration and the sharing of data and methodologies.”

Such numbers are too small to reach many conclusions, especially since the respondents were self-selected rather than selected at random (poor experimental design!). Nevertheless, the main concerns were all voiced. I was struck by

“Almost twice as many female survey respondents as male respondents raise issues related to career progression and the short term culture within UK research when asked which features of the research environment are having the most negative effect on scientists”

But no conclusions were drawn, and no remedies were put forward, for this problem. It was all put rather better, and much more frankly, some time ago by Peter Lawrence. I do have the impression that bloggers (including Dorothy Bishop) get to the heart of the problems much more directly than any official reports.

The Nuffield report seemed to me to put excessive trust in paper exercises, such as the “Concordat to Support the Career Development of Researchers”. The word “bullying” does not occur anywhere in the Nuffield document, despite the fact that it’s a problem that’s been very widely discussed and one that’s critical for the problems of reproducibility. The Concordat (unlike the Nuffield report) does mention bullying.

"All managers of research should ensure that measures exist at every institution through which discrimination, bullying or harassment can be reported and addressed without adversely affecting the careers of innocent parties. "

That sounds good, but it’s very obvious that there are many places that simply ignore it. All universities subscribe to the Concordat. But signing is as far as it goes in too many places. It was signed by Imperial College London, the institution with perhaps the worst record for pressurising its employees, but official reports would not dream of naming names or looking at publicly available documentation concerning bullying tactics. For that, you need bloggers.

On the first day, the (soon-to-depart) Dean of Medicine at Imperial, Dermot Kelleher, was there. He seemed a genial man, but he would say nothing about the death of Stefan Grimm. I find that attitude incomprehensible. He didn’t reappear on the second day of the meeting.

The San Francisco Declaration on Research Assessment (DORA) is a stronger statement than the Concordat, but its aims are more limited. DORA states that the impact factor is not to be used as a substitute “measure of the quality of individual research articles, or in hiring, promotion, or funding decisions”. That’s something that I wrote about in 2003, in Nature. In 2007 it was still rampant, including at Imperial College. It still is in many places. The Nuffield Council report says that DORA has been signed by “over 12,000 individuals and 500 organisations”, but fails to mention the fact that only three UK universities have signed up to DORA (one of them, I’m happy to say, is UCL). That’s a pretty miserable record. And, of course, it remains to be seen whether the signatories really abide by the agreement. Most such worthy agreements are ignored on the shop floor.

The recommendations of the Nuffield Council report are all worthy, but they are bland and we’ll be lucky if they have much effect. For example

“Ensure that the track record of researchers is assessed broadly, without undue reliance on journal impact factors”

What on earth is “undue reliance”? That’s a far weaker statement than DORA. Why?

And

“Ensure researchers, particularly early career researchers, have a thorough grounding in research ethics”

In my opinion, what we should say to early career researchers is “avoid the bad example that’s set by your elders (but not always betters)”. It’s the older generation which has produced the problems and it’s unbecoming to put the blame on the young. It’s the late career researchers who are far more in need of a thorough grounding in research ethics than early-career researchers.

Although every talk was more or less interesting, the one I enjoyed most was the first one, by Marcus Munafo. It assessed the scale of the problem (though with a strong emphasis on psychology, plus some genetics and epidemiology), and he had good data on under-powered studies. It also made a fleeting mention of the problem of the false discovery rate. Since the meeting was essentially about the publication of results that aren’t true, I would have expected the statistical problem of the false discovery rate to have been given much more prominence than it was. Although Ioannidis’ now-famous paper “Why most published research findings are false” got the occasional mention, very little attention (apart from Munafo and Button) was given to the problems which he pointed out.

I’ve recently convinced myself that, if you declare that you’ve made a discovery when you observe P = 0.047 (as is almost universal in the biomedical literature) you’ll be wrong 30 – 70% of the time (see the full paper, "An investigation of the false discovery rate and the misinterpretation of p-values", and simplified versions on YouTube and on this blog). If that’s right, then surely an important way to reduce the publication of false results is for journal editors to give better advice about statistics. This is a topic that was almost absent from the meeting. It’s also absent from the Nuffield Council report (the word “statistics” does not occur anywhere).
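To give a feel for where numbers of that size come from, here is a minimal sketch of the simplest version of the argument, the one that treats a significance test like a screening test. It is not the calculation for an observed P = 0.047 that is done in the full paper; it only asks what fraction of all results with P ≤ 0.05 are false positives. The function name and the particular values chosen for the prevalence of real effects and for the power are my assumptions, picked for illustration.

```python
def false_discovery_rate(prior, power, alpha=0.05):
    """Fraction of 'significant' results that are false positives.

    prior: assumed fraction of tested hypotheses where a real effect exists
    power: assumed probability of detecting a real effect when there is one
    alpha: significance threshold (results with P <= alpha are 'discoveries')
    """
    if not 0 < prior < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    false_positives = alpha * (1 - prior)  # no real effect, but 'significant'
    true_positives = power * prior         # real effect, correctly detected
    return false_positives / (false_positives + true_positives)


# Illustrative (assumed) scenarios: a well-powered and an under-powered test,
# each with real effects being rare (10%) or common (50%).
for prior in (0.1, 0.5):
    for power in (0.8, 0.2):
        fdr = false_discovery_rate(prior, power)
        print(f"prior={prior:.1f} power={power:.1f} -> FDR={fdr:.0%}")
```

With real effects in 10% of experiments and a power of 0.8, this gives a false discovery rate of 36%, and the under-powered case is a good deal worse. The answer depends strongly on the prevalence of real effects, which is rarely known, so the sketch shows the size of the problem rather than settling it.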

In summary, the meeting was very timely, and it was fun. But I ended up thinking it had a bit too much of preaching good intentions to the converted. It failed to grasp some of the nettles firmly enough. There was no mention of what’s happening at Imperial, or Warwick, or Queen Mary, or at King’s College London. Let’s hope that when it’s written up, the conclusions will be a bit less bland than those of most official reports.

It’s overdue that we set our house in order, because the public has noticed what’s going on. The New York Times was scathing in 2006. This week’s Economist said

"Modern scientists are doing too much trusting and not enough verifying -to the detriment of the whole of science, and of humanity.
Too many of the findings that fill the academic ether are the result of shoddy experiments or poor analysis"

"Careerism also encourages exaggeration and the cherrypicking of results."

This is what the public think of us. It’s time that vice-chancellors did something about it, rather than willy-waving about rankings.

Conclusions

After criticism of the conclusions of official reports, I guess that I have to make an attempt at recommendations myself. Here’s a first attempt.

The heart of the problem is money. Since the total amount of money is not likely to increase in the short term, the only solution is to decrease the number of applicants. This is a real political hot potato, but unless it’s tackled the problem will persist. The most gentle way that I can think of doing this is to restrict research to a subset of universities. My proposal for a two stage university system might go some way to achieving this. It would result in better postgraduate education, and it would be more egalitarian for students. But of course universities that became “teaching only” would see it (wrongly) as demotion, and it seems that UUK is unlikely to support any change to the status quo (except, of course, for increasing fees).
Smaller grants, smaller groups and fewer papers would benefit science.
Ban completely the use of impact factors and discourage use of all metrics. None has been shown to measure future quality. All increase the temptation to “game the system” (that’s the usual academic euphemism for what’s called cheating if an undergraduate does it).
“Performance management” is the method of choice for bullying academics. Don’t allow people to be fired because they don’t achieve arbitrary targets for publications or grant income. The criteria used at Queen Mary London, and Imperial, and Warwick and at King’s, are public knowledge. They are a recipe for employing spivs and firing Nobel Prize winners: the 1991 Nobel Laureate in Physiology or Medicine would have failed Imperial’s criteria in 6 years out of 10 years when he was doing the work which led to the prize.
Universities must learn that if you want innovation and creativity you have also to tolerate a lot of failure.
The ranking of universities by ranking businesses or by the REF encourages bad behaviour by encouraging vice-chancellors to improve their ranking, by whatever means they can. This is one reason for bullying behaviour. The rankings are totally arbitrary and a huge waste of money. I’m not saying that universities should be unaccountable to taxpayers. But all you have to do is to produce a list of publications to show that very few academics are not trying. It’s absurd to try to summarise a whole university in a single number. It’s simply statistical illiteracy.
Don’t waste money on training courses in research ethics. Everyone already knows what’s honest and what’s dodgy (though a bit more statistics training might help with that). Most people want to do the honest thing, but few have the nerve to stick to their principles if the alternative is to lose your job and your home. Senior university people must stop behaving in that way.
University procedures for protecting the young are totally inadequate. A young student who reports bad behaviour of his seniors is still more likely to end up being fired than being congratulated (see, for example, a particularly bad case at the University of Sheffield). All big organisations close ranks to defend themselves when criticised. Even in extreme cases, as when an employee commits suicide after being bullied, universities issue internal reports which blame nobody.
Universities must stop papering over the cracks when misbehaviour is discovered. It seems to be beyond the wit of PR people to realise that often it’s best (and always the cheapest) to put your hands up and say “sorry, we got that wrong”.
There is an urgent need to get rid of the sort of statistical illiteracy that allows P = 0.06 to be treated as failure and P = 0.04 as success. This is almost universal in biomedical papers, and given the hazards posed by the false discovery rate, could well be a major contribution to false claims. Journal editors need to offer much better statistical advice than is the case at the moment.

Follow-up

Tagged Alice Gast, bibliometrics, Dermot Kelleher, impact factor, Imperial College, irreproducibility, James Stirling, King's College London, metrics, Queen Mary, Queen Mary University of London, reproducibility, University of Warwick | 1 Comment

The University of Warwick brings itself into disrepute -four times. Watch your tone of voice.

Published April 8, 2015


The University of Warwick seems determined to wrest the title of worst employer from Imperial College London and Queen Mary College London. In little over a year, Warwick has had four lots of disastrous publicity, all self-inflicted.

First came the affair of Thomas Docherty.

Thomas Docherty

Professor of English and Comparative Literature, Thomas Docherty was suspended in January 2014 by Warwick because of "inappropriate sighing", "making ironic comments" and "projecting negative body language". Not only was Docherty punished, but also his students.

"As well as being banned from campus, from the library, and from email contact with his colleagues, Docherty was prohibited from supervising his graduate students and from writing references. Indiscriminate, disproportionate, and unjust measures against the professor were also deeply unfair to his students."

Ludicrously, rather than brushing the matter aside, senior management at Warwick hired corporate lawyers to argue that his behaviour was grounds for dismissal.

That cost the university at least £43,000.

The story appeared in every UK newspaper and rapidly spread abroad. It must have been the most ham-fisted bit of PR ever. But rather than firing the HR department, the University of Warwick let the matter fester for a full nine months before reinstating Docherty in September 2014.

The university managed to get the worst possible outcome. The suspension provoked world-wide derision and in the end they admitted they’d been wrong. Jeremy Treglown, a professor emeritus of Warwick (and former editor of The Times Literary Supplement) described the episode as being like “something out of Kafka”.

And guess what, nobody was blamed and nobody resigned.

The firing of people for doing cheap research

Warwick has followed the bad example set by Queen Mary College London, King’s College London and Imperial College London. If you don’t average an external grant income of at least £75,000 a year over the past four years, your job is at risk. Apart from its cruelty, the taxpayer is likely to take a dim view of academics being compelled to make research as expensive as possible. Some people need no more than a paper and pencil to do brilliant work. If you are one of them, don’t go to any of these universities.

It’s simply bad management. They shouldn’t have taken on so many people if they can’t pay the bills. Many universities took on extra staff in order to cheat on the REF. Now they have to cast some aside like worn-out old boots.

The tone of voice

Warwick University has very recently issued a document "Warwick tone of voice: Full guidelines. March 2015". It’s a sign of their ham-fisted management style that it wasn’t even hidden behind a password. They seem to be proud of it. Of course it provoked a storm of hilarity on social media. Documents like that are designed to instruct people not to give truthful opinions but to act as advertising agents for their university. The actual effect is, of course, exactly the opposite. They reduce the respect for the institution that issues such documents.

Here are some quotations (try not to laugh -you might get fired).

"What is tone of voice and why do we need a ‘Warwick’ tone of voice?
The tone of our language defines the way people respond to us. By writing in a tone that’s true to our brand, we can express what it is that makes University of Warwick unique."

"Our brand: defined by possibility

What is it that makes us unique? We’re a university with modern values and a formidable record of academic and commercial achievement — but not the only one. So what sets us apart?

The difference lies in our approach to everything we do. Warwick is a place that fundamentally rejects the notion of obstacles — a place where the starting point is always ‘anything is possible’. "

Then comes the common thread. It’s all to do with rankings.

“What if we raised our research profile to even higher levels of international excellence? Then we could be ranked as one of the world’s top fifty universities."

The people who sell university rankings (and the REF) have much to answer for.

There’s a good post about this fiasco, from people whose job is branding. "How not to write guidelines".

Outsourcing teaching

As if all this were not enough, on April 5th 2015, we heard that "Warwick Uni to outsource hourly paid academics to subsidiary". Universities already rely totally on people on short-term contracts. Most research is done by PhD students and post-doctoral students on three (or sometimes five) year contracts. They are supervised (not always very well) by people who spend most of their time writing grant applications. Science must be one of the most insecure jobs going.

Increasingly we are seeing casualisation of academics. A three year contract looks like luxury compared with being hired by the hour. It’s rapidly approaching zero-hours contracts for PhDs. In fact it’s reported that people hired by TeachHigher won’t even have a contract: "staff hired under TeachHigher will be working explicitly not on a contract, but rather, an ‘agreement’ ".

The organisation behind this is called TeachHigher. And guess who owns it? The University of Warwick. It is a subsidiary of the Warwick Employment Group which already runs several other employment agencies, including Unitemps which deals with cleaners, security and catering staff.

The university claims that it isn’t "outsourcing" because TeachHigher is part of the university. For now, anyway. It’s reported that "The university plans to turn the project into a commercial franchise, similar to another subsidiary used to pay cleaners and catering staff, it can sell to other institutions."

The Warwick students’ newspaper "spoke to a PhD student who was fired last year from a teaching job with Unitemps after participating in strike action, who felt one of the aims of creating TeachHigher may be “to prevent collective action from taking place.”"

Bringing the university into disrepute is something for which you can be fired. The vice-chancellor, Nigel Thrift, has allowed Warwick to become a laughing stock four times in a single year. Perhaps it is time that the chair of Council, George Cox, did something about it?

Universities don’t have to be run like that. UCL isn’t, for one.

Follow-up

9 April 2015 It seems that TeachHigher was proposing to pay a lecturer £5 per hour. This may not be accurate but it’s certainly caused a stir.

Laurie Taylor, ever-topical, was on the Docherty case in Times Higher Education.

Riga, Riga, roses

“I’ve nothing against Latvia per se, but I can’t in all honesty see any real parallels between a university in such a faraway and somewhat desolate place as Riga and our own delightful campus.”

That was how Jamie Targett, our Director of Corporate Affairs, responded to the news that the European Court of Human Rights had found that a professor at Riga Stradiņš University had been unfairly sacked for criticising senior management. University staff, the court ruled, must be free to criticise management without fear of dismissal or disciplinary action.

Targett “thoroughly rejected” the suggestion from our reporter Keith Ponting (30) that there might be “a parallel” between what happened at Riga and our own university’s decision to ban Professor Busby of our English Department from campus for nine months for a disciplinary offence.

This, insisted Targett, was a “wholly inappropriate parallel”. For whereas the Latvian professor had been disciplined for speaking out against “alleged nepotism, plagiarism, corruption and mismanagement” in his department, Professor Busby had been banned from campus and from contact with students and colleagues for nine months for the “far more heinous offence” of “sighing” during an appointments interview.

Targett said he “trusted that any fair-minded person, whether from Latvia or indeed the Outer Caucasus, would be able to see the essential difference in the scale of offence”.

10 April 2015

The London Review of Books has a rather similar piece, Mind Your Tone, by Glen Newey.

"It’s tough to pick winners amid the textureless blather that has lately seeped from campus PR outfits".

"In a keen field, though, it’s Warwick’s drill-sheet that takes the jammie dodger".

17 April 2015

Anyone would have thought that Laurie Taylor had read this post. His inimitable Poppletonian column this week was entirely devoted to Warwick.

Nothing to laugh about!

16 APRIL 2015 | BY LAURIE TAYLOR

Our Director of Corporate Affairs, Jamie Targett, has roundly criticised all those members of the Poppleton academic staff who have responded to the new University of Warwick “Tone of Voice” guidelines with what he described as “wholly inappropriate sniggering”.

Targett said that he saw “nothing at all funny” in Warwick’s new insistence that its staff should always apply the “What if” linguistic principle in all their communications.

He particularly praised the manner in which the application of the What if principle helped to make communications optimistic, leaving “the reader to feel that you’re there to help them”. So instead of writing “This is only for”, Warwick staff under the influence of the What if principle would write “This is for everyone who”.

But there were many other advantages that could be derived from consistent application of What if. It also inclined writers to be “proactive”. So instead of writing “Your application was received”, Warwick staff imbued with the What if ethic would always write “We’ve read your application”.

Targett said that he also failed to find any humour whatsoever in the further What if insistence that academic staff should always avoid using such tentative words as “possibly”, “hopefully” or “maybe”. So, under the What if linguistic principle, staff would never write “We hope to become a top 50 world-ranked university” but always “Our aim is to become a top 50 world-ranked university”.

In what was being described as “an unexpected move”, Targett received support for his views on the What if principle from Mr Ted Odgers of our Department of Media and Cultural Studies, who thought that the principle made “particularly good sense” in the Warwick context. He went so far as to provide the following example of its application:

“What if the University of Warwick had not recently banned an academic from its campus for nothing more serious than sighing, projecting negative body language and making ironic comments when interviewing candidates for a job? And What if this ban had not been complemented with a ban on the said academic contacting his own undergraduates and tutoring his own PhD students and speaking to his former colleagues? And What if the whole case against the said academic had not then been pursued with the use of a team of high-powered barristers costing the university at least £43,000?”

If all these What ifs had been met, then, added Mr Odgers, Warwick might possibly, hopefully or maybe have managed to retain its former position as an institution that respected the principles of academic freedom.

Targett told The Poppletonian that while he appreciated Mr Odgers’ application of the What if principle, he felt that it did not “at some points” fully capture the essence of its guidelines.

Tagged Academia, HR, HR bollocks, Nigel Thrift, Thomas Docherty, Universities, University of Warwick | 3 Comments
