There can be no doubt that the situation for women has improved hugely since I started at UCL, 50 years ago. At that time women were not allowed in the senior common room. It’s improved even more since the 1930s (read about the attitude of the great statistician, Ronald Fisher, to Florence Nightingale David).
Recently Williams & Ceci published data that suggest that young women no longer face barriers in job selection in the USA (though it will take 20 years before that feeds through to professor level). But no sooner was one feeling optimistic than along came Tim Hunt, who caused a media storm by advocating male-only labs. I’ll say a bit about that case below.
First some very preliminary concrete proposals.
The job of emancipation is not yet completed. I’ve recently become a member of the Royal Society diversity committee, chaired by Uta Frith. That’s made me think more seriously about the evidence concerning the progress of women and of black and minority ethnic (BME) people in science, and what can be done about it. Here are some preliminary thoughts. They are my opinions, not those of the committee.
I suspect that much of the problem for women and BME results from over-competitiveness and perverse incentives that are imposed on researchers. That’s got progressively worse, and it affects men too. In fact it corrupts the entire scientific process.
One of the best writers on these topics is Peter Lawrence. He’s an eminent biologist who worked at the famous Laboratory of Molecular Biology in Cambridge, until he ‘retired’.
Here are three things by him that everyone should read.
From Lawrence (2003)
"Listen. All over the world scientists are fretting. It is night in London and Deborah Dormouse is unable to sleep. She can’t decide whether, after four weeks of anxious waiting, it would be counterproductive to call a Nature editor about her manuscript. In the sunlight in Sydney, Wayne Wombat is furious that his student’s article was rejected by Science and is taking revenge on similar work he is reviewing for Cell. In San Diego, Melissa Mariposa reads that her article submitted to Current Biology will be reconsidered, but only if it is cut in half. Against her better judgement, she steels herself to throw out some key data and oversimplify the conclusions— her postdoc needs this journal on his CV or he will lose a point in the Spanish league, and that job in Madrid will go instead to Mar Maradona."
"It is we older, well-established scientists who have to act to change things. We should make these points on committees for grants and jobs, and should not be so desperate to push our papers into the leading journals. We cannot expect younger scientists to endanger their future by making sacrifices for the common good, at least not before we do."
From Lawrence (2007)
“The struggle to survive in modern science, the open and public nature of that competition, and the advantages bestowed on those who are prepared to show off and to exploit others have acted against modest and gentle people of all kinds — yet there is no evidence, presumption or likelihood that less pushy people are less creative. As less aggressive people are predominantly women [14,15] it should be no surprise that, in spite of an increased proportion of women entering biomedical research as students, there has been little, if any, increase in the representation of women at the top. Gentle people of both sexes vote with their feet and leave a profession that they, correctly, perceive to discriminate against them. Not only do we lose many original researchers, I think science would flourish more in an understanding and empathetic workplace.”
From Lawrence (2011).
"There’s a reward system for building up a large group, if you can, and it doesn’t really matter how many of your group fail, as long as one or two succeed. You can build your career on their success".
Part of this pressure comes from university rankings. They are statistically illiterate and serve no useful purpose, apart from making money for their publishers and providing vice-chancellors with an excuse to bully staff in the interests of institutional willy-waving.
And part of the pressure arises from the money that comes with the REF. A recent survey gave rise to the comment
"Early career researchers overwhelmingly feel that the research excellence framework has created “a huge amount of pressure and anxiety, which impacts particularly on those at the bottom rung of the career ladder”"
In fact the last REF was conducted quite sensibly (e.g. use of silly metrics was banned). The problem was that universities didn’t believe that the rules would be followed.
For example, academics in the Department of Medicine at Imperial College London were told (in 2007) that they were expected to
“publish three papers per annum, at least one in a prestigious journal with an impact factor of at least five”.
And last year a 51-year-old academic with a good publication record was told that unless he raised £200,000 in grants in the next year, he’d be fired. There can be little doubt that this “performance management” contributed to his decision to commit suicide. And Imperial did nothing to remedy the policy after an internal investigation.
Crude financial targets for grant income should be condemned as defrauding the taxpayer (you are compelled to make your work as expensive as possible). As usual, women and BME people suffer disproportionately from such bullying.
What can be done about this in practice?
I feel that some firm recommendations will be useful.
The Royal Society has already signed DORA, but, shockingly, only three universities in the UK have done so (Sussex, UCL and Manchester).
Another well-meaning initiative is The Concordat to Support the Career Development of Researchers. It’s written very much from the HR point of view and I’d argue that that’s part of the problem, not part of the solution.
For example it says
“3. Research managers should be required to participate in active performance management, including career development guidance”
That statement is meaningless without any definition of how performance management should be done. It’s quite clear that “performance management”, in the form of crude targets, was a large contributor to Stefan Grimm’s suicide.
The Concordat places great emphasis on training programmes, but ignores the fact that it’s doubtful whether diversity training works, and it may even have bad effects.
The Concordat is essentially meaningless in its present form.
I propose that all fellowships and grants should be awarded only to universities who have signed DORA and Athena Swan.
I have little faith that signing DORA, or the Concordat, will have much effect on the shop floor, but they do set a standard, and eventually, as with changes in the law, improvements in behaviour are effected.
But, as a check, it should be announced at the start that fellows and employees paid by grants will be asked directly whether or not these agreements have been honoured in practice.
Crude financial targets are imposed at one in six universities. Those who do that should be excluded from getting fellowships or grants, on the grounds that the process gives bad value to the funders (and taxpayer) and that it endangers objectivity.
Some thoughts on the Hunt affair
It’s now 46 years since I and Brian Woledge managed to get UCL’s senior common room, the Housman room, opened to women. That was 1969, and since then, I don’t think that I’ve heard any public statement that was so openly sexist as Tim Hunt’s now notorious speech in Korea.
On the Today Programme, Hunt himself said "What I said was quite accurately reported" and "I just wanted to be honest", so there’s no doubt that those are his views. He confirmed that the account that was first tweeted by Connie St Louis was accurate.
Inevitably, there was a backlash from libertarians and conservatives. That was fuelled by a piece in today’s Observer, in which Hunt seems to regard himself as being victimised. My comment on the Observer piece sums up my views.
I was pretty shaken when I heard what Tim Hunt had said, all the more because I have recently become a member of the Royal Society’s diversity committee. When he talked about the incident on the Today programme on 10 June, it certainly didn’t sound like a joke to me. It seems that he carried on for more than 5 minutes in the same vein.
Everyone appreciates Hunt’s scientific work, but the views that he expressed about women are from the dark ages. It seemed to me, and to Dorothy Bishop, and to many others, that with views like that, Hunt should not play any part in selection or policy matters. The Royal Society moved with admirable speed to do that.
The views that were expressed are totally incompatible with UCL’s values, so it was right that UCL too acted quickly. His job at UCL was an honorary one: he is retired and he was not deprived of his lab and his living, as some people suggested.
Although the initial reaction, from men as well as from women, was predictably angry, it very soon turned to humour, with the flood of #distractinglysexy tweets.
It would be a mistake to think that these actions were the work of PR people. They were thought to be just by everyone, female or male, who wants to improve diversity in science.
The episode is sad and disappointing. But the right things were done quickly.
Now Hunt can be left in peace to enjoy his retirement.
Look at it this way. If you were a young woman, applying for a fellowship in competition with men, what would you think if Tim Hunt were on the selection panel?
After all this fuss, we need to laugh.
Here is a clip from the BBC News Quiz, in which the actor Rebecca Front gives her take on the affair.
Some great videos soon followed Hunt’s comments. Try these.
Nobel Scientist Tim Hunt Sparks a #Distractinglysexy Campaign
(via Jennifer Raff)
This video has some clips from an earlier one, from Suzi Gage “Science it’s a girl thing”.
15 June 2015
An update on what happened from UCL. From my knowledge of what happened, this is not PR spin. It’s true.
16 June 2015
There is an interview with Tim Hunt in Lab Times that’s rather revealing. This interview was published in April 2014, more than a year before the Korean speech. Right up to the penultimate paragraph we agree on just about everything, from the virtue of small groups to the iniquity of impact factors. But then right at the end we read this.
In your opinion, why are women still under-represented in senior positions in academia and funding bodies?
Hunt: I’m not sure there is really a problem, actually. People just look at the statistics. I dare, myself, think there is any discrimination, either for or against men or women. I think people are really good at selecting good scientists but I must admit the inequalities in the outcomes, especially at the higher end, are quite staggering. And I have no idea what the reasons are. One should start asking why women being under-represented in senior positions is such a big problem. Is this actually a bad thing? It is not immediately obvious for me… is this bad for women? Or bad for science? Or bad for society? I don’t know, it clearly upsets people a lot.
This suggests to me that the outburst on 8th June reflected opinions that Hunt has had for a while.
There has been quite a lot of discussion of Hunt’s track record. These tweets suggest it may not be blameless.
— Dr*T (@Dr_star_T) June 16, 2015
— Dr*T (@Dr_star_T) June 16, 2015
That's v interestting. It's been alleged tht nobody has grumbled. It seems thay have, but they daren't come forward https://t.co/AlUz0mAJbt
— David Colquhoun (@david_colquhoun) June 16, 2015
19 June 2015
Yesterday I was asked by the letters editor of the Times, Andrew Riley, to write a letter in response to a half-witted, anonymous, Times leading article. I dropped everything, and sent it. It was neither acknowledged nor published. Here it is [download pdf].
One of the few good outcomes of the sad affair of Tim Hunt is that it has brought to light the backwoodsmen who are eager to defend his actions, and to condemn UCL. The anonymous Times leader of 16 June was as good an example as any.
Some quotations from this letter were used by Tom Whipple in an article about Richard Dawkins’s surprising (to me) emergence as an unreconstructed backwoodsman.
18 June 2015
Adam Rutherford’s excellent Radio 4 programme, Inside Science, had an episode “Women Scientists on Sexism in Science”. The last speaker was Uta Frith (who is chair of the Royal Society’s diversity committee). Her contribution started at about 23 min.
Listen to Uta Frith’s contribution.
" . . this over-competitiveness, and this incredible rush to publish fast, and publish in quantity rather than in quality, has been extremely detrimental for science, and it has been disproportionately bad, I think, for under-represented groups who don’t quite fit in to this over-competitive climate. So I am proposing something I like to call slow science . . . why is this necessary, to do this extreme measurement-driven, quantitative judgement of output, rather than looking at the actual quality"
That, I need hardly say, is music to my ears. Why not, for example, restrict the number of papers that can be submitted with fellowship applications to four (just as the REF did)?
21 June 2015
I’ve received a handful of letters, some worded in a quite extreme way, telling me I’m wrong. It’s no surprise that 100% of them are from men. Most are from more-or-less elderly men. A few are from senior men who run large groups. I have no way to tell whether their motive is a genuine wish to have freedom of speech at any price. Or whether their motives are less worthy: perhaps some of them are against anything that prevents postdocs working for 16 hours a day, for the glory of the boss. I just don’t know.
I’ve had far more letters saying that UCL did the right thing when it accepted Tim Hunt’s offer to resign from his non job at UCL. These letters are predominantly from young people, men as well as women. Almost all of them ask not to be identified in public. They are, unsurprisingly, scared to argue with the eight Nobel prizewinners who have deplored UCL’s action (without bothering to ascertain the facts). The fact that they are scared to speak out is hardly surprising. It’s part of the problem.
What you can do, if you don’t want to put your head above the public parapet, is simply to email the top people at UCL, in private, to express your support. All these email addresses are open to the public in UCL’s admirably open email directory.
Michael Arthur (provost): email@example.com
David Price (vice-provost research): firstname.lastname@example.org
Geraint Rees (Dean of the Faculty of Life Sciences): email@example.com
All these people have an excellent record on women in science, as illustrated by the response to the Daily Mail’s appalling behaviour towards UCL astrophysicist, Hiranya Peiris.
26 June 2015
The sad matter of Tim Hunt is over, at last. The provost of UCL, Michael Arthur, has now made a statement himself. Provost’s View: Women in Science is an excellent reiteration of UCL’s principles.
By way of celebration, here is the picture of the quad, taken on 23 March, 2003. It was the start of the second great march to try to stop the war in Iraq. I use it to introduce talks, as a reminder that there are more serious consequences of believing things that aren’t true than a handful of people taking sugar pills.
11 October 2015
In which I agree with Mary Collins
Long after this unpleasant row died down, it was brought back to life yesterday when I heard that Colin Blakemore had resigned as honorary president of the Association of British Science Writers (ABSW), on the grounds that that organisation had not been sufficiently hard on Connie St Louis, whose tweet initiated the whole affair. I’m not a member of the ABSW and I have never met St Louis, but I know Blakemore well and like him. Nevertheless it seems to me to be quite disproportionate for a famous elderly white man to take such dramatic headline-grabbing action because a young black woman had exaggerated bits of her CV. Of course she shouldn’t have done that, but if everyone were punished so severely for "burnishing" their CV there would be a large number of people in trouble.
Blakemore’s own statement also suggested that her reporting was inaccurate (though it appears that he didn’t submit a complaint to ABSW). As I have said above, I don’t think that this is true to any important extent. The gist of what was said was verified by others, and, most importantly, Hunt himself said "What I said was quite accurately reported" and "I just wanted to be honest". As far as I know, he hasn’t said anything since that has contradicted that view, which he gave straight after the event. The only change that I know of is that the words that were quoted turned out to have been followed by "Now, seriously", which can be interpreted as meaning that the sexist comments were intended as a joke. If it were not for earlier comments along the same lines, that might have been an excuse.
Yesterday, on twitter, I was asked by Mary Collins, Hunt’s wife, whether I thought he was misogynist. I said no and I don’t believe that he is. It’s true that I had used that word in a single tweet, long since deleted, and that was wrong. I suspect that I felt at the time that it sounded like a less harsh word than sexist, but it was the wrong word and I apologised for using it.
So do I believe that Tim Hunt is sexist? No I don’t. But his remarks both in Korea and earlier were undoubtedly sexist. Nevertheless, I don’t believe that, as a person, he suffers from ingrained sexism. He’s too nice for that. My interpretation is that (a) he’s so obsessive about his work that he has little time to think about political matters, and (b) he’s naive about the public image that he presents, and about how people will react to it. That’s a combination that I’ve seen before among some very eminent scientists.
In fact I find myself in almost complete agreement with Mary Collins, Hunt’s wife, when she said (I quote the Observer)
“And he is certainly not an old dinosaur. He just says silly things now and again.” “Collins clutches her head as Hunt talks. “It was an unbelievably stupid thing to say,” she says. “You can see why it could be taken as offensive if you didn’t know Tim. But really it was just part of his upbringing. He went to a single-sex school in the 1960s.”
Nevertheless, I think it’s unreasonable to think that comments such as those made in Korea (and earlier) would not have consequences, "naive" or not, "joke" or not, "upbringing" or not.
It’s really not hard to see why there were consequences. All you have to do is to imagine yourself as a woman, applying for a grant or fellowship, and realising that you’d be judged by Hunt. And if you think that the reaction was too harsh, imagine the same words being spoken with "blacks", or "Jews" substituted for "women". Of course I’m not suggesting for a moment that he’d have done this, but if anybody did, I doubt whether many people would have thought it was a good joke.
9 November 2015
An impressively detailed account of the Hunt affair has appeared. The gist can be inferred from the title: "Saving Tim Hunt: The campaign to exonerate Tim Hunt for his sexist remarks in Seoul is built on myths, misinformation, and spin". It was written by Dan Waddell (@danwaddell) and Paula Higgins (@justamusicprof). It is long and it’s impressively researched. It’s revealing to see the bits that Louise Mensch omitted from her quotations. I can’t disagree with its conclusion.
"In the end, the parable of Tim Hunt is indeed a simple one. He said something casually sexist, stupid and inappropriate which offended many of his audience. He then confirmed he said what he was reported to have said and apologised twice. The matter should have stopped there. Instead a concerted effort to save his name — which was not disgraced, nor his reputation as a scientist jeopardized — has rewritten history. Science is about truth. As this article has shown, we have seen very little of it from Hunt’s apologists — merely evasions, half-truths, distortions, errors and outright falsehoods."
8 April 2017
This late addition is to draw attention to a paper, written by Edwin Boring in 1951, about the problems for the advancement of women in psychology. It’s remarkable reading and many of the roots of the problems have hardly changed today. (I chanced on the paper while looking for a paper that Boring wrote about P values in 1919.)
Here is a quotation from the conclusions.
“Here then is the Woman Problem as I see it. For the ICWP or anyone else to think that the problem can be advanced toward solution by proving that professional women undergo more frustration and disappointment than professional men, and by calling then on the conscience of the profession to right a wrong, is to fail to see the problem clearly in all its psychosocial complexities. The problem turns on the mechanisms for prestige, and that prestige, which leads to honor and greatness and often to the large salaries, is not with any regularity proportional to professional merit or the social value of professional achievement. Nor is there any presumption that the possessor of prestige knows how to lead the good life. You may have to choose. Success is never whole, and, if you have it for this, you may have to give it up for that.”
The Higher Education Funding Council England (HEFCE) gives money to universities. The allocation that a university gets depends strongly on the periodical assessments of the quality of their research. Enormous amounts of time, energy and money go into preparing submissions for these assessments, and the assessment procedure distorts the behaviour of universities in ways that are undesirable. In the last assessment, four papers were submitted by each principal investigator, and the papers were read.
In an effort to reduce the cost of the operation, HEFCE has been asked to reconsider the use of metrics to measure the performance of academics. The committee that is doing this job has asked for submissions from any interested person, by June 20th.
This post is a draft for my submission. I’m publishing it here for comments before producing a final version for submission.
Draft submission to HEFCE concerning the use of metrics.
The first thing to note is that HEFCE is one of the original signatories of DORA (http://am.ascb.org/dora/). The first recommendation of that document is:
"Do not use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions."
Impact factors have been found, time after time, to be utterly inadequate as a way of assessing individuals. Even their inventor, Eugene Garfield, says that. There should be no need to rehearse the details yet again. If HEFCE were to allow their use, they would have to withdraw from the DORA agreement, and I presume they would not wish to do this.
Citation counting has several problems. Most of them apply equally to the H-index.
- Citations may be high because a paper is good and useful. They equally may be high because the paper is bad. No commercial supplier makes any distinction between these possibilities. It would not be in their commercial interests to spend time on that, but it’s critical for the person who is being judged. For example, Andrew Wakefield’s notorious 1998 paper, which gave a huge boost to the anti-vaccine movement, had received 758 citations by 2012 (it was subsequently shown to be fraudulent).
- Citations take far too long to appear to be a useful way to judge recent work, as is needed for judging grant applications or promotions. This is especially damaging to young researchers, and to people (particularly women) who have taken a career break. The counts also don’t take into account citation half-life. A paper that’s still being cited 20 years after it was written clearly had influence, but that takes 20 years to discover.
- The citation rate is very field-dependent. Very mathematical papers are much less likely to be cited, especially by biologists, than more qualitative papers. For example, the solution of the missed event problem in single ion channel analysis [3,4] was the sine qua non for all our subsequent experimental work, but the two papers have only about a tenth of the number of citations of subsequent work that depended on them.
- Most suppliers of citation statistics don’t count citations of books or book chapters. This is bad for me because my only work with over 1000 citations is my 105 page chapter on methods for the analysis of single ion channels, which contained quite a lot of original work. It has had 1273 citations according to Google Scholar but doesn’t appear at all in Scopus or Web of Science. Neither do the 954 citations of my statistics text book.
- There are often big differences between the numbers of citations reported by different commercial suppliers. Even for papers (as opposed to book articles) there can be a two-fold difference between the number of citations reported by Scopus, Web of Science and Google Scholar. The raw data are unreliable and commercial suppliers of metrics are apparently not willing to put in the work to ensure that their products are consistent or complete.
- Citation counts can be (and already are being) manipulated. The easiest way to get a large number of citations is to do no original research at all, but to write reviews in popular areas. Another good way to have ‘impact’ is to write indecisive papers about nutritional epidemiology. That is not behaviour that should command respect.
- Some branches of science are already facing something of a crisis in reproducibility. One reason for this is the perverse incentives which are imposed on scientists. These perverse incentives include the assessment of their work by crude numerical indices.
- “Gaming” of citations is easy. (If students do it, it’s called cheating: if academics do it, it is called gaming.) If HEFCE makes money dependent on citations, then this sort of cheating is likely to take place on an industrial scale. Of course that should not happen, but it would (disguised, no doubt, by some ingenious bureaucratic euphemisms).
- For example, Scigen is a program that generates spoof papers in computer science, by stringing together plausible phrases. Over 100 such papers have been accepted for publication. By submitting many such papers, the authors managed to fool Google Scholar into awarding the fictitious author an H-index greater than that of Albert Einstein (http://en.wikipedia.org/wiki/SCIgen).
- The use of citation counts has already encouraged guest authorships and similar marginally honest behaviour. There is no way to tell whether an author on a paper has actually made any substantial contribution to the work, despite the fact that some journals ask for a statement about contribution.
- It has been known for 17 years that citation counts for individual papers are not detectably correlated with the impact factor of the journal in which the paper appears. That doesn’t seem to have deterred metrics enthusiasts from using both. It should have done.
Given all these problems, it’s hard to see how citation counts could be useful to the REF, except perhaps in really extreme cases such as papers that get next to no citations over 5 or 10 years.
The H-index has all the disadvantages of citation counting, but in addition it is strongly biased against young scientists, and against women. This makes it not worth consideration by HEFCE.
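To make the bias concrete, here is a minimal sketch of how an H-index is computed: the largest h such that h papers each have at least h citations. The citation counts are invented purely for illustration and describe no real person. The point is that the index can never exceed the number of papers published, so a short career caps the score however good the papers are.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts, invented for illustration only.
early_career = [120, 45, 30]   # three well-cited papers
long_career = [12] * 12        # twelve modestly cited papers

print(h_index(early_career))   # 3: capped by the number of papers
print(h_index(long_career))    # 12
```

In this made-up comparison the person with three heavily cited papers scores a quarter of the person with twelve modest ones, which is the sense in which the index rewards length of career rather than quality.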
Given the role given to “impact” in the REF, the fact that altmetrics claim to measure impact might make them seem worthy of consideration at first sight. One problem is that the REF failed to make a clear distinction between impact on other scientists in the field and impact on the public.
Altmetrics measures an undefined mixture of both sorts of impact, with totally arbitrary weighting for tweets, Facebook mentions and so on. But the score seems to be related primarily to the trendiness of the title of the paper. Any paper about diet and health, however poor, is guaranteed to feature well on Twitter, as will any paper that has ‘penis’ in the title.
It’s very clear from the examples that I’ve looked at that few people who tweet about a paper have read more than the title. See Why you should ignore altmetrics and other bibliometric nightmares.
In most cases, papers were promoted by retweeting the press release or tweet from the journal itself. Only too often the press release is hyped-up. Metrics not only corrupt the behaviour of academics, but also the behaviour of journals. In the cases I’ve examined, reading the papers revealed that they were particularly poor (despite being in glamour journals): they just had trendy titles.
There could even be a negative correlation between the number of tweets and the quality of the work. Those who sell altmetrics have never examined this critical question because they ignore the contents of the papers. It would not be in their commercial interests to test their claims if the result was to show a negative correlation. Perhaps the reason why they have never tested their claims is the fear that to do so would reduce their income.
Furthermore you can buy 1000 retweets for $8.00 (http://followers-and-likes.com/twitter/buy-twitter-retweets/). That’s outright cheating of course, and not many people would go that far. But authors, and journals, can do a lot of self-promotion on twitter that is totally unrelated to the quality of the work.
It’s worth noting that much good engagement with the public now appears on blogs that are written by scientists themselves, but the 3.6 million views of my blog do not feature in altmetrics scores, never mind Scopus or Web of Science. Altmetrics don’t even measure public engagement very well, never mind academic merit.
Evidence that metrics measure quality
Any metric would be acceptable only if it measured the quality of a person’s work. How could that proposition be tested? In order to judge this, one would have to take a random sample of papers, and look at their metrics 10 or 20 years after publication. The scores would have to be compared with the consensus view of experts in the field. Even then one would have to be careful about the choice of experts (in fields like alternative medicine for example, it would be important to exclude people whose living depended on believing in it). I don’t believe that proper tests have ever been done (and it isn’t in the interests of those who sell metrics to do it).
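A minimal sketch of how such a test might be scored, once the hard part (reading the papers and getting honest expert ratings) has been done. It assumes each sampled paper has one metric score and one consensus expert rating, and it uses Spearman's rank correlation as the measure of agreement. That choice is an assumption made for illustration, not something specified above, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upwards; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either list is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(metric_scores, expert_ratings):
    """Spearman rank correlation between metric scores and expert ratings."""
    return pearson(average_ranks(metric_scores), average_ranks(expert_ratings))
```

A value near +1 would mean the metric orders papers much as the experts do, a value near 0 would mean it tells you nothing, and a negative value would mean it is actively misleading. None of this says anything about what the result would be; it only shows that the arithmetic is trivial compared with the work of collecting trustworthy ratings.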
The great mistake made by almost all bibliometricians is that they ignore what matters most, the contents of papers. They try to make inferences from correlations of metric scores with other, equally dubious, measures of merit. They can’t afford the time to do the right experiment if only because it would harm their own “productivity”.
The evidence that metrics do what’s claimed for them is almost non-existent. For example, in six of the ten years leading up to the 1991 Nobel prize, Bert Sakmann would have failed to meet the metrics-based publication target set by Imperial College London, and these failures included the years in which the original single channel paper was published and also the year, 1985, when he published a paper that was subsequently named as a classic in the field. In two of these ten years he had no publications whatsoever.
Application of metrics in the way that it’s been done at Imperial and also at Queen Mary College London, would result in firing of the most original minds.
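The Imperial target ("three papers per annum, at least one in a prestigious journal with an impact factor of at least five") is crude enough to be written as a two-condition rule. The sketch below does that; the function name and the example numbers are hypothetical and are not data for any real scientist.

```python
def meets_target(impact_factors, min_papers=3, min_impact_factor=5.0):
    """Check one year's output against the publication target.

    impact_factors: one journal impact factor per paper published that year.
    """
    enough_papers = len(impact_factors) >= min_papers
    one_prestigious = any(f >= min_impact_factor for f in impact_factors)
    return enough_papers and one_prestigious


# Hypothetical years, invented for illustration only.
print(meets_target([30.0]))           # False: one paper, however important
print(meets_target([]))               # False: a year spent on a hard problem
print(meets_target([5.0, 1.0, 1.0]))  # True
```

Nothing in the rule looks at what the papers say, so a year containing a single landmark paper fails and a year of three routine ones passes.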
Gaming and the public perception of science
Every form of metric alters behaviour, in such a way that it becomes useless for its stated purpose. This is already well-known in economics, where it’s known as Goodhart’s law (http://en.wikipedia.org/wiki/Goodhart’s_law): “When a measure becomes a target, it ceases to be a good measure”. That alone is a sufficient reason not to extend metrics to science. Metrics have already become one of several perverse incentives that control scientists’ behaviour. They have encouraged gaming, hype, guest authorships and, increasingly, outright fraud.
The general public has become aware of this behaviour and it is starting to do serious harm to perceptions of all science. As long ago as 1999, Haerlin & Parr wrote in Nature, under the title How to restore Public Trust in Science,
“Scientists are no longer perceived exclusively as guardians of objective truth, but also as smart promoters of their own interests in a media-driven marketplace.”
And on January 17, 2006, a vicious spoof on a Science paper appeared, not in a scientific journal, but in the New York Times. See http://www.dcscience.net/?p=156
The use of metrics would provide a direct incentive to this sort of behaviour. It would be a tragedy not only for people who are misjudged by crude numerical indices, but also a tragedy for the reputation of science as a whole.
There is no good evidence that any metric measures quality, at least over the short time span that’s needed for them to be useful for giving grants or deciding on promotions. On the other hand there is good evidence that use of metrics provides a strong incentive to bad behaviour, both by scientists and by journals. They have already started to damage the public perception of the honesty of science.
The conclusion is obvious. Metrics should not be used to judge academic performance.
What should be done?
If metrics aren’t used, how should assessment be done? Roderick Floud was president of Universities UK from 2001 to 2003. He is nothing if not an establishment person. He said recently:
“Each assessment costs somewhere between £20 million and £100 million, yet 75 per cent of the funding goes every time to the top 25 universities. Moreover, the share that each receives has hardly changed during the past 20 years.
It is an expensive charade. Far better to distribute all of the money through the research councils in a properly competitive system.”
The obvious danger of giving all the money to the Research Councils is that people might be fired solely because they didn’t have big enough grants. That’s serious: it has already happened at King’s College London, Queen Mary London and Imperial College. This problem might be ameliorated if there were a maximum on the size of grants and/or on the number of papers a person could publish, as I suggested at the open data debate. And it would help if universities appointed vice-chancellors with a better long-term view than most seem to have at the moment.
Aggregate metrics? It’s been suggested that the problems are smaller if one looks at aggregated metrics for a whole department, rather than the metrics for individual people. Clearly looking at departments would average out anomalies. The snag is that it wouldn’t circumvent Goodhart’s law. If the money depended on the aggregate score, it would still put great pressure on universities to recruit people with high citations, regardless of the quality of their work, just as it would if individuals were being assessed. That would weigh against thoughtful people (and not least women).
The best solution would be to abolish the REF and give the money to research councils, with precautions to prevent people being fired because their research wasn’t expensive enough. If politicians insist that the "expensive charade" is to be repeated, then I see no option but to continue with a system that’s similar to the present one: that would waste money and distract us from our job.
1. Seglen PO (1997) Why the impact factor of journals should not be used for evaluating research. British Medical Journal 314: 498-502.
2. Colquhoun D (2003) Challenging the tyranny of impact factors. Nature 423: 479.
3. Hawkes AG, Jalali A, Colquhoun D (1990) The distributions of the apparent open times and shut times in a single channel record when brief events can not be detected. Philosophical Transactions of the Royal Society London A 332: 511-538.
4. Hawkes AG, Jalali A, Colquhoun D (1992) Asymptotic distributions of apparent open times and shut times in a single channel record allowing for the omission of brief events. Philosophical Transactions of the Royal Society London B 337: 383-404.
5. Colquhoun D, Sigworth FJ (1995) Fitting and statistical analysis of single-channel records. In: Sakmann B, Neher E, editors. Single Channel Recording. New York: Plenum Press. pp. 483-587.
6. David Colquhoun on Google Scholar. Available: http://scholar.google.co.uk/citations?user=JXQ2kXoAAAAJ&hl=en 17-6-2014
7. Ioannidis JP (2005) Why most published research findings are false. PLoS Med 2: e124.
8. Colquhoun D, Plested AJ Why you should ignore altmetrics and other bibliometric nightmares. Available: http://www.dcscience.net/?p=6369
9. Neher E, Sakmann B (1976) Single channel currents recorded from membrane of denervated frog muscle fibres. Nature 260: 799-802.
10. Colquhoun D, Sakmann B (1985) Fast events in single-channel currents activated by acetylcholine and its analogues at the frog muscle end-plate. J Physiol (Lond) 369: 501-557.
11. Colquhoun D (2007) What have we learned from single ion channels? J Physiol 581: 425-427.
13. Oransky, I. Retraction Watch. Available: http://retractionwatch.com/ 18-6-2014
14. Haerlin B, Parr D (1999) How to restore public trust in science. Nature 400: 499. doi:10.1038/22867.
Some other posts on this topic
Using metrics to assess research quality By David Spiegelhalter “I am strongly against the suggestion that peer–review can in any way be replaced by bibliometrics”
1 July 2014
My brilliant statistical colleague, Alan Hawkes, did more than lay the foundations for single molecule analysis (and make a career for me). Before he got into that, he wrote a paper, Spectra of some self-exciting and mutually exciting point processes, (Biometrika 1971). In that paper he described a sort of stochastic process now known as a Hawkes process. In the simplest sort of stochastic process, the Poisson process, events are independent of each other. In a Hawkes process, the occurrence of an event affects the probability of another event occurring, so, for example, events may occur in clusters. Such processes were used for many years to describe the occurrence of earthquakes. More recently, it’s been noticed that such models are useful in finance, marketing, terrorism, burglary, social media, DNA analysis, and to describe invasive banana trees. The 1971 paper languished in relative obscurity for 30 years. Now the citation rate has shot through the roof.
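For readers who want to see what "self-exciting" means in practice, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method of the 1971 paper: the exponentially decaying excitation, the parameter names (mu, alpha, beta) and the thinning approach to simulation are all choices made for this example. Each event raises the event rate by alpha, and that boost decays at rate beta, so events tend to arrive in clusters.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, t_max, seed=0):
    """Simulate event times on [0, t_max) for a self-exciting process.

    Assumed (illustrative) intensity:
        rate(t) = mu + sum over past events t_i of alpha * exp(-beta * (t - t_i))
    mu    : background rate (must be > 0)
    alpha : jump in the rate caused by each event
    beta  : decay rate of that jump (alpha < beta keeps the process stable)
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # current value of the summed, decayed jumps

    while True:
        # Between events the rate can only decay, so the current rate
        # is a valid upper bound until the next event occurs.
        upper = mu + excitation
        wait = rng.expovariate(upper)
        t += wait
        if t >= t_max:
            break
        # Decay the excitation over the waiting time.
        excitation *= math.exp(-beta * wait)
        actual = mu + excitation
        # Thinning step: accept the candidate with probability actual / upper.
        if rng.random() * upper <= actual:
            events.append(t)
            excitation += alpha  # the new event excites future events

    return events


if __name__ == "__main__":
    times = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.0, t_max=100.0, seed=1)
    print(len(times), "events; first few:", [round(x, 2) for x in times[:5]])
```

Setting alpha to 0 removes the excitation, and the sketch reduces to an ordinary Poisson process with rate mu, in which events are independent of each other.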
The papers about Hawkes processes are mostly highly mathematical. They are not the sort of thing that features on Twitter. They are serious science, not just another ghastly epidemiological survey of diet and health. Anybody who cites papers of this sort is likely to be a real scientist. The surge in citations suggests to me that the 1971 paper was indeed an important bit of work (because the citations will be made by serious people). How does this affect my views about the use of citations? It shows that even highly mathematical work can achieve respectable citation rates, but it may take a long time before its importance is realised. If Hawkes had been judged by citation counting while he was applying for jobs and promotions, he’d probably have been fired. If his department had been judged by citations of this paper, it would not have scored well. It takes a long time to judge the importance of a paper and that makes citation counting almost useless for decisions about funding and promotion.
The letter is about the current buzzword, "research impact", a term that trips off the lips of every administrator and politician daily. Since much research is funded by the taxpayer, it seems reasonable to ask if it gives value for money. The best answer can be found in St Paul’s cathedral.
The plaque for Christopher Wren bears the epitaph
LECTOR, SI MONUMENTUM REQUIRIS, CIRCUMSPICE.
Reader, if you seek his memorial – look around you.
Much the same could be said for the impact of any science. Look at your refrigerator, your mobile phone, your computer, your central heating boiler, your house. Look at the X-ray machine and MRI machines in your hospital. Look at the aircraft that takes you on holiday. Look at your DVD player and laser surgery. Look, even, at the way you can turn a switch and light your room. Look at almost anything that you take for granted in your everyday life. They are all products of science; products, eventually, of the enlightenment.
BUT remember also that these wonderful products did not appear overnight. They evolved slowly over many decades or even centuries, and they evolved from work that, at the time, appeared to be mere idle curiosity. Electricity lies at the heart of everyday life. It took almost 200 years to get from Michael Faraday’s coils to your mobile phone. At the time, Faraday’s work seemed to politicians to be useless. Michael Faraday was made a fellow of the Royal Society in 1824.
. . . after Faraday was made a fellow of the Royal Society[,] the prime minister of the day asked what good this invention could be, and Faraday answered: “Why, Prime Minister, someday you can tax it.”
Whether this was really said is doubtful, but that hardly matters. It is the sort of remark made by politicians every day.
In May 2008, I read a review of “The Myths of Innovation” by Scott Berkun. The review seems to have vanished from the web, but I noted it in my diary. These words should be framed on the wall of every politician and administrator. Here are some quotations.
“One myth that will disappoint most businesses is the idea that innovation can be managed. Actually, Berkun calls this one ‘Your boss knows more about innovation than you’. After all, he says, many people get their best ideas while they’re wandering in their bathrobes, filled coffee mug in hand, from the kitchen to their home PC on a day off rather than sitting in a cubicle in a suit during working hours. But professional managers can’t help it: their job is to control every variable as much as possible, and that includes innovation.”
“Creation is sloppy; discovery is messy; exploration is dangerous. What’s a manager to do?
The answer in general is to encourage curiosity and accept failure. Lots of failure.”
I commented at the time "What a pity that university managers are so far behind those of modern businesses. They seem to be totally incapable of understanding these simple truths. That is what happens when power is removed from people who know about research and put into the hands of lawyers, HR people, MBAs and failed researchers."
That is even more true two years later. The people who actually do research have been progressively disempowered. We are run by men in dark suits who mistake meetings for work. You have only to look at history to see that great discoveries arise from the curiosity of creative people, and that, rarely, these ideas turn out to be of huge economic importance, many decades later.
The research impact plan has now been renamed "Pathways to Impact". It means that scientists are being asked to explain the economic impact of their research before they have even got any results.
All that shows is how science is being run by dimwits who simply don’t understand how science works. This amounts to nothing less than being compelled to lie if you want any research funding. And, worse still, the pressure to lie comes not primarily from government, but from that curious breed of ex-scientists, failed scientists and non-scientists who control the Research Councils.
How much did RCUK pay for the silly logo?
We are being run by people who would have told Michael Faraday to stop messing about with wires and coils and to do something really useful, like inventing better leather washers for steam pumps.
Welcome to the third division. Brought to you by Research Councils and politicians.
Here is the letter in The Times. It is worded slightly more diplomatically than my commentary, but will, no doubt, have just as little effect. What would the signatories know about science? Several of them don’t even wear black suits.
The governance of UK academic research today is delegated to a quangocracy comprising 11 funding and research councils, and to an additional body – Research Councils UK. Ill considered changes over the past few decades have transformed what was arguably the world’s most creative academic sector into one often described nowadays as merely competitive.
In their latest change, research councils introduce a new criterion for judging proposals – “Pathways to Impact” – against which individual researchers applying for funds must identify who might benefit from their proposed research and how they might benefit. Furthermore, the funding councils are planning to begin judging researchers’ departments in 2014 on the actual benefits achieved and to adjust their funding accordingly, thereby increasing pressure on researchers to deliver short-term benefits. However, we cannot understand why the quangocracy has ignored abundant evidence showing that the outcomes of high-quality research are impossible to predict.
We are mindful of the need to justify investment in academic research, but “Pathways to Impact” focuses on the predictable, leads to mediocrity, and reduces returns to the taxpayer. In our opinion as experienced researchers, few if any of the 20th century’s great discoveries and their huge economic stimuli could have happened if a policy of focussing on attractive short-term benefits had applied because great discoveries are always unpredicted. We therefore have an acutely serious problem.
Abolishing “Pathways to Impact” would not only save the expense of its burgeoning bureaucracy; it would also be a step towards liberating creativity and indicate that policy-makers have at last regained their capacity for world-class thinking.
Donald W Braben
John F Allen, Queen Mary, University of London;
Adam Curtis, Glasgow University;
Peter Lawrence FRS, University of Cambridge;
David Ray, BioAstral Limited;
Lewis Wolpert FRS, University College London;
Now cheer yourself up by reading Captain Cook’s Grant Application.
Scientists should sign the petition to help humanities too. See the Humanities and Social Sciences Matter web site.
Nobel view. 1. Andre Geim’s speech at Nobel banquet, 2010
"Human progress has always been driven by a sense of adventure and unconventional thinking. But amidst calls for “bread and circuses”, these virtues are often forgotten for the sake of cautiousness and political correctness that now rule the world. And we sink deeper and deeper from democracy into a state of mediocrity and even idiocracy. If you need an example, look no further than at research funding by the European Commission."
Nobel view. 2. Ahmed Zewail won the 1999 Nobel Prize in Chemistry. He serves on Barack Obama’s Council of Advisors on Science and Technology. He wrote in Nature
“Beware the urge to direct research too closely, says Nobel laureate Ahmed Zewail. History teaches us the value of free scientific inquisitiveness.”
“I have emphasized that without solid investment in science education and a fundamental science base, nations will not acquire the ground-breaking knowledge required to make discoveries and innovations that will shape their future.”
“Preserving knowledge is easy. Transferring knowledge is also easy. But making new knowledge is neither easy nor profitable in the short term. Fundamental research proves profitable in the long run, and, as importantly, it is a force that enriches the culture of any society with reason and basic truth.”
How many more people have to say this before the Research Councils take some notice?