LOB-vs
Download Lectures on Biostatistics (1971).
Corrected and searchable version of Google books edition

Download review of Lectures on Biostatistics (THES, 1973).

Latest Tweets
Categories
Archives

RCT

‘We know little about the effect of diet on health. That’s why so much is written about it’. That is the title of a post in which I advocate the view put by John Ioannidis that remarkably little is known about the health effects if individual nutrients. That ignorance has given rise to a vast industry selling advice that has little evidence to support it.

The 2016 Conference of the so-called "College of Medicine" had the title "Food, the Forgotten Medicine". This post gives some background information about some of the speakers at this event. I’m sorry it appears to be too ad hominem, but the only way to judge the meeting is via the track record of the speakers.

com0

com1

Quite a lot has been written here about the "College of Medicine". It is the direct successor of the Prince of Wales’ late, unlamented, Foundation for Integrated Health. But unlike the latter, its name is disguises its promotion of quackery. Originally it was going to be called the “College of Integrated Health”, but that wasn’t sufficently deceptive so the name was dropped.

For the history of the organisation, see

The new “College of Medicine” arising from the ashes of the Prince’s Foundation for Integrated Health

Don’t be deceived. The new “College of Medicine” is a fraud and delusion

The College of Medicine is in the pocket of Crapita Capita. Is Graeme Catto selling out?

The conference programme (download pdf) is a masterpiece of bait and switch. It is a mixture of very respectable people, and outright quacks. The former are invited to give legitimacy to the latter. The names may not be familiar to those who don’t follow the antics of the magic medicine community, so here is a bit of information about some of them.

The introduction to the meeting was by Michael Dixon and Catherine Zollman, both veterans of the Prince of Wales Foundation, and both devoted enthusiasts for magic medicne. Zollman even believes in the battiest of all forms of magic medicine, homeopathy (download pdf), for which she totally misrepresents the evidence. Zollman works now at the Penny Brohn centre in Bristol. She’s also linked to the "Portland Centre for integrative medicine" which is run by Elizabeth Thompson, another advocate of homeopathy. It came into being after NHS Bristol shut down the Bristol Homeopathic Hospital, on the very good grounds that it doesn’t work.

Now, like most magic medicine it is privatised. The Penny Brohn shop will sell you a wide range of expensive and useless "supplements". For example, Biocare Antioxidant capsules at £37 for 90. Biocare make several unjustified claims for their benefits. Among other unnecessary ingredients, they contain a very small amount of green tea. That’s a favourite of "health food addicts", and it was the subject of a recent paper that contains one of the daftest statistical solecisms I’ve ever encountered

"To protect against type II errors, no corrections were applied for multiple comparisons".

If you don’t understand that, try this paper.
The results are almost certainly false positives, despite the fact that it appeared in Lancet Neurology. It’s yet another example of broken peer review.

It’s been know for decades now that “antioxidant” is no more than a marketing term, There is no evidence of benefit and large doses can be harmful. This obviously doesn’t worry the College of Medicine.

Margaret Rayman was the next speaker. She’s a real nutritionist. Mixing the real with the crackpots is a standard bait and switch tactic.

Eleni Tsiompanou, came next. She runs yet another private "wellness" clinic, which makes all the usual exaggerated claims. She seems to have an obsession with Hippocrates (hint: medicine has moved on since then). Dr Eleni’s Joy Biscuits may or may not taste good, but their health-giving properties are make-believe.

Andrew Weil, from the University of Arizona
gave the keynote address. He’s described as "one of the world’s leading authorities on Nutrition and Health". That description alone is sufficient to show the fantasy land in which the College of Medicine exists. He’s a typical supplement salesman, presumably very rich. There is no excuse for not knowing about him. It was 1988 when Arnold Relman (who was editor of the New England Journal of Medicine) wrote A Trip to Stonesville: Some Notes on Andrew Weil, M.D..

“Like so many of the other gurus of alternative medicine, Weil is not bothered by logical contradictions in his argument, or encumbered by a need to search for objective evidence.”

This blog has mentioned his more recent activities, many times.

Alex Richardson, of Oxford Food and Behaviour Research (a charity, not part of the university) is an enthusiast for omega-3, a favourite of the supplement industry, She has published several papers that show little evidence of effectiveness. That looks entirely honest. On the other hand, their News section contains many links to the notorious supplement industry lobby site, Nutraingredients, one of the least reliable sources of information on the web (I get their newsletter, a constant source of hilarity and raised eyebrows). I find this worrying for someone who claims to be evidence-based. I’m told that her charity is funded largely by the supplement industry (though I can’t find any mention of that on the web site).

Stephen Devries was a new name to me. You can infer what he’s like from the fact that he has been endorsed byt Andrew Weil, and that his address is "Institute for Integrative Cardiology" ("Integrative" is the latest euphemism for quackery). Never trust any talk with a title that contains "The truth about". His was called "The scientific truth about fats and sugars," In a video, he claims that diet has been shown to reduce heart disease by 70%. which gives you a good idea of his ability to assess evidence. But the claim doubtless helps to sell his books.

Prof Tim Spector, of Kings College London, was next. As far as I know he’s a perfectly respectable scientist, albeit one with books to sell, But his talk is now online, and it was a bit like a born-again microbiome enthusiast. He seemed to be too impressed by the PREDIMED study, despite it’s statistical unsoundness, which was pointed out by Ioannidis. Little evidence was presented, though at least he was more sensible than the audience about the uselessness of multivitamin tablets.

Simon Mills talked on “Herbs and spices. Using Mother Nature’s pharmacy to maintain health and cure illness”. He’s a herbalist who has featured here many times. I can recommend especially his video about Hot and Cold herbs as a superb example of fantasy science.

Annie Anderson, is Professor of Public Health Nutrition and
Founder of the Scottish Cancer Prevention Network. She’s a respectable nutritionist and public health person, albeit with their customary disregard of problems of causality.

Patrick Holden is chair of the Sustainable Food Trust. He promotes "organic farming". Much though I dislike the cruelty of factory farms, the "organic" industry is largely a way of making food more expensive with no health benefits.

The Michael Pittilo 2016 Student Essay Prize was awarded after lunch. Pittilo has featured frequently on this blog as a result of his execrable promotion of quackery -see, in particular, A very bad report: gamma minus for the vice-chancellor.

Nutritional advice for patients with cancer. This discussion involved three people.
Professor Robert Thomas, Consultant Oncologist, Addenbrookes and Bedford Hospitals, Dr Clare Shaw, Consultant Dietitian, Royal Marsden Hospital and Dr Catherine Zollman, GP and Clinical Lead, Penny Brohn UK.

Robert Thomas came to my attention when I noticed that he, as a regular cancer consultant had spoken at a meeting of the quack charity, “YestoLife”. When I saw he was scheduled tp speak at another quack conference. After I’d written to him to point out the track records of some of the people at the meeting, he withdrew from one of them. See The exploitation of cancer patients is wicked. Carrot juice for lunch, then die destitute. The influence seems to have been temporary though. He continues to lend respectability to many dodgy meetings. He edits the Cancernet web site. This site lends credence to bizarre treatments like homeopathy and crystal healing. It used to sell hair mineral analysis, a well-known phony diagnostic method the main purpose of which is to sell you expensive “supplements”. They still sell the “Cancer Risk Nutritional Profile”. for £295.00, despite the fact that it provides no proven benefits.

Robert Thomas designed a food "supplement", Pomi-T: capsules that contain Pomegranate, Green tea, Broccoli and Curcumin. Oddly, he seems still to subscribe to the antioxidant myth. Even the supplement industry admits that that’s a lost cause, but that doesn’t stop its use in marketing. The one randomised trial of these pills for prostate cancer was inconclusive. Prostate Cancer UK says "We would not encourage any man with prostate cancer to start taking Pomi-T food supplements on the basis of this research". Nevertheless it’s promoted on Cancernet.co.uk and widely sold. The Pomi-T site boasts about the (inconclusive) trial, but says "Pomi-T® is not a medicinal product".

There was a cookery demonstration by Dale Pinnock "The medicinal chef" The programme does not tell us whether he made is signature dish "the Famous Flu Fighting Soup". Needless to say, there isn’t the slightest reason to believe that his soup has the slightest effect on flu.

In summary, the whole meeting was devoted to exaggerating vastly the effect of particular foods. It also acted as advertising for people with something to sell. Much of it was outright quackery, with a leavening of more respectable people, a standard part of the bait-and-switch methods used by all quacks in their attempts to make themselves sound respectable. I find it impossible to tell how much the participants actually believe what they say, and how much it’s a simple commercial drive.

The thing that really worries me is why someone like Phil Hammond supports this sort of thing by chairing their meetings (as he did for the "College of Medicine’s" direct predecessor, the Prince’s Foundation for Integrated Health. His defence of the NHS has made him something of a hero to me. He assured me that he’d asked people to stick to evidence. In that he clearly failed. I guess they must pay well.

Follow-up

Jump to follow-up

“Statistical regression to the mean predicts that patients selected for abnormalcy will, on the average, tend to improve. We argue that most improvements attributed to the placebo effect are actually instances of statistical regression.”

“Thus, we urge caution in interpreting patient improvements as causal effects of our actions and should avoid the conceit of assuming that our personal presence has strong healing powers.”

McDonald et al., (1983)

In 1955, Henry Beecher published "The Powerful Placebo". I was in my second undergraduate year when it appeared. And for many decades after that I took it literally, They looked at 15 studies and found that an average 35% of them got "satisfactory relief" when given a placebo. This number got embedded in pharmacological folk-lore. He also mentioned that the relief provided by placebo was greatest in patients who were most ill.

Consider the common experiment in which a new treatment is compared with a placebo, in a double-blind randomised controlled trial (RCT). It’s common to call the responses measured in the placebo group the placebo response. But that is very misleading, and here’s why.

The responses seen in the group of patients that are treated with placebo arise from two quite different processes. One is the genuine psychosomatic placebo effect. This effect gives genuine (though small) benefit to the patient. The other contribution comes from the get-better-anyway effect. This is a statistical artefact and it provides no benefit whatsoever to patients. There is now increasing evidence that the latter effect is much bigger than the former.

How can you distinguish between real placebo effects and get-better-anyway effect?

The only way to measure the size of genuine placebo effects is to compare in an RCT the effect of a dummy treatment with the effect of no treatment at all. Most trials don’t have a no-treatment arm, but enough do that estimates can be made. For example, a Cochrane review by Hróbjartsson & Gøtzsche (2010) looked at a wide variety of clinical conditions. Their conclusion was:

“We did not find that placebo interventions have important clinical effects in general. However, in certain settings placebo interventions can influence patient-reported outcomes, especially pain and nausea, though it is difficult to distinguish patient-reported effects of placebo from biased reporting.”

In some cases, the placebo effect is barely there at all. In a non-blind comparison of acupuncture and no acupuncture, the responses were essentially indistinguishable (despite what the authors and the journal said). See "Acupuncturists show that acupuncture doesn’t work, but conclude the opposite"

So the placebo effect, though a real phenomenon, seems to be quite small. In most cases it is so small that it would be barely perceptible to most patients. Most of the reason why so many people think that medicines work when they don’t isn’t a result of the placebo response, but it’s the result of a statistical artefact.

Regression to the mean is a potent source of deception

The get-better-anyway effect has a technical name, regression to the mean. It has been understood since Francis Galton described it in 1886 (see Senn, 2011 for the history). It is a statistical phenomenon, and it can be treated mathematically (see references, below). But when you think about it, it’s simply common sense.

You tend to go for treatment when your condition is bad, and when you are at your worst, then a bit later you’re likely to be better, The great biologist, Peter Medawar comments thus.

"If a person is (a) poorly, (b) receives treatment intended to make him better, and (c) gets better, then no power of reasoning known to medical science can convince him that it may not have been the treatment that restored his health"
(Medawar, P.B. (1969:19). The Art of the Soluble: Creativity and originality in science. Penguin Books: Harmondsworth).

This is illustrated beautifully by measurements made by McGorry et al., (2001). Patients with low back pain recorded their pain (on a 10 point scale) every day for 5 months (they were allowed to take analgesics ad lib).

The results for four patients are shown in their Figure 2. On average they stay fairly constant over five months, but they fluctuate enormously, with different patterns for each patient. Painful episodes that last for 2 to 9 days are interspersed with periods of lower pain or none at all. It is very obvious that if these patients had gone for treatment at the peak of their pain, then a while later they would feel better, even if they were not actually treated. And if they had been treated, the treatment would have been declared a success, despite the fact that the patient derived no benefit whatsoever from it. This entirely artefactual benefit would be the biggest for the patients that fluctuate the most (e.g this in panels a and d of the Figure).

fig2
Figure 2 from McGorry et al, 2000. Examples of daily pain scores over a 6-month period for four participants. Note: Dashes of different lengths at the top of a figure designate an episode and its duration.

The effect is illustrated well by an analysis of 118 trials of treatments for non-specific low back pain (NSLBP), by Artus et al., (2010). The time course of pain (rated on a 100 point visual analogue pain scale) is shown in their Figure 2. There is a modest improvement in pain over a few weeks, but this happens regardless of what treatment is given, including no treatment whatsoever.

artus2

FIG. 2 Overall responses (VAS for pain) up to 52-week follow-up in each treatment arm of included trials. Each line represents a response line within each trial arm. Red: index treatment arm; Blue: active treatment arm; Green: usual care/waiting list/placebo arms. ____: pharmacological treatment; – – – -: non-pharmacological treatment; . . .. . .: mixed/other. 

The authors comment

"symptoms seem to improve in a similar pattern in clinical trials following a wide variety of active as well as inactive treatments.", and "The common pattern of responses could, for a large part, be explained by the natural history of NSLBP".

In other words, none of the treatments work.

This paper was brought to my attention through the blog run by the excellent physiotherapist, Neil O’Connell. He comments

"If this finding is supported by future studies it might suggest that we can’t even claim victory through the non-specific effects of our interventions such as care, attention and placebo. People enrolled in trials for back pain may improve whatever you do. This is probably explained by the fact that patients enrol in a trial when their pain is at its worst which raises the murky spectre of regression to the mean and the beautiful phenomenon of natural recovery."

O’Connell has discussed the matter in recent paper, O’Connell (2015), from the point of view of manipulative therapies. That’s an area where there has been resistance to doing proper RCTs, with many people saying that it’s better to look at “real world” outcomes. This usually means that you look at how a patient changes after treatment. The hazards of this procedure are obvious from Artus et al.,Fig 2, above. It maximises the risk of being deceived by regression to the mean. As O’Connell commented

"Within-patient change in outcome might tell us how much an individual’s condition improved, but it does not tell us how much of this improvement was due to treatment."

In order to eliminate this effect it’s essential to do a proper RCT with control and treatment groups tested in parallel. When that’s done the control group shows the same regression to the mean as the treatment group. and any additional response in the latter can confidently attributed to the treatment. Anything short of that is whistling in the wind.

Needless to say, the suboptimal methods are most popular in areas where real effectiveness is small or non-existent. This, sad to say, includes low back pain. It also includes just about every treatment that comes under the heading of alternative medicine. Although these problems have been understood for over a century, it remains true that

"It is difficult to get a man to understand something, when his salary depends upon his not understanding it."
Upton Sinclair (1935)

Responders and non-responders?

One excuse that’s commonly used when a treatment shows only a small effect in proper RCTs is to assert that the treatment actually has a good effect, but only in a subgroup of patients ("responders") while others don’t respond at all ("non-responders"). For example, this argument is often used in studies of anti-depressants and of manipulative therapies. And it’s universal in alternative medicine.

There’s a striking similarity between the narrative used by homeopaths and those who are struggling to treat depression. The pill may not work for many weeks. If the first sort of pill doesn’t work try another sort. You may get worse before you get better. One is reminded, inexorably, of Voltaire’s aphorism "The art of medicine consists in amusing the patient while nature cures the disease".

There is only a handful of cases in which a clear distinction can be made between responders and non-responders. Most often what’s observed is a smear of different responses to the same treatment -and the greater the variability, the greater is the chance of being deceived by regression to the mean.

For example, Thase et al., (2011) looked at responses to escitalopram, an SSRI antidepressant. They attempted to divide patients into responders and non-responders. An example (Fig 1a in their paper) is shown.

Thase fig 1a

The evidence for such a bimodal distribution is certainly very far from obvious. The observations are just smeared out. Nonetheless, the authors conclude

"Our findings indicate that what appears to be a modest effect in the grouped data – on the boundary of clinical significance, as suggested above – is actually a very large effect for a subset of patients who benefited more from escitalopram than from placebo treatment. "

I guess that interpretation could be right, but it seems more likely to be a marketing tool. Before you read the paper, check the authors’ conflicts of interest.

The bottom line is that analyses that divide patients into responders and non-responders are reliable only if that can be done before the trial starts. Retrospective analyses are unreliable and unconvincing.

Some more reading

Senn, 2011 provides an excellent introduction (and some interesting history). The subtitle is

"Here Stephen Senn examines one of Galton’s most important statistical legacies – one that is at once so trivial that it is blindingly obvious, and so deep that many scientists spend their whole career being fooled by it."

The examples in this paper are extended in Senn (2009), “Three things that every medical writer should know about statistics”. The three things are regression to the mean, the error of the transposed conditional and individual response.

You can read slightly more technical accounts of regression to the mean in McDonald & Mazzuca (1983) "How much of the placebo effect is statistical regression" (two quotations from this paper opened this post), and in Stephen Senn (2015) "Mastering variation: variance components and personalised medicine". In 1988 Senn published some corrections to the maths in McDonald (1983).

The trials that were used by Hróbjartsson & Gøtzsche (2010) to investigate the comparison between placebo and no treatment were looked at again by Howick et al., (2013), who found that in many of them the difference between treatment and placebo was also small. Most of the treatments did not work very well.

Regression to the mean is not just a medical deceiver: it’s everywhere

Although this post has concentrated on deception in medicine, it’s worth noting that the phenomenon of regression to the mean can cause wrong inferences in almost any area where you look at change from baseline. A classical example concern concerns the effectiveness of speed cameras. They tend to be installed after a spate of accidents, and if the accident rate is particularly high in one year it is likely to be lower the next year, regardless of whether a camera had been installed or not. To find the true reduction in accidents caused by installation of speed cameras, you would need to choose several similar sites and allocate them at random to have a camera or no camera. As in clinical trials. looking at the change from baseline can be very deceptive.

Statistical postscript

Lastly, remember that it you avoid all of these hazards of interpretation, and your test of significance gives P = 0.047. that does not mean you have discovered something. There is still a risk of at least 30% that your ‘positive’ result is a false positive. This is explained in Colquhoun (2014),"An investigation of the false discovery rate and the misinterpretation of p-values". I’ve suggested that one way to solve this problem is to use different words to describe P values: something like this.

P > 0.05 very weak evidence
P = 0.05 weak evidence: worth another look
P = 0.01 moderate evidence for a real effect
P = 0.001 strong evidence for real effect

But notice that if your hypothesis is implausible, even these criteria are too weak. For example, if the treatment and placebo are identical (as would be the case if the treatment were a homeopathic pill) then it follows that 100% of positive tests are false positives.

Follow-up

12 December 2015

It’s worth mentioning that the question of responders versus non-responders is closely-related to the classical topic of bioassays that use quantal responses. In that field it was assumed that each participant had an individual effective dose (IED). That’s reasonable for the old-fashioned LD50 toxicity test: every animal will die after a sufficiently big dose. It’s less obviously right for ED50 (effective dose in 50% of individuals). The distribution of IEDs is critical, but it has very rarely been determined. The cumulative form of this distribution is what determines the shape of the dose-response curve for fraction of responders as a function of dose. Linearisation of this curve, by means of the probit transformation used to be a staple of biological assay. This topic is discussed in Chapter 10 of Lectures on Biostatistics. And you can read some of the history on my blog about Some pharmacological history: an exam from 1959.

Jump to follow-up

One of my scientific heroes is Bernard Katz. The closing words of his inaugural lecture, as professor of biophysics at UCL, hang on the wall of my office as a salutory reminder to refrain from talking about ‘how the brain works’. After speaking about his discoveries about synaptic transmission, he ended thus.

"My time is up and very glad I am, because I have been leading myself right up to a domain on which I should not dare to trespass, not even in an Inaugural Lecture. This domain contains the awkward problems of mind and matter about which so much has been talked and so little can be said, and having told you of my pedestrian disposition, I hope you will give me leave to stop at this point and not to hazard any further guesses."

BK
Drawing ©Jenny Hersson-Ringskog

The question of what to eat for good health is truly a topic about "which so much has been talked and so little can be said"

That was emphasized yet again by an editorial in the British Medical Journal written by my favourite epidemiologist. John Ioannidis. He has been at the forefront of debunking hype. Its title is “Implausible results in human nutrition research” (BMJ, 2013;347:f6698.
Get pdf
).

The gist is given by the memorable statement

"Almost every single nutrient imaginable has peer reviewed publications associating it with almost any outcome."

and the subtitle

Definitive solutions won’t come from another million observational papers or small randomized trials“.

Being a bit obsessive about causality, this paper is music to my ears. The problem of causality was understood perfectly by Samuel Johnson, in 1756, and he was a lexicographer, not a scientist. Yet it’s widely ignored by epidemiologists.

sj

The problem of causality is often mentioned in the introduction to papers that describe survey data, yet by the end of the paper, it’s usually forgotten, and public health advice is issued.

Ioannidis’ editorial vindicates my own views, as an amateur epidemiologist, on the results of the endless surveys of diet and health.

There is nothing new about the problem. It’s been written about many times. Young & Karr (Significance, 8, 116 – 120, 2011: get pdf) said "Any claim coming from an observational study is most likely to be wrong". Out of 52 claims made in 12 observational studies, not a single one was confirmed when tested by randomised controlled trials.

Another article cited by Ioannidis, "Myths, Presumptions, and Facts about Obesity" (Casazza et al , NEJM, 2013), debunks many myths, but the list of conflicts of interests declared by the authors is truly horrendous (and at least one of their conclusions has been challenged, albeit by people with funding from Kellogg’s). The frequent conflicts of interest in nutrition research make a bad situation even worse.

The quotation in bold type continues thus.

"On 25 October 2013, PubMed listed 291 papers with the keywords “coffee OR caffeine” and 741 with “soy,” many of which referred to associations. In this literature of epidemic proportions, how many results are correct? Many findings are entirely implausible. Relative risks that suggest we can halve the burden of cancer with just a couple of servings a day of a single nutrient still circulate widely in peer reviewed journals.

However, on the basis of dozens of randomized trials, single nutrients are unlikely to have relative risks less than 0.90 for major clinical outcomes when extreme tertiles of population intake are compared—most are greater than 0.95. For overall mortality, relative risks are typically greater than 0.995, if not entirely null. The respective absolute risk differences would be trivial. Observational studies and even randomized trials of single nutrients seem hopeless, with rare exceptions. Even minimal confounding or other biases create noise that exceeds any genuine effect. Big datasets just confer spurious precision status to noise."

And, later,

"According to the latest burden of disease study, 26% of deaths and 14% of disability adjusted life years in the United States are attributed to dietary risk factors, even without counting the impact of obesity. No other risk factor comes anywhere close to diet in these calculations (not even tobacco and physical inactivity). I suspect this is yet another implausible result. It builds on risk estimates from the same data of largely implausible nutritional studies discussed above. Moreover, socioeconomic factors are not considered at all, although they may be at the root of health problems. Poor diet may partly be a correlate or one of several paths through which social factors operate on health."

Another field that is notorious for producing false positives, wirh false attribution of causality, is the detection of biomarkers. A critical discussion can be found in the paper by Broadhurst & Kell (2006), "False discoveries in metabolomics and related experiments".

"Since the early days of transcriptome analysis (Golub et al., 1999), many workers have looked to detect different gene expression in cancerous versus normal tissues. Partly because of the expense of transcriptomics (and the inherent noise in such data (Schena, 2000; Tu et al., 2002; Cui and Churchill, 2003; Liang and Kelemen, 2006)), the numbers of samples and their replicates is often small while the number of candidate genes is typically in the thousands. Given the above, there is clearly a great danger that most of these will not in practice withstand scrutiny on deeper analysis (despite the ease with which one can create beautiful heat maps and any number of ‘just-so’ stories to explain the biological relevance of anything that is found in preliminary studies!). This turns out to be the case, and we review a recent analysis (Ein-Dor et al., 2006) of a variety of such studies."

The fields of metabolomics, proteomics and transcriptomics are plagued by statistical problems (as well as being saddled with ghastly pretentious names).

What’s to be done?

Barker Bausell, in his demolition of research on acupuncture, said:

[Page39] “But why should nonscientists care one iota about something as esoteric as causal inference? I believe that the answer to this question is because the making of causal inferences is part of our job description as Homo Sapiens.”

The problem, of course, is that humans are very good at attributing causality when it does not exist. That has led to confusion between correlation and cause on an industrial scale, not least in attempts to work out the effects of diet on health.

More than in any other field it is hard to do the RCTs that could, in principle, sort out the problem. It’s hard to allocate people at random to different diets, and even harder to make people stick to those diets for the many years that are needed.

We can probably say by now that no individual food carries a large risk, or affords very much protection. The fact that we are looking for quite small effects means that even when RCTs are possible huge samples will be needed to get clear answers. Most RCTs are too short, and too small (under-powered) and that leads to overestimation of the size of effects.

That’s a problem that plagues experimental pyschology too, and has led to a much-discussed crisis in reproducibility.

"Supplements" of one sort and another are ubiquitous in sports. Nobody knows whether they work, and the margin between winning and losing is so tiny that it’s very doubtful whether we ever will know. We can expect irresponsible claims to continue unabated.

The best thing that can be done in the short term is to stop doing large observational studies altogether. It’s now clear that inferences made from them are likely to be wrong. And, sad to say, we need to view with great skepticism anything that is funded by the food industry. And make a start on large RCTs whenever that is possible. Perhaps the hardest goal of all is to end the "publish or perish" culture which does so much to prevent the sort of long term experiments which would give the information we want.

Ioannidis’ article ends with the statement

"I am co-investigator in a randomized trial of a low carbohydrate versus low fat diet that is funded by the US National Institutes of Health and the non-profit Nutrition Science Initiative."

It seems he is putting his money where his mouth is.

Until we have the results, we shall continue to be bombarded with conflicting claims made by people who are doing their best with flawed methods, as well as by those trying to sell fad diets. Don’t believe them. The famous "5-a-day" advice that we are constantly bombarded with does no harm, but it has no sound basis.

As far as I can guess, the only sound advice about healthy eating for most people is

  • don’t eat too much
  • don’t eat all the same thing

You can’t make much money out of that advice.

No doubt that is why you don’t hear it very often.

Follow-up

Two relevant papers that show the unreliability of observational studies,

"Nearly 80,000 observational studies were published in the decade 1990–2000 (Naik 2012). In the following decade, the number of studies grew to more than 260,000". Madigan et al. (2014)

“. . . the majority of observational studies would declare statistical significance when no effect is present” Schuemie et al., (2012)

20 March 2014

On 20 March 2014, I gave a talk on this topic at the Cambridge Science Festival (more here). After the event my host, Yvonne Noblis, sent me some (doubtless cherry-picked) feedback she’d had about the talk.

tweets

Jump to follow-up

This is a very important book.

Buy it now (that link is to Waterstone’s Amazon don’t pay tax in the UK, so don’t use them).

When you’ve read it, do something about it. The book has lots of suggestions about what to do.

bg gif

Stolen from badscience.net

 

Peter Medawar, the eminent biologist, in his classic book Advice to a Young Scientist, said this.

 “Exaggerated claims for the efficacy of a medicament are very seldom the consequence of any intention to deceive; they are usually the outcome of a kindly conspiracy in which everybody has the very best intentions. The patient wants to get well, his physician wants to have made him better, and the pharmaceutical company would have liked to have put it into the physician’s power to have made him so. The controlled clinical trial is an attempt to avoid being taken in by this conspiracy of good will.”

There was a lot of truth in that 1979, towards the end of the heyday of small molecule pharmacology.  Since then, one can argue, things have gone downhill.

First, though, think of life without general anaesthetics, local anaesthetics, antibiotics, anticoagulants and many others.  They work well and have done incalculable good.  And they were developed by the drug industry.

But remember also that remarkably little is known about medicine.  There are huge areas in which neither causes nor cures are known.  Treatments for chronic pain, back problems, many sorts of cancer and almost all mental problems are a mess.  It just isn’t known what to do.  Nobody is to blame for this.  Serious medical research has been going on for little more than 60 years, and it turns out to be very complicated.  We are doing our best, but are still ignorant about whole huge areas. That leads to a temptation to make things up. Clutching at straws is very evident when it comes to depression, pain and Alzheimer’s disease, among others.

In order to improve matters, one essential is to do fair tests on treatments that we have.  Ben Goldacre’s book is a superb account of how this could be done, and how the process of testing has been subverted for commercial gain and to satisfy the vanities of academics.

Of course there is nothing new in criticisms of Big Pharma.  The huge fines levied on them for false advertising are well known.  The difference is that Goldacre’s book explains clearly what’s gone wrong in great detail, documents it thoroughly, and makes concrete suggestions for improving matters.

Big Pharma has undoubtedly sometimes behaved appallingly in recent years. Someone should be in jail for crimes against patients.  They have behaved in much the same way that bankers have. In any huge globalised industry it is always possible to blame someone in another department for the dishonesty.  But they aren’t the only people to blame.  None of the problems could have arisen with the complicity of academics, universities, and a plethora of regulatory agencies and professional bodies.

The biggest scandal of all is missing data (chapter 1).  Companies, and sometmes academics, have suppressed of trials that don’t favour the drugs that they are trying to sell.  The antidepressant drug, reboxetine, appeared at first to be good. It had been approved by the Medicines and Healthcare products Regulatory Agency (MHRA) and there was at least one good randomized placebo-controlled trial (RCT) showing it worked.  But it didn’t.  The manufacturer didn’t provide a complete list of unpublished trials when asked for them.  After much work it was found in 2010 that, as well as the published, favourable trial, there were six more trials which had not been published and all six showed reboxetine to be no better than placebo .  In comparisons with other antidepressant drugs three small studies (507 patients) showed reboxetine to be as good as its competitors.  These were published. But it came to light that data on 1657 patients had never been published and these showed reboxetine to be worse than its rivals.

When all the data for the SSRI antidepressants were unearthed (Kirsch et al., 2008) it turned out that they were no better than placebo for mild or moderate depression. This selective suppression of negative data has happened time and time again. It harms patients and deceives doctors, but, incredibly, it’s not illegal.

Disgracefully, Kirsch et al. had to use a Freedom of Information Act request to get the data from the FDA.

“The output of a regulator is often simply a crude, brief summary: almost a ‘yes’ or ‘no’ about side effects. This is the opposite of science, which is only reliable because everyone shows their working, explains how they know that something is effective or safe, shares their methods and their results, and allows others to decide if they agree with the way they processed and analysed the data.”

 

“the NICE document discussing whether it’s a good idea to have Lucentis, an extremely expensive drug, costing well over £ 1,000 per treatment, that is injected into the eye for a condition called acute macular degeneration. As you can see, the NICE document on whether this treatment is a good idea is censored. Not only is the data on the effectiveness of the treatment blanked out by thick black rectangles, in case any doctor or patient should see it, but absurdly, even the names of some trials are missing, preventing the reader from even knowing of their existence, or cross referencing information about them.Most disturbing of all, as you can see in the last bullet point, the data on adverse events is also censored.”

nice

 

The book lists all the tricks that are used by both industry and academics. Here are some of them.

  • Regulatory agencies like the MHRA, the European Medicines Agency (EMA) and the US Food and Drugs Administration (FDA) set a low bar for approval of drugs.
  •  Companies make universities sign gagging agreements which allow unfavourable results to be suppressed, and their existence hidden.
  • Accelerated approval schemes are abused to get quick approval of ineffective drugs and the promised proper tests often don’t materialise
  • Disgracefully, even when all the results have been given to the regulatory agencies (which isn’t always). The MHRA, EMA and FDA don’t make them public. We are expected to take their word.
  • Although all clinical trials are meant to be registered before they start, the EMA register, unbelievably, is not public.  Furthermore there is no check that the results if trials ever get published.  Despite mandates that results must be published within a year of finishing the trial, many aren’t.  Journals promise to check this sort of thing, but they don’t.
  • When the results are published, it is not uncommon for the primary outcome, specified before it started, to have been changed to one that looks like a more favourable result.  Journals are meant to check, but mostly don’t.
  • Companies use scientific conferences, phony journals, make-believe “seed trials” and “continuing medical education” for surreptitious advertising.
  • Companies invent new diseases, plant papers to make you think you’re abnormal, and try to sell you a “cure”.  For example, female sexual dysfunction , restless legs syndrome and social anxiety disorder (i.e. shyness).  This is called disease-mongering, medicalisation or over-diagnosis. It’s bad.
  • Spin is rife. Companies, and authors, want to talk up their results. University PR departments want to exaggerate benefits. Journal editors want sensational papers. Read the results, not the summary. This is universal (but particularly bad in alternative medicine).
  • Companies fund patient groups to lobby for pills even when the pills are known to be ineffective.  The lobby that demanded that Herceptin should be available to all on the breast cancer patients on the NHS was organised by a PR company working for the manufacturer, Roche.  But Herceptin doesn’t work at all in 80% of patients and gives you at best a few extra months of  life in advanced cases.
  • Ghostwriting of papers is serious corruption.  A company writes the paper and senior academics appear as the authors, though they may never have seen the original data.  Even in cases where academics have admitted to lying about whether they have seen the data, they go unpunished by their universities. See for example, the case of Professor Eastell.
  • By encouraging the funding of “continuing medical education” by companies, the great and the good of academic medicine have let us down badly.

This last point is where the book ends, and it’s worth amplification.

“So what have the great and good of British medicine done to help patients, in the face of this endemic corruption, and these systematic flaws? In 2012, a collaborative document was produced by senior figures in medicine from across the board, called ‘Guidance on Collaboration Between Healthcare Professionals and the Pharmaceutical Industry’. This document was jointly approved by the ABPI, the Department of Health, the Royal Colleges of Physicians, Nursing, Psychiatrists, GPs, the Lancet, the British Medical Association, the NHS Confederation, and so on. ”

“It contains no recognition of the serious problems we have seen in this book. In fact, quite the opposite: it makes a series of assertions about them that are factually incorrect.”

“It states that drug reps ‘can be a useful resource for healthcare professionals’. Again, I’m not sure why the Royal Colleges, the BMA, the Department of Health and the NHS Confederation felt the need to reassert this to the doctors of the UK, on behalf of industry, when the evidence shows that drug reps actively distort prescribing practices. But that is the battle you face, trying to get these issues taken seriously by the pinnacle of the medical establishment.”

This is perhaps the most shameful betrayal of all.  The organisations that should protect patients have sold them out.

You may have been sold out by your “elders and betters”, but you can do something. The “What to do” sections of the book should be produced as a set of flash cards, as a reminder that matters can be improved.

It is shameful that this book was not written by a clinical pharmacologist, or a senior doctor, or a Royal College, or a senior academic.  Why has the British Pharmacological Society said nothing?

It is shameful too that this book was not written by one of the quacks who are keen to defend the $60 billion alternative medicine industry (which has cured virtually nothing) and who are strident in their criticism of the 600 billion dollar Pharma industry.  They haven’t done the work that Goldacre has to analyse the real problems.  All they have done is to advocate unfair tests, because that is the only sort their treatments can pass.

It’s weird that medicine, the most caring profession, is more corrupt than any other branch of science.  The reason, needless to say, is money. Well, money and vanity.  The publish or perish mentality of senior academics encourages dishonesty. It is a threat to honest science.

Goldacre’s book shows the consequences: harm to patients and huge wastage of public money.

Read it.

Do something.

 

Follow-up

7 October, 2012, The Observer

Goldacre wrote

"I think it’s really disappointing that nobody, not the Royal Colleges, the Academy of Medical Sciences, the British Pharmacological Society, the British Medical Association, none of these organisations have stood up and said: selective non-publication of unflattering trial data is research misconduct, and if you do it you will be booted out. And I think they really urgently should."

Exactly.

Jump to follow-up

I have in the past, taken an occasional interest in the philosophy of science. But in a lifetime doing science, I have hardly ever heard a scientist mention the subject. It is, on the whole, a subject that is of interest only to philosophers.

It’s true that some philosophers have had interesting things to say about the nature of inductive inference, but during the 20th century the real advances in that area came from statisticians, not from philosophers. So I long since decided that it would be more profitable to spend my time trying to understand R.A Fisher, rather than read even Karl Popper. It is harder work to do that, but it seemed the way to go.
RA Fisher

This post is based on the last part of chapter titled In "Praise of Randomisation" . A talk was given at the meeting at the British Academy in December 2007, and the book will be launched on November 28th 2011 (good job it wasn’t essential for my CV with delays like that). The book is published by OUP for the British Academy, under the title Evidence, Inference and Enquiry (edited by Philip Dawid, William Twining, and Mimi Vasilaki, 504 pages, £85.00). The bulk of my contribution has already appeared here, in May 2009, under the heading Diet and health. What can you believe: or does bacon kill you?. It is one of the posts that has given me the most satisfaction, if only because Ben Goldacre seemed to like it, and he has done more than anyone to explain the critical importance of randomisation for assessing treatments and for assessing social interventions.

Having long since decided that it was Fisher, rather than philosophers, who had the answers to my questions, why bother to write about philosophers at all? It was precipitated by joining the London Evidence Group. Through that group I became aware that there is a group of philosophers of science who could, if anyone took any notice of them, do real harm to research. It seems surprising that the value of randomisation should still be disputed at this stage, and of course it is not disputed by anybody in the business.  It was thoroughly established after the start of small sample statistics at the beginning of the 20th century. Fisher’s work on randomisation and the likelihood principle put inference on a firm footing by the mid-1930s. His popular book, The Design of Experiments made the importance of randomisation clear to a wide audience, partly via his famous example of the lady tasting tea. The development of randomisation tests made it transparently clear (perhaps I should do a blog post on their beauty). By the 1950s. the message got through to medicine, in large part through Austin Bradford Hill.

Despite this, there is a body of philosophers who dispute it.  And of course it is disputed by almost all practitioners of alternative medicine (because their treatments usually fail the tests). Here are some examples.

“Why there’s no cause to randomise” is the rather surprising title of a report by Worrall (2004;  see also Worral, 2010), from the London School of Economics.  The conclusion of this paper is

“don’t believe the bad press that ‘observational studies’ or ‘historically controlled trials’ get – so long as they are properly done (that is, serious thought has gone in to the possibility of alternative explanations of the outcome), then there is no reason to think of them as any less compelling than an RCT.”

In my view this conclusion is seriously, and dangerously, wrong –it ignores the enormous difficulty of getting evidence for causality in real life, and it ignores the fact that historically controlled trials have very often given misleading results in the past, as illustrated by the diet problem..  Worrall’s fellow philosopher, Nancy Cartwright (Are RCTs the Gold Standard?, 2007), has made arguments that in some ways resemble those of Worrall. 

Many words are spent on defining causality but, at least in the clinical setting the meaning is perfectly simple.  If the association between eating bacon and colorectal cancer is causal then if you stop eating bacon you’ll reduce the risk of cancer.  If the relationship is not causal then if you stop eating bacon it won’t help at all.  No amount of Worrall’s “serious thought” will substitute for the real evidence for causality that can come only from an RCT: Worrall seems to claim that sufficient brain power can fill in missing bits of information.  It can’t.  I’m reminded inexorably of the definition of “Clinical experience. Making the same mistakes with increasing confidence over an impressive number of years.” In Michael O’Donnell’s A Sceptic’s Medical Dictionary. 

At the other philosophical extreme, there are still a few remnants of post-modernist rhetoric to be found in obscure corners of the literature.  Two extreme examples are the papers by Holmes et al. and by Christine Barry.  Apart from the fact that they weren’t spoofs, both of these papers bear a close resemblance to Alan Sokal’s famous spoof paper, Transgressing the boundaries: towards a transformative hermeneutics of quantum gravity (Sokal, 1996).  The acceptance of this spoof by a journal, Social Text, and the subsequent book, Intellectual Impostures, by Sokal & Bricmont (Sokal & Bricmont, 1998), exposed the astonishing intellectual fraud if postmodernism (for those for whom it was not already obvious).    A couple of quotations will serve to give a taste of the amazing material that can appear in peer-reviewed journals.  Barry (2006) wrote

 “I wish to problematise the call from within biomedicine for more evidence of alternative medicine’s effectiveness via the medium of the randomised clinical trial (RCT).”

“Ethnographic research in alternative medicine is coming to be used politically as a challenge to the hegemony of a scientific biomedical construction of evidence.”

“The science of biomedicine was perceived as old fashioned and rejected in favour of the quantum and chaos theories of modern physics.”
“In this paper, I have deconstructed the powerful notion of evidence within biomedicine, . . .”

The aim of this paper, in my view, is not obtain some subtle insight into the process of inference but to try to give some credibility to snake-oil salesmen who peddle quack cures.  The latter at least make their unjustified claims in plain English.

The similar paper by Holmes, Murray, Perron & Rail (Holmes et al., 2006) is even more bizarre.

“Objective The philosophical work of Deleuze and Guattari proves to be useful in showing how health sciences are colonised (territorialised) by an all-encompassing scientific research paradigm “that of post-positivism ” but also and foremost in showing the process by which a dominant ideology comes to exclude alternative forms of knowledge, therefore acting as a fascist structure. “,

It uses the word fascism, or some derivative thereof, 26 times.  And  Holmes, Perron & Rail (Murray et al., 2007)) end a similar tirade with

“We shall continue to transgress the diktats of State Science.”

It may be asked why it is even worth spending time on these remnants of the utterly discredited postmodernist movement.  One reason is that rather less extreme examples of similar thinking still exist in some philosophical circles. 

Take, for example, the views expressed papers such as Miles, Polychronis and Grey (2006), Miles & Loughlin (2006), Miles, Loughlin & Polychronis (Miles et al., 2007) and Loughlin (2007)..  These papers form part of the authors’ campaign against evidence-based medicine, which they seem to regard as some sort of ideological crusade, or government conspiracy.  Bizarrely they seem to think that evidence-based medicine has something in common with the managerial culture that has been the bane of not only medicine but of almost every occupation (and which is noted particularly for its disregard for evidence).  Although couched in the sort of pretentious language favoured by postmodernists, in fact it ends up defending the most simple-minded forms of quackery.  Unlike  Barry (2006), they don’t mention alternative medicine explicitly, but the agenda is clear from their attacks on Ben Goldacre.  For example, Miles, Loughlin & Polychronis (Miles et al., 2007) say this.

“Loughlin identifies Goldacre [2006] as a particularly luminous example of a commentator who is able not only to combine audacity with outrage, but who in a very real way succeeds in manufacturing a sense of having been personally offended by the article in question. Such moralistic posturing acts as a defence mechanism to protect cherished assumptions from rational scrutiny and indeed to enable adherents to appropriate the ‘moral high ground’, as well as the language of ‘reason’ and ‘science’ as the exclusive property of their own favoured approaches. Loughlin brings out the Orwellian nature of this manoeuvre and identifies a significant implication.”

If Goldacre and others really are engaged in posturing then their primary offence, at least according to the Sartrean perspective adopted by Murray et al. is not primarily intellectual, but rather it is moral. Far from there being a moral requirement to ‘bend a knee’ at the EBM altar, to do so is to violate one’s primary duty as an autonomous being.”

This ferocious attack seems to have been triggered because Goldacre has explained in simple words what constitutes evidence and what doesn’t.  He has explained in a simple way how to do a proper randomised controlled trial of homeopathy.  And he he dismantled a fraudulent Qlink pendant, purported to shield you from electromagnetic radiation but which turned out to have no functional components (Goldacre, 2007).  This is described as being “Orwellian”, a description that seems to me to be downright bizarre.

In fact, when faced with real-life examples of what happens when you ignore evidence, those who write theoretical papers that are critical about evidence-based medicine may behave perfectly sensibly.  Although Andrew Miles edits a journal, (Journal of Evaluation in Clinical Practice), that has been critical of EBM for years. Yet when faced with a course in alternative medicine run by people who can only be described as quacks, he rapidly shut down the course (A full account has appeared on this blog).

It is hard to decide whether the language used in these papers is Marxist or neoconservative libertarian.  Whatever it is, it clearly isn’t science.  It may seem odd that postmodernists (who believe nothing) end up as allies of quacks (who’ll believe anything).  The relationship has been explained with customary clarity by Alan Sokal, in his essay Pseudoscience and Postmodernism: Antagonists or Fellow-Travelers? (Sokal, 2006).

Conclusions

Of course RCTs are not the only way to get knowledge. Often they have not been done, and sometimes it is hard to imagine how they could be done (though not nearly as often as some people would like to say).

It is true that RCTs tell you only about an average effect in a large population. But the same is true of observational epidemiology.  That limitation is nothing to do with randomisation, it is a result of the crude and inadequate way in which diseases are classified (as discussed above).   It is also true that randomisation doesn’t guarantee lack of bias in an individual case, but only in the long run.  But it is the best that can be done.  The fact remains that randomization is the only way to be sure of causality, and making mistakes about causality can harm patients, as it did in the case of HRT.

Raymond Tallis (1999), in his review of Sokal & Bricmont, summed it up nicely

“Academics intending to continue as postmodern theorists in the interdisciplinary humanities after S & B should first read Intellectual Impostures and ask themselves whether adding to the quantity of confusion and untruth in the world is a good use of the gift of life or an ethical way to earn a living. After S & B, they may feel less comfortable with the glamorous life that can be forged in the wake of the founding charlatans of postmodern Theory.  Alternatively, they might follow my friend Roger into estate agency — though they should check out in advance that they are up to the moral rigours of such a profession.”

The conclusions that I have drawn were obvious to people in the business a half a century ago. (Doll & Peto, 1980) said

"If we are to recognize those important yet moderate real advances in therapy which can save thousands of lives, then we need more large randomised trials than at present, not fewer. Until we have them treatment of future patients will continue to be determined by unreliable evidence."

The towering figures are R.A. Fisher, and his followers who developed the ideas of randomisation and maximum likelihood estimation. In the medical area, Bradford Hill, Archie Cochrane, Iain Chalmers had the important ideas worked out a long time ago.

In contrast, philosophers like Worral, Cartwright, Holmes, Barry, Loughlin and Polychronis seem to me to make no contribution to the accumulation of useful knowledge, and in some cases to hinder it.  It’s true that the harm they do is limited, but that is because they talk largely to each other.  Very few working scientists are even aware of their existence.  Perhaps that is just as well.

References

Cartwright N (2007). Are RCTs the Gold Standard? Biosocieties (2007), 2: 11-20

Colquhoun, D (2010) University of Buckingham does the right thing. The Faculty of Integrated Medicine has been fired. http://www.dcscience.net/?p=2881

Miles A & Loughlin M (2006). Continuing the evidence-based health care debate in 2006. The progress and price of EBM. J Eval Clin Pract 12, 385-398.

Miles A, Loughlin M, & Polychronis A (2007). Medicine and evidence: knowledge and action in clinical practice. J Eval Clin Pract 13, 481-503.

Miles A, Polychronis A, & Grey JE (2006). The evidence-based health care debate – 2006. Where are we now? J Eval Clin Pract 12, 239-247.

Murray SJ, Holmes D, Perron A, & Rail G (2007).
Deconstructing the evidence-based discourse in health sciences: truth, power and fascis. Int J Evid Based Healthc 2006; : 4, 180–186.

Sokal AD (1996). Transgressing the Boundaries: Towards a Transformative Hermeneutics of Quantum Gravity. Social Text 46/47, Science Wars, 217-252.

Sokal AD (2006). Pseudoscience and Postmodernism: Antagonists or Fellow-Travelers? In Archaeological Fantasies, ed. Fagan GG, Routledge,an imprint of Taylor & Francis Books Ltd.

Sokal AD & Bricmont J (1998). Intellectual Impostures, New edition, 2003, Economist Books ed. Profile Books.

Tallis R. (1999) Sokal and Bricmont: Is this the beginning of the end of the dark ages in the humanities?

Worrall J. (2004) Why There’s No Cause to Randomize. Causality: Metaphysics and Methods.Technical Report 24/04 . 2004.

Worrall J (2010). Evidence: philosophy of science meets medicine. J Eval Clin Pract 16, 356-362.

Follow-up

Iain Chalmers has drawn my attention to a some really interesting papers in the James Lind Library

An account of early trials is given by Chalmers I, Dukan E, Podolsky S, Davey Smith G (2011). The adoption of unbiased treatment allocation schedules in clinical trials during the 19th and early 20th centuries.  Fisher was not the first person to propose randomised trials, but he is the person who put it on a sound mathematical basis.

Another fascinating paper is Chalmers I (2010). Why the 1948 MRC trial of streptomoutycin used treatment allocation based on random numbers.

The distinguished statistician, David Cox contributed, Cox DR (2009). Randomization for concealment.

Incidentally, if anyone still thinks there are ethical objections to random allocation, they should read the account of retrolental fibroplasia outbreak in the 1950s, Silverman WA (2003). Personal reflections on lessons learned from randomized trials involving newborn infants, 1951 to 1967.

Chalmers also pointed out that Antony Eagle of Exeter College Oxford had written about Goldacre’s epistemology. He describes himself as a "formal epistemologist". I fear that his criticisms seem to me to be carping and trivial. Once again, a philosopher has failed to make a contribution to the progress of knowledge.

Jump to follow-up

One wonders about the standards of peer review at the British Journal of General Practice. The June issue has a paper, "Acupuncture for ‘frequent attenders’ with medically unexplained symptoms: a randomised controlled trial (CACTUS study)". It has lots of numbers, but the result is very easy to see. Just look at their Figure.

Paterson BJGP

There is no need to wade through all the statistics; it’s perfectly obvious at a glance that acupuncture has at best a tiny and erratic effect on any of the outcomes that were measured.

But this is not what the paper said. On the contrary, the conclusions of the paper said

Conclusion

The addition of 12 sessions of five-element acupuncture to usual care resulted in improved health status and wellbeing that was sustained for 12 months.

How on earth did the authors manage to reach a conclusion like that?

The first thing to note is that many of the authors are people who make their living largely from sticking needles in people, or advocating alternative medicine. The authors are Charlotte Paterson, Rod S Taylor, Peter Griffiths, Nicky Britten, Sue Rugg, Jackie Bridges, Bruce McCallum and Gerad Kite, on behalf of the CACTUS study team. The senior author, Gerad Kite MAc , is principal of the London Institute of Five-Element Acupuncture London. The first author, Charlotte Paterson, is a well known advocate of acupuncture. as is Nicky Britten.

The conflicts of interest are obvious, but nonetheless one should welcome a “randomised controlled trial” done by advocates of alternative medicine. In fact the results shown in the Figure are both interesting and useful. They show that acupuncture does not even produce any substantial placebo effect. It’s the authors’ conclusions that are bizarre and partisan. Peer review is indeed a broken process.

That’s really all that needs to be said, but for nerds, here are some more details.

How was the trial done?

The description "randomised" is fair enough, but there were no proper controls and the trial was not blinded. It was what has come to be called a "pragmatic" trial, which means a trial done without proper controls. They are, of course, much loved by alternative therapists because their therapies usually fail in proper trials. It’s much easier to get an (apparently) positive result if you omit the controls. But the fascinating thing about this study is that, despite the deficiencies in design, the result is essentially negative.

The authors themselves spell out the problems.

“Group allocation was known by trial researchers, practitioners, and patients”

So everybody (apart from the statistician) knew what treatment a patient was getting. This is an arrangement that is guaranteed to maximise bias and placebo effects.

"Patients were randomised on a 1:1 basis to receive 12 sessions of acupuncture starting immediately (acupuncture group) or starting in 6 months’ time (control group), with both groups continuing to receive usual care."

So it is impossible to compare acupuncture and control groups at 12 months, contrary to what’s stated in Conclusions.

"Twelve sessions, on average 60 minutes in length, were provided over a 6-month period at approximately weekly, then fortnightly and monthly intervals"

That sounds like a pretty expensive way of getting next to no effect.

"All aspects of treatment, including discussion and advice, were individualised as per normal five-element acupuncture practice. In this approach, the acupuncturist takes an in-depth account of the patient’s current symptoms and medical history, as well as general health and lifestyle issues. The patient’s condition is explained in terms of an imbalance in one of the five elements, which then causes an imbalance in the whole person. Based on this elemental diagnosis, appropriate points are used to rebalance this element and address not only the presenting conditions, but the person as a whole".

Does this mean that the patients were told a lot of mumbo jumbo about “five elements” (fire earth, metal, water, wood)? If so, anyone with any sense would probably have run a mile from the trial.

"Hypotheses directed at the effect of the needling component of acupuncture consultations require sham-acupuncture controls which while appropriate for formulaic needling for single well-defined conditions, have been shown to be problematic when dealing with multiple or complex conditions, because they interfere with the participative patient–therapist interaction on which the individualised treatment plan is developed. 37–39 Pragmatic trials, on the other hand, are appropriate for testing hypotheses that are directed at the effect of the complex intervention as a whole, while providing no information about the relative effect of different components."

Put simply that means: we don’t use sham acupuncture controls so we can’t distinguish an effect of the needles from placebo effects, or get-better-anyway effects.

"Strengths and limitations: The ‘black box’ study design precludes assigning the benefits of this complex intervention to any one component of the acupuncture consultations, such as the needling or the amount of time spent with a healthcare professional."

"This design was chosen because, without a promise of accessing the acupuncture treatment, major practical and ethical problems with recruitment and retention of participants were anticipated. This is because these patients have very poor self-reported health (Table 3), have not been helped by conventional treatment, and are particularly desperate for alternative treatment options.". 

It’s interesting that the patients were “desperate for alternative treatment”. Again it seems that every opportunity has been given to maximise non-specific placebo, and get-well-anyway effects.

There is a lot of statistical analysis and, unsurprisingly, many of the differences don’t reach statistical significance. Some do (just) but that is really quite irrelevant. Even if some of the differences are real (not a result of random variability), a glance at the figures shows that their size is trivial.

My conclusions

(1) This paper, though designed to be susceptible to almost every form of bias, shows staggeringly small effects. It is the best evidence I’ve ever seen that not only are needles ineffective, but that placebo effects, if they are there at all, are trivial in size and have no useful benefit to the patient in this case..

(2) The fact that this paper was published with conclusions that appear to contradict directly what the data show, is as good an illustration as any I’ve seen that peer review is utterly ineffective as a method of guaranteeing quality. Of course the editor should have spotted this. It appears that quality control failed on all fronts.

Follow-up

In the first four days of this post, it got over 10,000 hits (almost 6,000 unique visitors).

Margaret McCartney has written about this too, in The British Journal of General Practice does acupuncture badly.

The Daily Mail exceeds itself in an article by Jenny Hope whch says “Millions of patients with ‘unexplained symptoms’ could benefit from acupuncture on the NHS, it is claimed”. I presume she didn’t read the paper.

The Daily Telegraph scarcely did better in Acupuncture has significant impact on mystery illnesses. The author if this, very sensibly, remains anonymous.

Many “medical information” sites churn out the press release without engaging the brain, but most of the other newspapers appear, very sensibly, to have ignored ther hyped up press release. Among the worst was Pulse, an online magazine for GPs. At least they’ve publish the comments that show their report was nonsense.

The Daily Mash has given this paper a well-deserved spoofing in Made-up medicine works on made-up illnesses.

“Professor Henry Brubaker, of the Institute for Studies, said: “To truly assess the efficacy of acupuncture a widespread double-blind test needs to be conducted over a series of years but to be honest it’s the equivalent of mapping the DNA of pixies or conducting a geological study of Narnia.” ”

There is no truth whatsoever in the rumour being spread on Twitter that I’m Professor Brubaker.

Euan Lawson, also known as Northern Doctor, has done another excellent job on the Paterson paper: BJGP and acupuncture – tabloid medical journalism. Most tellingly, he reproduces the press release from the editor of the BJGP, Professor Roger Jones DM, FRCP, FRCGP, FMedSci.

"Although there are countless reports of the benefits of acupuncture for a range of medical problems, there have been very few well-conducted, randomised controlled trials. Charlotte Paterson’s work considerably strengthens the evidence base for using acupuncture to help patients who are troubled by symptoms that we find difficult both to diagnose and to treat."

Oooh dear. The journal may have a new look, but it would be better if the editor read the papers before writing press releases. Tabloid journalism seems an appropriate description.

Andy Lewis at Quackometer, has written about this paper too, and put it into historical context. In Of the Imagination, as a Cause and as a Cure of Disorders of the Body. “In 1800, John Haygarth warned doctors how we may succumb to belief in illusory cures. Some modern doctors have still not learnt that lesson”. It’s sad that, in 2011, a medical journal should fall into a trap that was pointed out so clearly in 1800. He also points out the disgracefully inaccurate Press release issued by the Peninsula medical school.

Some tweets

Twitter info 426 clicks on http://bit.ly/mgIQ6e alone at 15.30 on 1 June (and that’s only the hits via twitter). By July 8th this had risen to 1,655 hits via Twitter, from 62 different countries,

@followthelemur Selina
MASSIVE peer review fail by the British Journal of General Practice http://bit.ly/mgIQ6e (via @david_colquhoun)

@david_colquhoun David Colquhoun
Appalling paper in Brit J Gen Practice: Acupuncturists show that acupuncture doesn’t work, but conclude the opposite http://bit.ly/mgIQ6e
Retweeted by gentley1300 and 36 others

@david_colquhoun David Colquhoun.
I deny the Twitter rumour that I’m Professor Henry Brubaker as in Daily Mash http://bit.ly/mt1xhX (just because of http://bit.ly/mgIQ6e )

@brunopichler Bruno Pichler
http://tinyurl.com/3hmvan4
Made-up medicine works on made-up illnesses (me thinks Henry Brubaker is actually @david_colquhoun)

@david_colquhoun David Colquhoun,
HEHE RT @brunopichler: http://tinyurl.com/3hmvan4 Made-up medicine works on made-up illnesses

@psweetman Pauline Sweetman
Read @david_colquhoun’s take on the recent ‘acupuncture effective for unexplained symptoms’ nonsense: bit.ly/mgIQ6e

@bodyinmind Body In Mind
RT @david_colquhoun: ‘Margaret McCartney (GP) also blogged acupuncture nonsense http://bit.ly/j6yP4j My take http://bit.ly/mgIQ6e’

@abritosa ABS
Br J Gen Practice mete a pata na poça: RT @david_colquhoun […] appalling acupuncture nonsense http://bit.ly/j6yP4j http://bit.ly/mgIQ6e

@jodiemadden Jodie Madden
amusing!RT @david_colquhoun: paper in Brit J Gen Practice shows that acupuncture doesn’t work,but conclude the opposite http://bit.ly/mgIQ6e

@kashfarooq Kash Farooq
Unbelievable: acupuncturists show that acupuncture doesn’t work, but conclude the opposite. http://j.mp/ilUALC by @david_colquhoun

@NeilOConnell Neil O’Connell
Gobsmacking spin RT @david_colquhoun: Acupuncturists show that acupuncture doesn’t work, but conclude the opposite http://bit.ly/mgIQ6e

@euan_lawson Euan Lawson (aka Northern Doctor)
Aye too right RT @david_colquhoun @iansample @BenGoldacre Guardian should cover dreadful acupuncture paper http://bit.ly/mgIQ6e

@noahWG Noah Gray
Acupuncturists show that acupuncture doesn’t work, but conclude the opposite, from @david_colquhoun: http://bit.ly/l9KHLv

 

8 June 2011 I drew the attention of the editor of BJGP to the many comments that have been made on this paper. He assured me that the matter would be discussed at a meeting of the editorial board of the journal. Tonight he sent me the result of this meeting.

Subject: BJGP
From: “Roger Jones” <rjones@rcgp.org.uk>
To: <d.colquhoun@ucl.ac.uk>

Dear Prof Colquhoun

We discussed your emails at yesterday’s meeting of the BJGP Editorial Board, attended by 12 Board members and the Deputy Editor

The Board was unanimous in its support for the integrity of the Journal’s peer review process for the Paterson et al paper – which was accepted after revisions were made in response to two separate rounds of comments from two reviewers and myself – and could find no reason either to retract the paper or to release the reviewers’ comments

Some Board members thought that the results were presented in an overly positive way; because the study raises questions about research methodology and the interpretation of data in pragmatic trials attempting to measure the effects of complex interventions, we will be commissioning a Debate and Analysis article on the topic.

In the meantime we would encourage you to contribute to this debate throught the usual Journal channels

Roger Jones

Professor Roger Jones MA DM FRCP FRCGP FMedSci FHEA FRSA
Editor, British Journal of General Practice

Royal College of General Practitioners
One Bow Churchyard
London EC4M 9DQ
Tel +44 203 188 7400

It is one thing to make a mistake, It is quite another thing to refuse to admit it. This reply seems to me to be quite disgraceful.

20 July 2011. The proper version of the story got wider publicity when Margaret McCartney wrote about it in the BMJ. The first rapid response to this article was a lengthy denial by the authors of the obvious conclusion to be drawn from the paper. They merely dig themselves deeper into a hole. The second response was much shorter (and more accurate).

Thank you Dr McCartney

Richard Watson, General Practitioner
Glasgow

The fact that none of the authors of the paper or the editor of BJGP have bothered to try and defend themselves speaks volumes.

Like many people I glanced at the report before throwing it away with an incredulous guffaw. You bothered to look into it and refute it – in a real journal. That last comment shows part of the problem with them publishing, and promoting, such drivel. It makes you wonder whether anything they publish is any good, and that should be a worry for all GPs.

 

30 July 2011. The British Journal of General Practice has published nine letters that object to this study. Some of them concentrate on problems with the methods. others point out what I believe to be the main point, there us essentially no effect there to be explained. In the public interest, I am posting the responses here [download pdf file]

Thers is also a response from the editor and from the authors. Both are unapologetic. It seems that the editor sees nothing wrong with the peer review process.

I don’t recall ever having come across such incompetence in a journal’s editorial process.

Here’s all he has to say.

The BJGP Editorial Board considered this correspondence recently. The Board endorsed the Journal’s peer review process and did not consider that there was a case for retraction of the paper or for releasing the peer reviews. The Board did, however, think that the results of the study were highlighted by the Journal in an overly-positive manner. However,many of the criticisms published above are addressed by the authors themselves in the full paper.

 

If you subscribe to the views of Paterson et al, you may want to buy a T-shirt that has a revised version of the periodic table.

t shirt

 

5 August 2011. A meeting with the editor of BJGP

Yesterday I met a member of the editorial board of BJGP. We agreed that the data are fine and should not be retracted. It’s the conclusions that should be retracted. I was also told that the referees’ reports were "bland". In the circumstances that merely confirmed my feeling that the referees failed to do a good job.

Today I met the editor, Roger Jones, himself. He was clearly upset by my comment and I have now changed it to refer to the whole editorial process rather than to him personally. I was told, much to my surprise, that the referees were not acupuncturists but “statisticians”. That I find baffling. It soon became clear that my differences with Professor Jones turned on interpretations of statistics.

It’s true that there were a few comparisons that got below P = 0.05, but the smallest was P = 0.02. The warning signs are there in the Methods section: "all statistical tests were …. deemed to be statistically significant if P < 0.05". This is simply silly -perhaps they should have read Lectures on Biostatistics. Or for a more recent exposition, the XKCD cartoon in which it’s proved that green jelly beans are linked to acne (P = 0.05). They make lots of comparisons but make no allowance for this in the statistics. Figure 2 alone contains 15 different comparisons: it’s not surprising that a few come out "significant", even if you don’t take into account the likelihood of systematic (non-random) errors when comparing final values with baseline values.

Keen though I am on statistics, this is a case where I prefer the eyeball test. It’s so obvious from the Figure that there’s nothing worth talking about happening, it’s a waste of time and money to torture the numbers to get "significant" differences. You have to be a slavish believer in P values to treat a result like that as anything but mildly suggestive. A glance at the Figure shows the effects, if there are any at all, are trivial.

I still maintain that the results don’t come within a million miles of justifying the authors’ stated conclusion “The addition of 12 sessions of five-element acupuncture to usual care resulted in improved health status and wellbeing that was sustained for 12 months.” Therefore I still believe that a proper course would have been to issue a new and more accurate press release. A brief admission that the interpretation was “overly-positive”, in a journal that the public can’t see, simply isn’t enough.

I can’t understand either, why the editorial board did not insist on this being done. If they had done so, it would have been temporarily embarrassing, certainly, but people make mistakes, and it would have blown over. By not making a proper correction to the public, the episode has become a cause célèbre and the reputation oif the journal will suffer permanent harm. This paper is going to be cited for a long time, and not for the reasons the journal would wish.

Misinformation, like that sent to the press, has serious real-life consequences. You can be sure that the paper as it still stands, will be cited by every acupuncturist who’s trying to persuade the Department of Health that he’s a "qualified provider".

There was not much unanimity in the discussion up to this point, Things got better when we talked about what a GP should do when there are no effective options. Roger Jones seemed to think it was acceptable to refer them to an alternative practitioner if that patient wanted it. I maintained that it’s unethical to explain to a patient how medicine works in terms of pre-scientific myths.

I’d have love to have heard the "informed consent" during which "The patient’s condition is explained in terms of imbalance in the five elements which then causes an imbalance in the whole person". If anyone had tried to explain my conditions in terms of my imbalance in my Wood, Water, Fire, Earth and Metal. I’d think they were nuts. The last author. Gerad Kite, runs a private clinic that sells acupuncture for all manner of conditions. You can find his view of science on his web site. It’s condescending and insulting to talk to patients in these terms. It’s the ultimate sort of paternalism. And paternalism is something that’s supposed to be vanishing in medicine. I maintained that this was ethically unacceptable, and that led to a more amicable discussion about the possibility of more honest placebos.

It was good of the editor to meet me in the circumstances. I don’t cast doubt on the honesty of his opinions. I simply disagree with them, both at the statistical level and the ethical level.

 

30 March 2014

I only just noticed that one of the authors of the paper, Bruce McCallum (who worked as an acupuncturist at Kite’s clinic) appeared in a 2007 Channel 4 News piece. I was a report on the pressure to save money by stopping NHS funding for “unproven and disproved treatments”. McCallum said that scientific evidence was needed to show that acupuncture really worked. Clearly he failed, but to admit that would have affected his income.

Watch the video (McCallum appears near the end).

Jump to follow-up

The King’s Fund recently published Assessing complementary practice Building consensus on appropriate research methods [or download pdf].

Report title

It is described as being the “Report of an independent advisory group”. I guess everyone knows by now that an “expert report” can be produced to back any view whatsoever simply by choosing the right “experts”, so the first things one does is to see who wrote it.  Here they are.

  • Chair: Professor Dame Carol Black
  • Harry Cayton, Chief Executive, Council for Healthcare Regulatory Excellence
  • Professor Adrian Eddleston, then Vice-Chairman, The King’s Fund
  • Professor George Lewith, Professor of Health Research, Complementary and Integrated Medicine Research Unit, University of Southampton
  • Professor Stephen Holgate, MRC Clinical Professor of Immunopharmacology, University of Southampton
  • Professor Richard Lilford, Head of School of Health and Population Sciences, University of Birmingham

We see at once two of the best known apologists for alternative medicine, George Lewith (who has appeared here more than once) and Stephen Holgate

Harry Cayton is CEO of Council for Healthcare Regulatory Excellence (CHRE) which must be one of the most useless box-ticking quangos in existence. It was the CHRE that praised the General Chiropractic Council (GCC) for the quality of its work.  That is the same GCC that is at present trying to cope with 600 or so complaints about the people it is supposed to regulate (not to mention a vast number of complaints to Trading Standards Offices).  The GCC must be the prime example of the folly of giving government endorsement to things that don’t work. But the CHRE were not smart enough to spot that little problem.  No doubt Mr Cayton did good work for the Alzheimer’s Society.  His advocacy of patient’s choice may have helped me personally.  But it isn’t obvious to me that he is the least qualified to express an opinion on research methods in anything whatsoever. According to the Guardian he is “BA in English and linguistics from the University of Ulster; diploma in anthropology from the University of Durham; B Phil in philosophy of education from the University of Newcastle.”

Adrian Eddlestone is a retired Professor of Medicine. He has been in academic administration since 1983. His sympathy for alternative medicine is demonstrated by the fact that he is also Chair of the General Osteopathic Council, yet another “regulator” that has done nothing to protect the public
from false health claims (and which may, soon, find itself in the same sort of trouble as the GCC).

Richard Lilford is the only member of the group who has no bias towards alternative medicine and also the only member with expertise in clinical research methods  His credentials look impressive, and his publications show how he is the ideal person for this job. I rather liked also his article Stop meddling and let us get on.. He has written about the harm done by postmodernism and relativism, the fellow-travellers of alternative medicine.

Most damning of all, Lewith, Eddlestone and Holgate (along with Cyril Chantler, chair of the King’s Fund, and homeopaths, spiritual healers and Karol Sikora) are Foundation Fellows of the Prince of Wales Foundation for Magic Medicine, an organisation that is at the forefront of spreading medical misinformation.

I shall refer here to ‘alternative medicine’ rather than ‘complementary medicine’ which is used in the report. It is not right to refer to a treatment as ‘complementary’ until such time as it has been shown to work. The term ‘complementary’ is a euphemism that, like ‘integrative’, is standard among alternative medicine advocates whose greatest wish is to gain respectability.

The Report

Kings Fund logo

The recommendations

On page 10 we find a summary of the conclusions.

The report identifies five areas of consensus, which together set a framework for moving forward. These are:

  • the primary importance of controlled trials to assess clinical and cost effectiveness.
  • the importance of understanding how an intervention works
  • the value of placebo or non-specific effects
  • the need for investment and collaboration in creating a sound evidence base
  • the potential for whole-system evaluation to guide decision-making and subsequent research.

The first recommendation is just great. The rest sound to me like the usual excuses for incorporating ineffective treatments into medical practice. Notice the implicit assumption in the fourth point
that spending money on research will establish “a sound evidence base". There is a precedent, but it is ignored. A huge omission from the report is that it fails to mention anywhere that a lot of research has already been done.

Much research has already been done (and failed)

The report fails to mention at all the single most important fact in this area. The US National Institutes of Health has spent over a billion dollars on research on alternative medicines, over a period
of more than 10 years. It has failed to come up with any effective treatments whatsoever. See, for example Why the National Center for Complementary and Alternative Medicine (NCCAM) Should Be Defunded;   Should there be more alternative research?;   Integrative baloney @ Yale, and most recently, $2.5B Spent, No Alternative Med Cures found. .

Why did the committee think this irrelevant? I can’t imagine. You guess.

The report says

“This report outlines areas of potential consensus to guide research funders, researchers, commissioners and complementary practitioners in developing and applying a robust evidence base for complementary practice.”

As happens so often, there is implicit in this sentence the assumption that if you spend enough money evidence will emerge. That is precisely contrary to the experence in the USA where spending a billion dollars produced nothing beyond showing that a lot of things we already thought didn’t work were indeed ineffective.

And inevitably, and tragically, NICE’s biggest mistake is invoked.

“It is noteworthy that the evidence is now sufficiently robust for NICE to include acupuncture as a treatment for low back pain.” [p ]

Did the advisory group not read the evidence used (and misinterpeted) by NICE? It seems not. Did the advisory group not read the outcome of NIH-funded studies on acupuncture as summarised by Barker Bausell in his book, Snake Oil Science? Apparently not. It’s hard to know because the report has no references.

George Lewith is quoted [p. 15] as saying “to starve the system of more knowledge means we will continue to make bad decisions”. No doubt he’d like more money for research, but if a billion dollars
in the USA gets no useful result, is Lewith really likely to do better?

The usual weasel words of the alternative medicine industry are there in abundance

“First, complementary practice often encompasses an intervention (physical treatment or manipulation) as well as the context for that intervention. Context in this setting means both the physical setting for the delivery of care and the therapeutic relationship between practitioner and patient.” [p. 12]

Yes, but ALL medicine involves the context of the treatment. This is no different whether the medicine is alternative or real. The context (or placebo) effect comes as an extra bonus with any sort of treatment.

“We need to acknowledge that much of complementary practice seeks to integrate the positive aspects of placebo and that it needs to be viewed as an integral part of the treatment rather than an aspect that should be isolated and discounted.” [p. 13]

This is interesting. It comes very close (here and elsewhere) to admitting that all you get is a placebo effect, and that this doesn’t matter. This contradicts directly the first recommendation of the House of Lords report (2000).. Both the House of Lords report on Complementary and Alternative Medicine, and the Government’s response to it, state clearly

“. . . we recommend that three important questions should be addressed in the following order”. (1) does the treatment offer therapeutic benefits greater than placebo? (2)  is the treatment safe? (3) how does it compare, in medical outcome and cost-effectiveness, with other forms of treatment?.

The crunch comes when the report gets to what we should pay for.

“Should we be prepared to pay for the so-called placebo effect?

The view of the advisory group is that it is appropriate to pay for true placebo (rather than regression to the mean or temporal effects).” [p 24]

Perhaps so, but there is very little discussion of the emormous ethical questions:that this opinion raises: 

  • How much is one allowed to lie to patients in order to elicit a placebo effect?
  • Is is OK if the practitioner believes it is a placebo but gives it anyway?
  • Is it OK if the pratitioner believes that it is not a placebo when actually it is?
  • Is it OK for practitioners to go degrees taught by people who believe that it is not a placebo when actually it is?

The report fails to face frankly these dilemmas.  The present rather absurd position in which it is considered unethical for a medical practitioner to give a patient a bottle of pink water, but
perfectly acceptable to refer them to a homeopath. There is no sign either of taking into account the cultural poison that is spread by telling people about yin, yang and meridians and such like preposterous made-up mumbo jumbo.  That is part of the cost of endorsing placebos. And just when one thought that believing things because you wished they were true was going out of fashion

Once again we hear a lot about the alleged difficulties posed by research on alternative medicine. These alleged difficulties are, in my view, mostly no more than excuses. There isn’t the slightest
difficulty in testing things like herbal medicine or homeopathy, in a way that preserves all the ‘context’ and the ways of working of homeopaths and herbalists. Anyone who reads the Guardian knows
how to do that.

In the case of acupuncture, great ingenuity has gone into divising controls. The sham and the ‘real’ acupuncture always come out the same. In a non-blind comparison between acupuncture and no acupuncture the latter usually does a bit worse, but the effects are small and transient and entirely compatible with the view that it is a theatrical placebo.

Despite these shortcomings, some of the conclusions [p. 22] are reasonable.

“The public needs more robust evidence to make informed decisions about the use of complementary practice.

Commissioners of public health care need more robust evidence on which to base decisions about expenditure of public money on complementary practice.”

What the report fails to do is to follow this with the obvious conclusion that such evidence is largely missing and that until such time as it is forthcoming there should be no question of the NHS paying for alternative treatments.

Neither should there be any question of giving them official government recognition in the form of ‘statutory regulation’. The folly of doing that is illustrated graphically by the case of chiropractic which is now in deep crisis after inspection of its claims in the wake of the Simon Singh defamation case. Osteopathy will, I expect, suffer the same fate soon.

In the summary on p.12 we see a classical case of the tension

Controlled trials of effectiveness and cost-effectiveness are of primary importance

We recognise that it is the assessment of effectiveness that is of primary importance in reaching a judgement of different practices. Producing robust evidence that something works in practice – that it is effective – should not be held up by the inevitably partial findings and challenged interpretations arising from inquiries into how the intervention works.

The headline sounds impeccable, but directly below it we see a clear statement that we should use treatments before we know whether they work.  “Effectiveness”, in the jargon of the alternative medicine business, simply means that uncontrolled trials are good enough. The bit about “how it works” is another very common red herring raised by alternative medicine people. Anyone who knows anything about pharmacology that knowledge about how any drug works is incomplete and often turns out to be wrong. That doesn’t matter a damn if it performs well in good double-blind randomised controlled trials.

One gets the impression that the whole thing would have been a lot worse without the dose of reality injected by Richard Lilford. He is quoted as a saying

“All the problems that you find in complementary medicine you will encounter in some other kind of treatment … when we stop and think about it… how different is it to any branch of health care – the answer to emerge from our debates is that it may only be a matter of degree.” [p. 17]

I take that to mean that alternative medicine poses problems that are no different from other sorts of treatment. They should be subjected to exactly the same criteria. If they fail (as is usually the case) they should be rejected.  That is exactly right.  The report was intended to produce consensus, but throughout the report, there is a scarcely hidden tension between believers on one side, and Richard Lilford’s impeccable logic on the other.

Who are the King’s Fund?

The King’s Fund is an organisation that states its aims thus.

“The King’s Fund creates and develops ideas that help shape policy, transform services and bring about behaviour change which improve health care.”

It bills this report on its home page as “New research methods needed to build evidence for the effectiveness of popular complementary therapies”. But in fact the report doesn’t really recommend ‘new research methods’ at all, just that the treatments pass the same tests as any other treatment. And note the term ‘build evidence’.  It carries the suggestion that the evidence will be positive.   Experience in the USA (and to a smaller extent in the UK) suggests that every time some good research is done, the effect is not to ‘build evidence’ but for the evidence to crumble further

If the advice is followed, and the results are largely negative, as has already happened in the USA, the Department of Health would look pretty silly if it had insisted on degrees and on statutory regulation.

The King’s Fund chairman is Sir Cyril Chantler and its Chief Executive is Niall Dickson.  It produces reports, some of which are better than this one. I know it’s hard to take seriously an organisation that wants to “share its vision” withyou, but they are trying.

“The King’s Fund was formed in 1897 as an initiative of the then Prince of Wales to allow for the collection and distribution of funds in support of the hospitals of London. Its initial purpose was to raise money for London’s voluntary hospitals,”

It seems to me that the King’s Fund is far too much too influenced by the present Prince of Wales. He is, no doubt, well-meaning but he has become a major source of medical misinformation and his influence in the Department of Health is deeply unconstitutional.  I was really surprised to see thet Cyril Chantler spoke at the 2009 conference of the Prince of Wales Foundation for Integrated Health, despite having a preview of the sort of make-believe being propagated by other speakers. His talk there struck me as evading all the essential points. Warm, woolly but in the end, a danger to patients. Not only did he uncritically fall for the spin on the word “integrated”, but he also fell for the idea that “statutory regulation” will safeguard patients.

Revelation of what is actually taught on degrees in these subjects shows very clearly that they endanger the public.

But the official mind doesn’t seem ever to look that far. It is happy ticking boxes and writing vacuous managerialese. It lacks curiosity.

Follow-up

The British Medical Journal published today an editorial which also recommends rebranding of ‘pragmatic’ trials.  No surprise there, because the editorial is written by Hugh MacPherson, senior research fellow, David Peters, professor of integrated healthcare and Catherine Zollman, general practitioner. I find it a liitle odd that the BMJ says “Competing Interests: none. David Peters interest is obvious from his job description. It is less obvious that Hugh MacPherson is an acupuncture enthusiast who publishes mostly in alternative medicine journals. He has written a book with the extraordinary title “Acupuncture Research, Strategies for Establishing an Evidence Base”. The title seems to assume that the evidence base will materialise eventually despite a great deal of work that suggests it won’t. Catherine Zollman is a GP who is into homeopathy as well as acupuncture. All three authors were speakers at the Prince of Wales conference, described at Prince of Wales Foundation for magic medicine: spin on the meaning of ‘integrated’.

The comments that follow the editorial start with an excellent contribution from James Matthew May. His distinction between ‘caring’ and ‘curing’ clarifies beautifully the muddled thinking of the editorial.

Then a comment from DC, If your treatments can’t pass the test, the test must be wrong. It concludes

“At some point a stop has to be put to this continual special pleading. The financial crisis (caused by a quite different group of people who were equally prone to wishful thinking) seems quite a good time to start.”

Jump to follow-up

This article has been reposted on The Winnower, and now has a digital object identifier DOI: 10.15200/winn.142934.47856

This post is not about quackery, nor university politics.  It is about inference,  How do we know what we should eat?  The question interests everyone, but what do we actually know?  Not as much as you might think from the number of column-inches devoted to the topic.  The discussion below is a synopsis of parts of an article called “In praise of randomisation”, written as a contribution to a forthcoming book, Evidence, Inference and Enquiry.  

About a year ago just about every newspaper carried a story much like this one in the Daily Telegraph,

Sausage a day can increase bowel cancer risk

By Rebecca Smith, Medical Editor Last Updated: 1:55AM BST 31/03/2008

Eating one sausage or three rashers of bacon a day can increase the risk of bowel cancer by a fifth, a medical expert has said.


The warning involved only 1.8oz (50g) of processed meat daily.

It recommended that people eat less than 17.6 oz of cooked red meat a week and avoid all processed meat.

Researchers found that almost half of cancers could be prevented with lifestyle changes such as a healthier diet, using sunscreen, not smoking and limiting alcohol intake.


What, I wondered, was the evidence behind these dire warnings.   They did not come from a lifestyle guru, a diet faddist or a supplement salesman. This is nothing to do with quackery. The numbers come from the 2007 report of the World Cancer Research Fund and American Institute for Cancer Research, with the title ‘Food, Nutrition, Physical Activity, and the Prevention of Cancer: a Global Perspective‘. This is a 537 page report with over 4,400 references. Its panel was chaired by Professor Sir Michael Marmot, UCL’s professor of Epidemiology and Public Health. He is a distinguished epidemiologist, renowned for his work on the relation between poverty and health.

Nevertheless there has never been a randomised trial to test the carcinogenicity of bacon, so it seems reasonable to ask how strong is the evidence that you shouldn’t eat it?  It turns out to be surprisingly flimsy.

In praise of randomisation

Everyone knows about the problem of causality in principle. Post hoc ergo propter hoc; confusion of sequence and consequence; confusion of correlation and cause. This is not a trivial problem. It is probably the main reason why ineffective treatments often appear to work. It is traded on by the vast and unscrupulous alternative medicine industry. It is, very probably, the reason why we are bombarded every day with conflicting advice on what to eat. This is a bad thing, for two reasons. First, we end up confused about what we should eat. But worse still, the conflicting nature of the advice gives science as a whole a bad reputation. Every time a white-coated scientist appears in the media to tell us that a glass of wine per day is good/bad for us (delete according to the phase of the moon) the general public just laugh.

In the case of sausages and bacon, suppose that there is a correlation between eating them and developing colorectal cancer. How do we know that it was eating the bacon that caused the cancer – that the relationship is causal?  The answer is that there is no way to be sure if we have simply observed the association.  It could always be that the sort of people who eat bacon are also the sort of people who get colorectal cancer.  But the question of causality is absolutely crucial, because if it is not causal, then stopping eating bacon won’t reduce your risk of cancer.  The recommendation to avoid all processed meat in the WCRF report (2007) is sensible only if the relationship is causal. Barker Bausell said:

[Page39] “But why should nonscientists care one iota about something as esoteric as causal inference? I believe that the answer to this question is because the making of causal inferences is part of our job description as Homo Sapiens.”

That should be the mantra of every health journalist, and every newspaper reader.

The essential basis for causal inference was established over 70 years ago by that giant of statistics Ronald Fisher, and that basis is randomisation. Its first popular exposition was in Fisher’s famous book, The Design of Experiments (1935).  The Lady Tasting Tea has become the classical example of how to design an experiment.  .

Briefly, a lady claims to be able to tell whether the milk was put in the cup before or after the tea was poured.  Fisher points out that to test this you need to present the lady with an equal number of cups that are ‘milk first’ or ‘tea first’ (but otherwise indistinguishable) in random order, and count how many she gets right.  There is a beautiful analysis of it in Stephen Senn’s book, Dicing with Death: Chance, Risk and Health. As it happens, Google books has the whole of the relevant section Fisher’s tea test (geddit?), but buy the book anyway.  Such is the fame of this example that it was used as the title of a book, The Lady Tasting Tea was published by David Salsburg (my review of it is here)

Most studies of diet and health fall into one of three types, case-control studies, cohort (or prospective) studies, or randomised controlled trials (RCTs). Case-control studies are the least satisfactory: they look at people who already have the disease and look back to see how they differ from similar people who don’t have the disease. They are retrospective.  Cohort studies are better because they are prospective: a large group of people is followed for a long period and their health and diet is recorded and later their disease and death is recorded.  But in both sorts of studies,each person decides for him/herself what to eat or what drugs to take.  Such studies can never demonstrate causality, though if the effect is really big (like cigarette-smoking and lung cancer) they can give a very good indication. The difference in an RCT is that each person does not choose what to eat, but their diet is allocated randomly to them by someone else. This means that, on average, all other factors that might influence the response are balanced equally between the two groups. Only RCTs can demonstrate causality.

Randomisation is a rather beautiful idea. It allows one to remove, in a statistical sense, bias that might result from all the sources that you hadn’t realised were there. If you are aware of a source of bias, then measure it. The danger arises from the things you don’t know about, or can’t measure (Senn, 2004; Senn, 2003). Although it guarantees freedom from bias only in a long run statistical sense, that is the best that can be done. Everything else is worse.

Ben Goldacre has referred memorably to the newspapers’ ongoing “Sisyphean task of dividing all the inanimate objects in the world into the ones that either cause or cure cancer” (Goldacre, 2008). This has even given rise to a blog. “The Daily Mail Oncological Ontology Project“. The problem arises in assessing causality.

It wouldn’t be so bad if the problem were restricted to the media. It is much more worrying that the problem of establishing causality often seems to be underestimated by the authors of papers themselves. It is a matter of speculation why this happens. Part of the reason is, no doubt, a genuine wish to discover something that will benefit mankind. But it is hard not to think that hubris and self-promotion may also play a role. Anything whatsoever that purports to relate diet to health is guaranteed to get uncritical newspaper headlines.

At the heart of the problem lies the great difficulty in doing randomised studies of the effect of diet and health. There can be no better illustration of the vital importance of randomisation than in this field. And, notwithstanding the generally uncritical reporting of stories about diet and health, one of the best accounts of the need for randomisation was written by a journalist, Gary Taubes, and it appeared in the New York Times (Taubes, 2007).

The case of hormone replacement therapy

In the 1990s hormone replacement therapy (HRT) was recommended not only to relieve the unpleasant symptoms of the menopause, but also because cohort studies suggested that HRT would reduce heart disease and osteoporosis in older women. For these reasons, by 2001, 15 million US women (perhaps 5 million older women) were taking HRT (Taubes, 2007). These recommendations were based largely on the Harvard Nurses’ Study. This was a prospective cohort study in which 122,000 nurses were followed over time, starting in 1976 (these are the ones who responded out of the 170,000 requests sent out). In 1994, it was said (Manson, 1994) that nearly all of the more than 30 observational studies suggested a reduced risk of coronary heart disease (CHD) among women receiving oestrogen therapy. A meta-analysis gave an estimated 44% reduction of CHD. Although warnings were given about the lack of randomised studies, the results were nevertheless acted upon as though they were true. But they were wrong. When proper randomised studies were done, not only did it turn out that CHD was not reduced: it was actually increased.

The Women’s Health Initiative Study (Rossouw et al., 2002) was a randomized double blind trial on 16,608 postmenopausal women aged 50-79 years and its results contradicted the conclusions from all the earlier cohort studies.  HRT increased risks of heart disease, stroke, blood clots, breast cancer (though possibly helped with osteoporosis and perhaps colorectal cancer). After an average 5.2 years of follow-up, the trial was stopped because of the apparent increase in breast cancer in the HRT group. The relative risk (HRT relative to placebo) of CHD was 1.29 (95% confidence interval 1.02 to 1.63) (286 cases altogether) and for breast cancer 1.26 (1.00 -1.59) (290 cases). Rather than there being a 44% reduction of risk, it seems that there was actually a 30% increase in risk. Notice that these are actually quite small risks, and on the margin of statistical significance. For the purposes of communicating the nature of the risk to an individual person it is usually better to specify the absolute risk rather than relative risk. The absolute number of CHD cases per 10,000 person-years is about 29 on placebo and 36 on HRT, so the increased risk of any individual is quite small. Multiplied over the whole population though, the number is no longer small.

Several plausible reasons for these contradictory results are discussed by Taubes,(2007): it seems that women who choose to take HRT are healthier than those who don’t. In fact the story has become a bit more complicated since then: the effect of HRT depends on when it is started and on how long it is taken (Vandenbroucke, 2009).

This is perhaps one of the most dramatic illustrations of the value of randomised controlled trials (RCTs). Reliance on observations of correlations suggested a 44% reduction in CHD, the randomised trial gave a 30% increase in CHD. Insistence on randomisation is not just pedantry. It is essential if you want to get the right answer.

Having dealt with the cautionary tale of HRT, we can now get back to the ‘Sisyphean task of dividing all the inanimate objects in the world into the ones that either cause or cure cancer’.

The case of processed meat

The WCRF report (2007) makes some pretty firm recommendations.

  • Don’t get overweight
  • Be moderately physically active, equivalent to brisk walking for at least 30 minutes every day
  • Consume energy-dense foods sparingly. Avoid sugary drinks. Consume ‘fast foods’ sparingly, if at all
  • Eat at least five portions/servings (at least 400 g or 14 oz) of a variety of non-starchy vegetables and of fruits every day. Eat relatively unprocessed cereals (grains) and/or pulses (legumes) with every meal. Limit refined starchy foods
  • People who eat red meat to consume less than 500 g (18 oz) a week, very little if any to be processed.
  • If alcoholic drinks are consumed, limit consumption to no more than two drinks a day for men and one drink a day for women.
  • Avoid salt-preserved, salted, or salty foods; preserve foods without using salt. Limit consumption of processed foods with added salt to ensure an intake of less than 6 g (2.4 g sodium) a day.
  • Dietary supplements are not recommended for cancer prevention.

These all sound pretty sensible but they are very prescriptive. And of course the recommendations make sense only insofar as the various dietary factors cause cancer. If the association is not causal, changing your diet won’t help. Note that dietary supplements are NOT recommended. I’ll concentrate on the evidence that lies behind “People who . . . very little if any to be processed.”

The problem of establishing causality is dicussed in the report in detail. In section 3.4 the report says

” . . . causal relationships between food and nutrition, and physical activity can be confidently inferred when epidemiological evidence, and experimental and other biological findings, are consistent, unbiased, strong, graded, coherent, repeated, and plausible.”

The case of processed meat is dealt with in chapter 4.3 (p. 148) of the report.

“Processed meats” include sausages and frankfurters, and ‘hot dogs’, to which nitrates/nitrites or other preservatives are added, are also processed meats. Minced meats sometimes, but not always, fall inside this definition if they are preserved chemically. The same point applies to ‘hamburgers’.

The evidence for harmfulness of processed meat was described as “convincing”, and this is the highest level of confidence in the report, though this conclusion has been challenged (Truswell, 2009) .

How well does the evidence obey the criteria for the relationship being causal?

Twelve prospective cohort studies showed increased risk for the highest intake group when compared to the lowest, though this was statistically significant in only three of them. One study reported non-significant decreased risk and one study reported that there was no effect on risk. These results are summarised in this forest plot (see also Lewis & Clark, 2001)

Each line represents a separate study. The size of the square represents the precision (weight) for each. The horizontal bars show the 95% confidence intervals. If it were possible to repeat the observations many times on the same population, the 95% CL would be different on each repeat experiment, but 19 out of 20 (95%) of the intervals would contain the true value (and 1 in 20 would not contain the true value).  If the bar does not overlap the vertical line at relative risk = 1 (i.e. no effect) this is equivalent to saying that there is a statistically significant difference from 1 with P < 0.05.  That means, very roughly, that there is a 1 in 20 chance of making a fool of yourself if you claim that the association is real, rather than being a result of chance (more detail here),

There is certainly a tendency for the relative risks to be above one, though not by much,  Pooling the results sounds like a good idea. The method for doing this is called meta-analysis .

Meta-analysis was possible on five studies, shown below. The outcome is shown by the red diamond at the bottom, labelled “summary effect”, and the width of the diamond indicates the 95% confidence interval. In this case the final result for association between processed meat intake and colorectal cancer was a relative risk of 1.21 (95% CI 1.04–1.42) per 50 g/day. This is presumably where the headline value of a 20% increase in risk came from.

Support came from a meta-analysis of 14 cohort studies, which reported a relative risk for processed meat of 1.09 (95% CI 1.05 – 1.13) per 30 g/day (Larsson & Wolk, 2006). Since then another study has come up with similar numbers (Sinha etal. , 2009). This consistency suggests a real association, but it cannot be taken as evidence for causality.   Observational studies on HRT were just as consistent, but they were wrong.

The accompanying editorial (Popkin, 2009) points out that there are rather more important reasons to limit meat consumption, like the environmental footprint of most meat production, water supply, deforestation and so on.

So the outcome from vast numbers of observations is an association that only just reaches the P = 0.05 level of statistical significance. But even if the association is real, not a result of chance sampling error, that doesn’t help in the least in establishing causality.

There are two more criteria that might help, a good relationship between dose and response, and a plausible mechanism.

Dose – response relationship

It is quite possible to observe a very convincing relationship between dose and response in epidemiological studies,  The relationship between number of cigarettes smoked per day and the incidence of lung cancer is one example.  Indeed it is almost the only example.



Doll & Peto, 1978

There have been six studies that relate consumption of processed meat to incidence of colorectal cancer. All six dose-response relationships are shown in the WCRG report. Here they are.

This Figure was later revised to

This is the point where my credulity begins to get strained.  Dose – response curves are part of the stock in trade of pharmacologists.  The technical description of these six curves is, roughly, ‘bloody horizontal’.  The report says “A dose-response relationship was also apparent from cohort studies that measured consumption in times/day”. I simply cannot agree that any relationship whatsoever is “apparent”.

They are certainly the least convincing dose-response relationships I have ever seen. Nevertheless a meta-analysis came up with a slope for response curve that just reached the 5% level of statistical significance.

The conclusion of the report for processed meat and colorectal cancer was as follows.

“There is a substantial amount of evidence, with a dose-response relationship apparent from cohort studies. There is strong evidence for plausible mechanisms operating in humans. Processed meat is a convincing cause of colorectal cancer.”

But the dose-response curves look appalling, and it is reasonable to ask whether public policy should be based on a 1 in 20 chance of being quite wrong (1 in 20 at best –see Senn, 2008). I certainly wouldn’t want to risk my reputation on odds like that, never mind use it as a basis for public policy.

So we are left with plausibility as the remaining bit of evidence for causality. Anyone who has done much experimental work knows that it is possible to dream up a plausible explanation of any result whatsoever. Most are wrong and so plausibility is a pretty weak argument. Much play is made of the fact that cured meats contain nitrates and nitrites, but there is no real evidence that the amount they contain is harmful.

The main source of nitrates in the diet is not from meat but from vegetables (especially green leafy vegetables like lettuce and spinach) which contribute 70 – 90% of total intake. The maximum legal content in processed meat is 10 – 25 mg/100g, but lettuce contains around 100 – 400 mg/100g with a legal limit of 200 – 400 mg/100g. Dietary nitrate intake was not associated with risk for colorectal cancer in two cohort studies.(Food Standards Agency, 2004; International Agency for Research on Cancer, 2006).

To add further to the confusion, another cohort study on over 60,000 people compared vegetarians and meat-eaters. Mortality from circulatory diseases and mortality from all causes were not detectably different between vegetarians and meat eaters (Key et al., 2009a). Still more confusingly, although the incidence of all cancers combined was lower among vegetarians than among meat eaters, the exception was colorectal cancer which had a higher incidence in vegetarians than in meat eaters (Key et al., 2009b).

Mente et al. (2009) compared cohort studies and RCTs for effects of diet on risk of coronary heart disease. “Strong evidence” for protective effects was found for intake of vegetables, nuts, and “Mediterranean diet”, and harmful effects of intake of trans–fatty acids and foods with a high glycaemic index. There was also a bit less strong evidence for effects of mono-unsaturated fatty acids and for intake of fish, marine ω-3 fatty acids, folate, whole grains, dietary vitamins E and C, beta carotene, alcohol, fruit, and fibre.  But RCTs showed evidence only for “Mediterranean diet”, and for none of the others.

As a final nail in the coffin of case control studies, consider pizza. According to La Vecchia & Bosetti (2006), data from a series of case control studies in northern Italy lead to: “An inverse association was found between regular pizza consumption (at least one portion of pizza per week) and the risk of cancers of the digestive tract, with relative risks of 0.66 for oral and pharyngeal cancers, 0.41 for oesophageal, 0.82 for laryngeal, 0.74 for colon and 0.93 for rectal cancers.”

What on earth is one meant to make of this?    Pizza should be prescribable on the National Health Service to produce a 60% reduction in oesophageal cancer?   As the authors say “pizza may simply represent a general and aspecific indicator of a favourable Mediterranean diet.” It is observations like this that seem to make a mockery of making causal inferences from non-randomised studies. They are simply uninterpretable.

Is the observed association even real?

The most noticeable thing about the effects of red meat and processed meat is not only that they are small but also that they only just reach the 5 percent level of statistical significance. It has been explained clearly why, in these circumstances, real associations are likely to be exaggerated in size (Ioannidis, 2008a; Ioannidis, 2008b; Senn, 2008). Worse still, there as some good reasons to think that many (perhaps even most) of the effects that are claimed in this sort of study are not real anyway (Ioannidis, 2005). The inflation of the strength of associations is expected to be bigger in small studies, so it is noteworthy that the large meta-analysis by Larsson & Wolk, 2006 comments “In the present meta-analysis, the magnitude of the relationship of processed meat consumption with colorectal cancer risk was weaker than in the earlier meta-analyses”.

This is all consistent with the well known tendency of randomized clinical trials to show initially a good effect of treatment but subsequent trials tend to show smaller effects. The reasons, and the cures, for this worrying problem are discussed by Chalmers (Chalmers, 2006; Chalmers & Matthews, 2006; Garattini & Chalmers, 2009)

What do randomized studies tell us?

The only form of reliable evidence for causality comes from randomised controlled trials. The difficulties in allocating people to diets over long periods of time are obvious and that is no doubt one reason why there are far fewer RCTs than there are observational studies. But when they have been done the results often contradict those from cohort studies. The RCTs of hormone replacement therapy mentioned above contradicted the cohort studies and reversed the advice given to women about HRT.

Three more illustrations of how plausible suggestions about diet can be refuted by RCTs concern nutritional supplements and weight-loss diets

Many RCTs have shown that various forms of nutritional supplement do no good and may even do harm (see Cochrane reviews). At least we now know that anti-oxidants per se do you no good. The idea that anti-oxidants might be good for you was never more than a plausible hypothesis, and like so many plausible hypotheses it has turned out to be a myth. The word anti-oxidant is now no more than a marketing term, though it remains very profitable for unscrupulous salesmen.

The randomised Women’s Health Initiative Dietary Modification Trial (Prentice et al., 2007; Prentice, 2007) showed minimal effects of dietary fat on cancer, though the conclusion has been challenged on the basis of the possible inaccuracy of reported diet (Yngve et al., 2006).

Contrary to much dogma about weight loss (Sacks et al., 2009) found no differences in weight loss over two years between four very different diets. They assigned randomly 811 overweight adults to one of four diets. The percentages of energy derived from fat, protein, and carbohydrates in the four diets were 20, 15, and 65%; 20, 25, and 55%; 40, 15, and 45%; and 40, 25, and 35%. No difference could be detected between the different diets: all that mattered for weight loss was the total number of calories. It should be added, though, that there were some reasons to think that the participants may not have stuck to their diets very well (Katan, 2009).

The impression one gets from RCTs is that the details of diet are not anything like as important as has been inferred from non-randomised observational studies.

So does processed meat give you cancer?

After all this, we can return to the original question. Do sausages or bacon give you colorectal cancer? The answer, sadly, is that nobody really knows. I do know that, on the basis of the evidence, it seems to me to be an exaggeration to assert that “The evidence is convincing that processed meat is a cause of bowel cancer”.

In the UK there were around 5 cases of colorectal cancer per 10,000 population in 2005, so a 20% increase, even if it were real, and genuinely causative. would result in 6 rather than 5 cases per 10,000 people, annually. That makes the risk sound trivial for any individual. On the other hand there were 36,766 cases of colorectal cancer in the UK in 2005. A 20% increase would mean, if the association were causal, about 7000 extra cases as a result of eating processed meat, but no extra cases if the association were not causal.

For the purposes of public health policy about diet, the question of causality is crucial. One has sympathy for the difficult decisions that they have to make, because they are forced to decide on the basis of inadequate evidence.

If it were not already obvious, the examples discussed above make it very clear that the only sound guide to causality is a properly randomised trial. The only exceptions to that are when effects are really big. The relative risk of lung cancer for a heavy cigarette smoker is 20 times that of a non-smoker and there is a very clear relationship between dose (cigarettes per day) and response (lung cancer incidence), as shown above. That is a 2000% increase in risk, very different from the 20% found for processed meat (and many other dietary effects). Nobody could doubt seriously the causality in that case.

The decision about whether to eat bacon and sausages has to be a personal one. It depends on your attitude to the precautionary principle. The observations do not, in my view, constitute strong evidence for causality, but they are certainly compatible with causality. It could be true so if you want to be on the safe side then avoid bacon.  Of course life would not be much fun if your actions were based on things that just could be true.
 

My own inclination would be to ignore any relative risk based on observational data if it was less than about 2. The National Cancer Institute (Nelson, 2002) advises that relative risks less than 2 should be “viewed with caution”, but fails to explain what “viewing with caution” means in practice, so the advice isn’t very useful.

In fact hardly any of the relative risks reported in the WCRF report (2007) reach this level. Almost all relative risks are less than 1.3 (or greater than 0.7 for alleged protective effects). Perhaps it is best to stop worrying and get on with your life. At some point it becomes counterproductive to try to micromanage `people’s diet on the basis of dubious data. There is a price to pay for being too precautionary. It runs the risk of making people ignore information that has got a sound basis. It runs the risk of excessive medicalisation of everyday life. And it brings science itself into disrepute when people laugh at the contradictory findings of observational epidemiology.

The question of how diet and other ‘lifestyle interventions’ affect health is fascinating to everyone. There is compelling reason to think that it matters. For example one study demonstrated that breast cancer incidence increased almost threefold in first-generation Japanese women who migrated to Hawaii, and up to fivefold in the second generation (Kolonel, 1980). Since then enormous effort has been put into finding out why. The first great success was cigarette smoking but that is almost the only major success. Very few similar magic bullets have come to light after decades of searching (asbestos and mesothelioma, or UV radiation and skin cancer count as successes).

The WCRF report (2007) has 537 pages and over 4400 references and we still don’t know.

Sometimes I think we should say “I don’t know” rather more often.

More material

  • Listen to Ben Goldacre’s Radio 4 programmes. The Rise of the Lifetsyle Nutritionists. Part 1 and Part 2 (mp3 files), and at badscience.net.

  • Risk  The Science and Politics of Fear,  Dan Gardner. Virgin
    Books, 2008

  • Some bookmarks about diet and supplements

Follow up

Dan Gardner, the author of Risk, seems to like the last line at least, according to his blog.

Report of the update, 2010

The 2010 report has been updated in WCRF/AICR Systematic Literature Review Continuous Update Project Report [big pdf file]. This includes studies up to May/June 2010.

The result of addition of the new data was to reduce slightly the apparent risk from eating processed meat from 1.21 (95% CI = 1.04-1.42) in the original study to 1.18 (95% CI = 1.10-1.28) in the update. The change is too small to mean much, though it is in direction expected for false correlations. More importantly, the new data confirm that the dose-response curves are pathetic. The evidence for causality is weakened somewhat by addition of the new data.

Dose-response graph of processed meat and colorectal cancer

update 2010