“Statistical regression to the mean predicts that patients selected for abnormalcy will, on the average, tend to improve. We argue that most improvements attributed to the placebo effect are actually instances of statistical regression.”
“Thus, we urge caution in interpreting patient improvements as causal effects of our actions and should avoid the conceit of assuming that our personal presence has strong healing powers.”
In 1955, Henry Beecher published "The Powerful Placebo". I was in my second undergraduate year when it appeared. And for many decades after that I took it literally. He looked at 15 studies and found that an average of 35% of patients got "satisfactory relief" when given a placebo. This number got embedded in pharmacological folklore. He also mentioned that the relief provided by placebo was greatest in patients who were most ill.
Consider the common experiment in which a new treatment is compared with a placebo, in a double-blind randomised controlled trial (RCT). It’s common to call the responses measured in the placebo group the placebo response. But that is very misleading, and here’s why.
The responses seen in the group of patients that are treated with placebo arise from two quite different processes. One is the genuine psychosomatic placebo effect. This effect gives genuine (though small) benefit to the patient. The other contribution comes from the get-better-anyway effect. This is a statistical artefact and it provides no benefit whatsoever to patients. There is now increasing evidence that the latter effect is much bigger than the former.
How can you distinguish between real placebo effects and the get-better-anyway effect?
The only way to measure the size of genuine placebo effects is to compare in an RCT the effect of a dummy treatment with the effect of no treatment at all. Most trials don’t have a no-treatment arm, but enough do that estimates can be made. For example, a Cochrane review by Hróbjartsson & Gøtzsche (2010) looked at a wide variety of clinical conditions. Their conclusion was:
“We did not find that placebo interventions have important clinical effects in general. However, in certain settings placebo interventions can influence patient-reported outcomes, especially pain and nausea, though it is difficult to distinguish patient-reported effects of placebo from biased reporting.”
In some cases, the placebo effect is barely there at all. In a non-blind comparison of acupuncture and no acupuncture, the responses were essentially indistinguishable (despite what the authors and the journal said). See "Acupuncturists show that acupuncture doesn’t work, but conclude the opposite"
So the placebo effect, though a real phenomenon, seems to be quite small. In most cases it is so small that it would be barely perceptible to most patients. The main reason why so many people think that medicines work when they don’t is not the placebo response, but a statistical artefact.
Regression to the mean is a potent source of deception
The get-better-anyway effect has a technical name, regression to the mean. It has been understood since Francis Galton described it in 1886 (see Senn, 2011 for the history). It is a statistical phenomenon, and it can be treated mathematically (see references, below). But when you think about it, it’s simply common sense.
You tend to go for treatment when your condition is bad, and when you are at your worst, then a bit later you’re likely to be better. The great biologist, Peter Medawar, comments thus.
"If a person is (a) poorly, (b) receives treatment intended to make him better, and (c) gets better, then no power of reasoning known to medical science can convince him that it may not have been the treatment that restored his health"
(Medawar, P.B. (1969:19). The Art of the Soluble: Creativity and originality in science. Penguin Books: Harmondsworth).
This is illustrated beautifully by measurements made by McGorry et al., (2001). Patients with low back pain recorded their pain (on a 10 point scale) every day for 5 months (they were allowed to take analgesics ad lib).
The results for four patients are shown in their Figure 2. On average they stay fairly constant over five months, but they fluctuate enormously, with different patterns for each patient. Painful episodes that last for 2 to 9 days are interspersed with periods of lower pain or none at all. It is very obvious that if these patients had gone for treatment at the peak of their pain, then a while later they would feel better, even if they were not actually treated. And if they had been treated, the treatment would have been declared a success, despite the fact that the patient derived no benefit whatsoever from it. This entirely artefactual benefit would be the biggest for the patients that fluctuate the most (e.g. those in panels a and d of the Figure).
Figure 2 from McGorry et al, 2000. Examples of daily pain scores over a 6-month period for four participants. Note: Dashes of different lengths at the top of a figure designate an episode and its duration.
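The point can be checked with a toy simulation. This is only a sketch under stated assumptions: the pain scores are generated by a simple autoregressive model with a constant long-run mean, and the names `simulate_patient` and `change_after_peak`, the 14-day follow-up and all the numerical settings are illustrative choices, not anything taken from McGorry et al. Nobody in the simulation is treated, yet anyone measured at their worst day and again a fortnight later appears to have improved.

```python
import random
import statistics


def simulate_patient(days=150, mean_pain=4.0, persistence=0.8, noise_sd=1.0, rng=None):
    """Daily pain scores (0-10) that fluctuate around a constant long-run mean."""
    rng = rng or random.Random()
    deviation = 0.0
    scores = []
    for _ in range(days):
        # Today's deviation is partly carried over from yesterday, plus fresh noise,
        # which gives episodes of high pain lasting several days.
        deviation = persistence * deviation + rng.gauss(0.0, noise_sd)
        scores.append(min(10.0, max(0.0, mean_pain + deviation)))
    return scores


def change_after_peak(scores, follow_up=14):
    """Pain on the worst day minus pain `follow_up` days later (positive = 'improved')."""
    last_start = len(scores) - follow_up
    worst_day = max(range(last_start), key=lambda day: scores[day])
    return scores[worst_day] - scores[worst_day + follow_up]


rng = random.Random(1)
improvements = [change_after_peak(simulate_patient(rng=rng)) for _ in range(1000)]
mean_improvement = statistics.mean(improvements)
print(f"Mean apparent improvement with no treatment at all: {mean_improvement:.1f} points")
```

Increasing `noise_sd` makes the apparent improvement bigger, which is the same thing as saying that the patients who fluctuate most are the ones most likely to be fooled.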
The effect is illustrated well by an analysis of 118 trials of treatments for non-specific low back pain (NSLBP), by Artus et al., (2010). The time course of pain (rated on a 100 point visual analogue pain scale) is shown in their Figure 2. There is a modest improvement in pain over a few weeks, but this happens regardless of what treatment is given, including no treatment whatsoever.
FIG. 2 Overall responses (VAS for pain) up to 52-week follow-up in each treatment arm of included trials. Each line represents a response line within each trial arm. Red: index treatment arm; Blue: active treatment arm; Green: usual care/waiting list/placebo arms. ____: pharmacological treatment; – – – -: non-pharmacological treatment; . . .. . .: mixed/other.
The authors comment
"symptoms seem to improve in a similar pattern in clinical trials following a wide variety of active as well as inactive treatments.", and "The common pattern of responses could, for a large part, be explained by the natural history of NSLBP".
In other words, none of the treatments work.
This paper was brought to my attention through the blog run by the excellent physiotherapist, Neil O’Connell. He comments
"If this finding is supported by future studies it might suggest that we can’t even claim victory through the non-specific effects of our interventions such as care, attention and placebo. People enrolled in trials for back pain may improve whatever you do. This is probably explained by the fact that patients enrol in a trial when their pain is at its worst which raises the murky spectre of regression to the mean and the beautiful phenomenon of natural recovery."
O’Connell has discussed the matter in a recent paper, O’Connell (2015), from the point of view of manipulative therapies. That’s an area where there has been resistance to doing proper RCTs, with many people saying that it’s better to look at “real world” outcomes. This usually means that you look at how a patient changes after treatment. The hazards of this procedure are obvious from Artus et al., Fig 2, above. It maximises the risk of being deceived by regression to the mean. As O’Connell commented
"Within-patient change in outcome might tell us how much an individual’s condition improved, but it does not tell us how much of this improvement was due to treatment."
In order to eliminate this effect it’s essential to do a proper RCT with control and treatment groups tested in parallel. When that’s done the control group shows the same regression to the mean as the treatment group, and any additional response in the latter can confidently be attributed to the treatment. Anything short of that is whistling in the wind.
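A small simulation shows why the parallel control group matters. It is a sketch, not a model of any real trial: the function `run_trial`, the entry threshold of 6 on a 10 point scale, and the noise levels are all assumptions made for illustration. Patients are enrolled only when their pain on the day is bad, then allocated at random, and the treatment is given no effect at all unless `true_effect` is set.

```python
import random
import statistics


def run_trial(n_patients=400, true_effect=0.0, entry_threshold=6.0, seed=0):
    """Mean fall in pain (baseline minus follow-up) in each arm of a simulated RCT."""
    rng = random.Random(seed)

    # Each patient has a stable 'usual' pain level; pain on any one day is usual + noise.
    # Only people whose pain is bad on the day are enrolled (selection at the worst).
    patients = []
    while len(patients) < n_patients:
        usual = rng.gauss(4.0, 1.0)
        baseline = usual + rng.gauss(0.0, 2.0)
        if baseline >= entry_threshold:
            patients.append((usual, baseline))

    rng.shuffle(patients)  # random allocation to the two arms
    half = n_patients // 2
    arms = {"control": patients[:half], "treated": patients[half:]}

    mean_change = {}
    for arm, members in arms.items():
        changes = []
        for usual, baseline in members:
            follow_up = usual + rng.gauss(0.0, 2.0)
            if arm == "treated":
                follow_up -= true_effect
            changes.append(baseline - follow_up)
        mean_change[arm] = statistics.mean(changes)
    return mean_change


result = run_trial()
print("Fall in pain, control arm:", round(result["control"], 2))
print("Fall in pain, treated arm:", round(result["treated"], 2))
print("Difference between arms:  ", round(result["treated"] - result["control"], 2))
```

Both arms show a substantial fall in pain although nothing was done to either. Looking at the change within the treated arm alone would credit the treatment with all of it; only the difference between the arms estimates what the treatment did.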
Needless to say, the suboptimal methods are most popular in areas where real effectiveness is small or non-existent. This, sad to say, includes low back pain. It also includes just about every treatment that comes under the heading of alternative medicine. Although these problems have been understood for over a century, it remains true that
"It is difficult to get a man to understand something, when his salary depends upon his not understanding it."
Upton Sinclair (1935)
Responders and non-responders?
One excuse that’s commonly used when a treatment shows only a small effect in proper RCTs is to assert that the treatment actually has a good effect, but only in a subgroup of patients ("responders") while others don’t respond at all ("non-responders"). For example, this argument is often used in studies of anti-depressants and of manipulative therapies. And it’s universal in alternative medicine.
There’s a striking similarity between the narrative used by homeopaths and those who are struggling to treat depression. The pill may not work for many weeks. If the first sort of pill doesn’t work try another sort. You may get worse before you get better. One is reminded, inexorably, of Voltaire’s aphorism "The art of medicine consists in amusing the patient while nature cures the disease".
There is only a handful of cases in which a clear distinction can be made between responders and non-responders. Most often what’s observed is a smear of different responses to the same treatment, and the greater the variability, the greater is the chance of being deceived by regression to the mean.
For example, Thase et al., (2011) looked at responses to escitalopram, an SSRI antidepressant. They attempted to divide patients into responders and non-responders. An example (Fig 1a in their paper) is shown.
The evidence for such a bimodal distribution is certainly very far from obvious. The observations are just smeared out. Nonetheless, the authors conclude
"Our findings indicate that what appears to be a modest effect in the grouped data – on the boundary of clinical significance, as suggested above – is actually a very large effect for a subset of patients who benefited more from escitalopram than from placebo treatment. "
I guess that interpretation could be right, but it seems more likely to be a marketing tool. Before you read the paper, check the authors’ conflicts of interest.
The bottom line is that analyses that divide patients into responders and non-responders are reliable only if that can be done before the trial starts. Retrospective analyses are unreliable and unconvincing.
Some more reading
Senn, 2011 provides an excellent introduction (and some interesting history). The subtitle is
"Here Stephen Senn examines one of Galton’s most important statistical legacies – one that is at once so trivial that it is blindingly obvious, and so deep that many scientists spend their whole career being fooled by it."
The examples in this paper are extended in Senn (2009), “Three things that every medical writer should know about statistics”. The three things are regression to the mean, the error of the transposed conditional and individual response.
You can read slightly more technical accounts of regression to the mean in McDonald & Mazzuca (1983) "How much of the placebo effect is statistical regression" (two quotations from this paper opened this post), and in Stephen Senn (2015) "Mastering variation: variance components and personalised medicine". In 1988 Senn published some corrections to the maths in McDonald (1983).
The trials that were used by Hróbjartsson & Gøtzsche (2010) to investigate the comparison between placebo and no treatment were looked at again by Howick et al., (2013), who found that in many of them the difference between treatment and placebo was also small. Most of the treatments did not work very well.
Regression to the mean is not just a medical deceiver: it’s everywhere
Although this post has concentrated on deception in medicine, it’s worth noting that the phenomenon of regression to the mean can cause wrong inferences in almost any area where you look at change from baseline. A classical example concerns the effectiveness of speed cameras. They tend to be installed after a spate of accidents, and if the accident rate is particularly high in one year it is likely to be lower the next year, regardless of whether a camera had been installed or not. To find the true reduction in accidents caused by installation of speed cameras, you would need to choose several similar sites and allocate them at random to have a camera or no camera. As in clinical trials, looking at the change from baseline can be very deceptive.
Lastly, remember that if you avoid all of these hazards of interpretation, and your test of significance gives P = 0.047, that does not mean you have discovered something. There is still a risk of at least 30% that your ‘positive’ result is a false positive. This is explained in Colquhoun (2014), "An investigation of the false discovery rate and the misinterpretation of p-values". I’ve suggested that one way to solve this problem is to use different words to describe P values: something like this.
P > 0.05 very weak evidence
P = 0.05 weak evidence: worth another look
P = 0.01 moderate evidence for a real effect
P = 0.001 strong evidence for real effect
But notice that if your hypothesis is implausible, even these criteria are too weak. For example, if the treatment and placebo are identical (as would be the case if the treatment were a homeopathic pill) then it follows that 100% of positive tests are false positives.
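The dependence on plausibility can be made concrete with the standard screening-test arithmetic. This is a minimal sketch, assuming a significance threshold of 0.05 and a power of 0.8; the function name `false_positive_risk` and the priors tried are illustrative. It counts every result with P at or below the threshold, which is a simpler calculation than asking about a P value close to one particular observed value, so it should not be expected to reproduce the 30% figure quoted for P = 0.047.

```python
def false_positive_risk(prior_real, power, alpha=0.05):
    """Fraction of 'significant' results (p <= alpha) that come from a true null.

    prior_real: assumed fraction of tested hypotheses where a real effect exists.
    power:      assumed probability of detecting a real effect when there is one.
    """
    false_positives = alpha * (1.0 - prior_real)
    true_positives = power * prior_real
    return false_positives / (false_positives + true_positives)


for prior in (0.5, 0.1, 0.0):
    print(prior, round(false_positive_risk(prior, power=0.8), 3))
```

With these assumed inputs the fraction of false positives is about 6% when half of the hypotheses tested are true, 36% when only one in ten is true, and 100% when none is, which is the homeopathic pill case.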
12 December 2015
It’s worth mentioning that the question of responders versus non-responders is closely related to the classical topic of bioassays that use quantal responses. In that field it was assumed that each participant had an individual effective dose (IED). That’s reasonable for the old-fashioned LD50 toxicity test: every animal will die after a sufficiently big dose. It’s less obviously right for ED50 (effective dose in 50% of individuals). The distribution of IEDs is critical, but it has very rarely been determined. The cumulative form of this distribution is what determines the shape of the dose-response curve for fraction of responders as a function of dose. Linearisation of this curve, by means of the probit transformation, used to be a staple of biological assay. This topic is discussed in Chapter 10 of Lectures on Biostatistics. And you can read some of the history on my blog about Some pharmacological history: an exam from 1959.
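For readers who have not met probits, here is a rough sketch of the idea. It assumes that the IEDs are lognormally distributed, which is the classical assumption but exactly the thing that has rarely been checked; the median IED of 10 and the spread `sigma_log` are made-up numbers. The probit here is the plain normal equivalent deviate, without the 5 that was traditionally added to avoid negative values.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def fraction_responding(dose, median_ied=10.0, sigma_log=0.5):
    """Fraction of individuals whose IED is at or below `dose` (lognormal IEDs assumed)."""
    z = (math.log(dose) - math.log(median_ied)) / sigma_log
    return STD_NORMAL.cdf(z)


def probit(fraction):
    """Normal equivalent deviate of a response fraction (0 < fraction < 1)."""
    return STD_NORMAL.inv_cdf(fraction)


# Doses doubling each time, i.e. equally spaced on a log scale.
for dose in (2.5, 5.0, 10.0, 20.0, 40.0):
    p = fraction_responding(dose)
    print(f"dose {dose:5.1f}  fraction responding {p:.3f}  probit {probit(p):+.2f}")
```

The fraction responding follows a sigmoid curve against log dose, but the probit goes up by the same amount for each doubling of dose: a straight line. If the true distribution of IEDs were not lognormal, the line would not be straight.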
In the course of thinking about metrics, I keep coming across cases of over-promoted research. An early case was “Why honey isn’t a wonder cough cure: more academic spin”. More recently, I noticed these examples.
“Effect of Vitamin E and Memantine on Functional Decline in Alzheimer Disease” (spoiler: very little), published in the Journal of the American Medical Association,
and “Primary Prevention of Cardiovascular Disease with a Mediterranean Diet”, in the New England Journal of Medicine (which had second highest altmetric score in 2013)
and "Sleep Drives Metabolite Clearance from the Adult Brain", published in Science
In all these cases, misleading press releases were issued by the journals themselves and by the universities. These were copied out by hard-pressed journalists and made headlines that were certainly not merited by the work. In the last three cases, hyped up tweets came from the journals. The responsibility for this hype must eventually rest with the authors. The last two papers came second and fourth in the list of highest altmetric scores for 2013.
Here are two more very recent examples. It seems that every time I check a highly tweeted paper, it turns out that it is very second rate. Both papers involve fMRI imaging, and since the infamous dead salmon paper, I’ve been a bit sceptical about them. But that is irrelevant to what follows.
Boost your memory with electricity
That was a popular headline at the end of August. It referred to a paper in Science magazine:
“Targeted enhancement of cortical-hippocampal brain networks and associative memory” (Wang, JX et al, Science, 29 August, 2014)
This study was promoted by Northwestern University with "Electric current to brain boosts memory". And Science tweeted along the same lines.
Science’s link did not lead to the paper, but rather to a puff piece, "Rebooting memory with magnets". Again all the emphasis was on memory, with the usual entirely speculative stuff about helping Alzheimer’s disease. But the paper itself was behind Science’s paywall. You couldn’t read it unless your employer subscribed to Science.
All the publicity led to much retweeting and a big altmetrics score. Given that the paper was not open access, it’s likely that most of the retweeters had not actually read the paper.
When you read the paper, you found that it was mostly not about memory at all. It was mostly about fMRI. In fact the only reference to memory was in a subsection of Figure 4. This is the evidence.
That looks desperately unconvincing to me. The test of significance gives P = 0.043. In an underpowered study like this, the chance of this being a false discovery is probably at least 50%. A result like this means, at most, "worth another look". It does not begin to justify all the hype that surrounded the paper. The journal, the university’s PR department, and ultimately the authors, must bear the responsibility for the unjustified claims.
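One way to get a feel for numbers of this sort is to ask: among experiments that give a P value close to the one observed, what fraction come from a true null? The sketch below is an illustration under loud assumptions, not the calculation behind the figure quoted above: it uses a two-sided z test with known variance, a window of P between 0.04 and 0.05, and made-up values for the prior probability of a real effect and for the power. The helper names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def prob_p_in_window(shift, p_low, p_high):
    """P(p_low < two-sided p < p_high) when the z statistic is distributed N(shift, 1)."""
    z_near = STD_NORMAL.inv_cdf(1.0 - p_high / 2.0)  # |z| that gives p = p_high
    z_far = STD_NORMAL.inv_cdf(1.0 - p_low / 2.0)    # |z| that gives p = p_low
    dist = NormalDist(mu=shift, sigma=1.0)
    upper_tail = dist.cdf(z_far) - dist.cdf(z_near)
    lower_tail = dist.cdf(-z_near) - dist.cdf(-z_far)
    return upper_tail + lower_tail


def risk_near_p(prior_real, power, p_low=0.04, p_high=0.05, alpha=0.05):
    """Fraction of results with p in the window that come from a true null."""
    # Size of real effect (in standard errors) that gives the assumed power at alpha;
    # the tiny contribution of the opposite tail to the power is ignored.
    shift = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) + STD_NORMAL.inv_cdf(power)
    from_null = prob_p_in_window(0.0, p_low, p_high) * (1.0 - prior_real)
    from_real = prob_p_in_window(shift, p_low, p_high) * prior_real
    return from_null / (from_null + from_real)


for prior in (0.5, 0.1):
    for power in (0.8, 0.3):
        risk = risk_near_p(prior, power)
        print(f"prior {prior}, power {power}: false positive risk {risk:.2f}")
```

The answer depends heavily on the assumed prior, which is rarely known. That is the uncomfortable part: a P value just under 0.05 cannot by itself tell you how likely it is that the effect is real.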
Science does not allow online comments following the paper, but there are now plenty of sites that do. NHS Choices did a fairly good job of putting the paper into perspective, though they failed to notice the statistical weakness. A commenter on PubPeer noted that Science had recently announced that it would tighten statistical standards. In this case, they failed. The age of post-publication peer review is already reaching maturity.
Boost your memory with cocoa
"Enhancing dentate gyrus function with dietary flavanols improves cognition in older adults. Brickman et al., Nat Neurosci. 2014. doi: 10.1038/nn.3850.".
The journal helpfully lists no fewer than 89 news items related to this study. Mostly they were something like “Drinking cocoa could improve your memory” (Kat Lay, in The Times). Only a handful of the 89 reports spotted the many problems.
A puff piece from Columbia University’s PR department quoted the senior author, Dr Small, making the dramatic claim that
“If a participant had the memory of a typical 60-year-old at the beginning of the study, after three months that person on average had the memory of a typical 30- or 40-year-old.”
Like anything to do with diet, the paper immediately got circulated on Twitter. No doubt most of the people who retweeted the message had not read the (paywalled) paper. The links almost all led to inaccurate press accounts, not to the paper itself.
But some people actually read the paywalled paper and post-publication review soon kicked in. Pubmed Commons is a good site for that, because Pubmed is where a lot of people go for references. Hilda Bastian kicked off the comments there (her comment was picked out by Retraction Watch). Her conclusion was this.
"It’s good to see claims about dietary supplements tested. However, the results here rely on a chain of yet-to-be-validated assumptions that are still weakly supported at each point. In my opinion, the immodest title of this paper is not supported by its contents."
NHS Choices spotted most of the problems too, in "A mug of cocoa is not a cure for memory problems". And so did Ian Musgrave of the University of Adelaide who wrote "Most Disappointing Headline Ever (No, Chocolate Will Not Improve Your Memory)",
Here are some of the many problems.
- The paper was not about cocoa. Drinks containing 900 mg cocoa flavanols (as much as in about 25 chocolate bars) and 138 mg of (−)-epicatechin were compared with much lower amounts of these compounds
- The abstract, all that most people could read, said that subjects were given "high or low cocoa–containing diet for 3 months". But it wasn’t a test of cocoa: it was a test of a dietary "supplement".
- The sample was small (37 people altogether, split between four groups), and therefore under-powered for detection of the small effect that was expected (and observed)
- The authors declared the result to be "significant" but you had to hunt through the paper to discover that this meant P = 0.04 (hint -it’s 6 lines above Table 1). That means that there is around a 50% chance that it’s a false discovery.
- The test was short -only three months
- The test didn’t measure memory anyway. It measured reaction speed. They did test memory retention too, and there was no detectable improvement. This was not mentioned in the abstract. Neither was the fact that exercise had no detectable effect.
- The study was funded by the Mars bar company. They, like many others, are clearly looking for a niche in the huge "supplement" market.
The claims by the senior author, in a Columbia promotional video that the drink produced "an improvement in memory" and "an improvement in memory performance by two or three decades" seem to have a very thin basis indeed. As has the statement that "we don’t need a pharmaceutical agent" to ameliorate a natural process (aging). High doses of supplements are pharmaceutical agents.
To be fair, the senior author did say, in the Columbia press release, that "the findings need to be replicated in a larger study—which he and his team plan to do". But there is no hint of this in the paper itself, or in the title of the press release "Dietary Flavanols Reverse Age-Related Memory Decline". The time for all the publicity is surely after a well-powered study, not before it.
The high altmetrics score for this paper is yet another blow to the reputation of altmetrics.
One may well ask why Nature Neuroscience and the Columbia press office allowed such extravagant claims to be made on such a flimsy basis.
What’s going wrong?
These two papers have much in common. Elaborate imaging studies are accompanied by poor functional tests. All the hype focusses on the latter. These led me to the speculation (in Pubmed Commons) that what actually happens is as follows.
- Authors do big imaging (fMRI) study.
- Glamour journal says coloured blobs are no longer enough and refuses to publish without functional information.
- Authors tag on a small human study.
- Paper gets published.
- Hyped up press releases issued that refer mostly to the add on.
- Journal and authors are happy.
- But science is not advanced.
It’s no wonder that Dorothy Bishop wrote "High-impact journals: where newsworthiness trumps methodology".
It’s time we forgot glamour journals. Publish open access on the web with open comments. Post-publication peer review is working.
But boycott commercial publishers who charge large amounts for open access. It shouldn’t cost more than about £200, and more and more are essentially free (my latest will appear shortly in Royal Society Open Science).
Hilda Bastian has an excellent post about the dangers of reading only the abstract "Science in the Abstract: Don’t Judge a Study by its Cover"
4 November 2014
I was upbraided on Twitter by Euan Adie, founder of Altmetric.com, because I didn’t click through the altmetric symbol to look at the citations: "shouldn’t have to tell you to look at the underlying data David" and "you could have saved a lot of Google time". But when I did do that, all I found was a list of media reports and blogs, pretty much the same as Nature Neuroscience provides itself.
More interesting, I found that my blog wasn’t listed and neither was PubMed Commons. When I asked why, I was told "needs to regularly cite primary research. PubMed, PMC or repository links". But this paper is behind a paywall. So I provide (possibly illegally) a copy of it, so anyone can verify my comments. The result is that altmetric’s dumb algorithms ignore it. In order to get counted you have to provide links that lead nowhere.
So here’s a link to the abstract (only) in Pubmed for the Science paper http://www.ncbi.nlm.nih.gov/pubmed/25170153 and here’s the link for the Nature Neuroscience paper http://www.ncbi.nlm.nih.gov/pubmed/25344629
It seems that altmetrics doesn’t even do the job that it claims to do very efficiently.
It worked. By later in the day, this blog was listed in both Nature’s metrics section and by altmetric.com. But comments on Pubmed Commons were still missing. That’s bad because it’s an excellent place for post-publication peer review.
One of my scientific heroes is Bernard Katz. The closing words of his inaugural lecture, as professor of biophysics at UCL, hang on the wall of my office as a salutary reminder to refrain from talking about ‘how the brain works’. After speaking about his discoveries about synaptic transmission, he ended thus.
"My time is up and very glad I am, because I have been leading myself right up to a domain on which I should not dare to trespass, not even in an Inaugural Lecture. This domain contains the awkward problems of mind and matter about which so much has been talked and so little can be said, and having told you of my pedestrian disposition, I hope you will give me leave to stop at this point and not to hazard any further guesses."
The question of what to eat for good health is truly a topic about "which so much has been talked and so little can be said"
That was emphasized yet again by an editorial in the British Medical Journal written by my favourite epidemiologist, John Ioannidis. He has been at the forefront of debunking hype. Its title is “Implausible results in human nutrition research” (BMJ, 2013;347:f6698).
The gist is given by the memorable statement
"Almost every single nutrient imaginable has peer reviewed publications associating it with almost any outcome."
and the subtitle
“Definitive solutions won’t come from another million observational papers or small randomized trials”.
Since I’m a bit obsessive about causality, this paper is music to my ears. It vindicates my own views, as an amateur epidemiologist, on the results of the endless surveys of diet and health.
- Diet and health. What can you believe: or does bacon kill you (2009) in which I look at the World Cancer Research Fund’s evidence for causality (next to none in my opinion). Through this I got to know Gary Taubes, whose explanation of causality in the New York Times is the best popular account I’ve ever seen.
- How big is the risk from eating red meat now: an update (2012) This was based on the WCRF update – the risk was roughly halved though it didn’t say that in the press release.
- Another update. Red meat doesn’t kill you, but the spin is fascinating (2013). Update after the EPIC results in which the risk essentially vanished: good news which you could find only by digging into Table 3.
There is nothing new about the problem. It’s been written about many times. Young & Karr (Significance, 8, 116 – 120, 2011) said "Any claim coming from an observational study is most likely to be wrong". Out of 52 claims made in 12 observational studies, not a single one was confirmed when tested by randomised controlled trials.
Another article cited by Ioannidis, "Myths, Presumptions, and Facts about Obesity" (Casazza et al , NEJM, 2013), debunks many myths, but the list of conflicts of interests declared by the authors is truly horrendous (and at least one of their conclusions has been challenged, albeit by people with funding from Kellogg’s). The frequent conflicts of interest in nutrition research make a bad situation even worse.
The quotation in bold type continues thus.
"On 25 October 2013, PubMed listed 291 papers with the keywords “coffee OR caffeine” and 741 with “soy,” many of which referred to associations. In this literature of epidemic proportions, how many results are correct? Many findings are entirely implausible. Relative risks that suggest we can halve the burden of cancer with just a couple of servings a day of a single nutrient still circulate widely in peer reviewed journals.
However, on the basis of dozens of randomized trials, single nutrients are unlikely to have relative risks less than 0.90 for major clinical outcomes when extreme tertiles of population intake are compared—most are greater than 0.95. For overall mortality, relative risks are typically greater than 0.995, if not entirely null. The respective absolute risk differences would be trivial. Observational studies and even randomized trials of single nutrients seem hopeless, with rare exceptions. Even minimal confounding or other biases create noise that exceeds any genuine effect. Big datasets just confer spurious precision status to noise."
"According to the latest burden of disease study, 26% of deaths and 14% of disability adjusted life years in the United States are attributed to dietary risk factors, even without counting the impact of obesity. No other risk factor comes anywhere close to diet in these calculations (not even tobacco and physical inactivity). I suspect this is yet another implausible result. It builds on risk estimates from the same data of largely implausible nutritional studies discussed above. Moreover, socioeconomic factors are not considered at all, although they may be at the root of health problems. Poor diet may partly be a correlate or one of several paths through which social factors operate on health."
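To see why relative risks so close to 1 give trivial absolute differences, a line of arithmetic is enough. This is only an illustration: the baseline risk of 10% is an assumed number, not one taken from the editorial, and `absolute_risk_difference` is a hypothetical helper name.

```python
def absolute_risk_difference(baseline_risk, relative_risk):
    """Reduction in absolute risk when the baseline risk is multiplied by relative_risk."""
    return baseline_risk * (1.0 - relative_risk)


baseline = 0.10  # assumed: 10% of people have the outcome without the nutrient
for rr in (0.90, 0.95, 0.995):
    difference = absolute_risk_difference(baseline, rr)
    print(f"relative risk {rr}: {difference * 1000:.1f} fewer events per 1000 people")
```

Even with this fairly high assumed baseline, a relative risk of 0.995 amounts to about one event avoided in every 2000 people, which is far smaller than the biases that observational studies are prone to.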
Another field that is notorious for producing false positives, with false attribution of causality, is the detection of biomarkers. A critical discussion can be found in the paper by Broadhurst & Kell (2006), "False discoveries in metabolomics and related experiments".
"Since the early days of transcriptome analysis (Golub et al., 1999), many workers have looked to detect different gene expression in cancerous versus normal tissues. Partly because of the expense of transcriptomics (and the inherent noise in such data (Schena, 2000; Tu et al., 2002; Cui and Churchill, 2003; Liang and Kelemen, 2006)), the numbers of samples and their replicates is often small while the number of candidate genes is typically in the thousands. Given the above, there is clearly a great danger that most of these will not in practice withstand scrutiny on deeper analysis (despite the ease with which one can create beautiful heat maps and any number of ‘just-so’ stories to explain the biological relevance of anything that is found in preliminary studies!). This turns out to be the case, and we review a recent analysis (Ein-Dor et al., 2006) of a variety of such studies."
The fields of metabolomics, proteomics and transcriptomics are plagued by statistical problems (as well as being saddled with ghastly pretentious names).
What’s to be done?
Barker Bausell, in his demolition of research on acupuncture, said:
[Page 39] “But why should nonscientists care one iota about something as esoteric as causal inference? I believe that the answer to this question is because the making of causal inferences is part of our job description as Homo Sapiens.”
The problem, of course, is that humans are very good at attributing causality when it does not exist. That has led to confusion between correlation and cause on an industrial scale, not least in attempts to work out the effects of diet on health.
More than in any other field it is hard to do the RCTs that could, in principle, sort out the problem. It’s hard to allocate people at random to different diets, and even harder to make people stick to those diets for the many years that are needed.
We can probably say by now that no individual food carries a large risk, or affords very much protection. The fact that we are looking for quite small effects means that even when RCTs are possible huge samples will be needed to get clear answers. Most RCTs are too short, and too small (under-powered) and that leads to overestimation of the size of effects.
That’s a problem that plagues experimental psychology too, and has led to a much-discussed crisis in reproducibility.
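The way that small trials exaggerate effects is easy to demonstrate by simulation. This is a sketch with made-up numbers: a true effect of 0.2 standard deviations, 20 subjects per arm, and a z test with known variance, all of which are assumptions chosen for illustration. Many small trials are run, and only those that reach "significance" are kept, as tends to happen in the published literature.

```python
import random
import statistics


def significant_estimates(true_effect=0.2, n_per_arm=20, n_trials=5000, seed=0):
    """Effect estimates from the simulated trials that reach p < 0.05 (two-sided z test)."""
    rng = random.Random(seed)
    z_critical = 1.96
    std_error = (2.0 / n_per_arm) ** 0.5  # SE of a difference of two means, SD known to be 1
    kept = []
    for _ in range(n_trials):
        control = sum(rng.gauss(0.0, 1.0) for _ in range(n_per_arm)) / n_per_arm
        treated = sum(rng.gauss(true_effect, 1.0) for _ in range(n_per_arm)) / n_per_arm
        estimate = treated - control
        if abs(estimate) / std_error > z_critical:
            kept.append(estimate)
    return kept


kept = significant_estimates()
print(f"Trials reaching significance: {len(kept)} of 5000")
print(f"True effect: 0.2, mean estimate among significant trials: {statistics.mean(kept):.2f}")
```

With so few subjects an estimate has to be well above the true effect before it can cross the significance threshold at all, so the trials that "work" are exactly the ones that overestimate. Raising `n_per_arm` shrinks the exaggeration.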
"Supplements" of one sort and another are ubiquitous in sports. Nobody knows whether they work, and the margin between winning and losing is so tiny that it’s very doubtful whether we ever will know. We can expect irresponsible claims to continue unabated.
The best thing that can be done in the short term is to stop doing large observational studies altogether. It’s now clear that inferences made from them are likely to be wrong. And, sad to say, we need to view with great skepticism anything that is funded by the food industry. And make a start on large RCTs whenever that is possible. Perhaps the hardest goal of all is to end the "publish or perish" culture which does so much to prevent the sort of long term experiments which would give the information we want.
Ioannidis’ article ends with the statement
"I am co-investigator in a randomized trial of a low carbohydrate versus low fat diet that is funded by the US National Institutes of Health and the non-profit Nutrition Science Initiative."
It seems he is putting his money where his mouth is.
Until we have the results, we shall continue to be bombarded with conflicting claims made by people who are doing their best with flawed methods, as well as by those trying to sell fad diets. Don’t believe them. The famous "5-a-day" advice that we are constantly bombarded with does no harm, but it has no sound basis.
As far as I can guess, the only sound advice about healthy eating for most people is
- don’t eat too much
- don’t eat all the same thing
You can’t make much money out of that advice.
No doubt that is why you don’t hear it very often.
Two relevant papers that show the unreliability of observational studies,
"Nearly 80,000 observational studies were published in the decade 1990–2000 (Naik 2012). In the following decade, the number of studies grew to more than 260,000". Madigan et al. (2014)
“. . . the majority of observational studies would declare statistical significance when no effect is present” Schuemie et al. (2012)
20 March 2014
On 20 March 2014, I gave a talk on this topic at the Cambridge Science Festival (more here). After the event my host, Yvonne Noblis, sent me some (doubtless cherry-picked) feedback she’d had about the talk.
Stolen from badscience.net
Peter Medawar, the eminent biologist, in his classic book Advice to a Young Scientist, said this.
“Exaggerated claims for the efficacy of a medicament are very seldom the consequence of any intention to deceive; they are usually the outcome of a kindly conspiracy in which everybody has the very best intentions. The patient wants to get well, his physician wants to have made him better, and the pharmaceutical company would have liked to have put it into the physician’s power to have made him so. The controlled clinical trial is an attempt to avoid being taken in by this conspiracy of good will.”
There was a lot of truth in that in 1979, towards the end of the heyday of small molecule pharmacology. Since then, one can argue, things have gone downhill.
First, though, think of life without general anaesthetics, local anaesthetics, antibiotics, anticoagulants and many others. They work well and have done incalculable good. And they were developed by the drug industry.
But remember also that remarkably little is known about medicine. There are huge areas in which neither causes nor cures are known. Treatments for chronic pain, back problems, many sorts of cancer and almost all mental problems are a mess. It just isn’t known what to do. Nobody is to blame for this. Serious medical research has been going on for little more than 60 years, and it turns out to be very complicated. We are doing our best, but are still ignorant about whole huge areas. That leads to a temptation to make things up. Clutching at straws is very evident when it comes to depression, pain and Alzheimer’s disease, among others.
In order to improve matters, one essential is to do fair tests on treatments that we have. Ben Goldacre’s book is a superb account of how this could be done, and how the process of testing has been subverted for commercial gain and to satisfy the vanities of academics.
Of course there is nothing new in criticisms of Big Pharma. The huge fines levied on them for false advertising are well known. The difference is that Goldacre’s book explains clearly what’s gone wrong in great detail, documents it thoroughly, and makes concrete suggestions for improving matters.
Big Pharma has undoubtedly sometimes behaved appallingly in recent years. Someone should be in jail for crimes against patients. They have behaved in much the same way that bankers have. In any huge globalised industry it is always possible to blame someone in another department for the dishonesty. But they aren’t the only people to blame. None of the problems could have arisen without the complicity of academics, universities, and a plethora of regulatory agencies and professional bodies.
The biggest scandal of all is missing data (chapter 1). Companies, and sometimes academics, have suppressed the results of trials that don’t favour the drugs that they are trying to sell. The antidepressant drug, reboxetine, appeared at first to be good. It had been approved by the Medicines and Healthcare products Regulatory Agency (MHRA) and there was at least one good randomized placebo-controlled trial (RCT) showing it worked. But it didn’t. The manufacturer didn’t provide a complete list of unpublished trials when asked for them. After much work it was found in 2010 that, as well as the published, favourable trial, there were six more trials which had not been published and all six showed reboxetine to be no better than placebo. In comparisons with other antidepressant drugs, three small studies (507 patients) showed reboxetine to be as good as its competitors. These were published. But it came to light that data on 1657 patients had never been published and these showed reboxetine to be worse than its rivals.
When all the data for the SSRI antidepressants were unearthed (Kirsch et al., 2008) it turned out that they were no better than placebo for mild or moderate depression. This selective suppression of negative data has happened time and time again. It harms patients and deceives doctors, but, incredibly, it’s not illegal.
Disgracefully, Kirsch et al. had to use a Freedom of Information Act request to get the data from the FDA.
“The output of a regulator is often simply a crude, brief summary: almost a ‘yes’ or ‘no’ about side effects. This is the opposite of science, which is only reliable because everyone shows their working, explains how they know that something is effective or safe, shares their methods and their results, and allows others to decide if they agree with the way they processed and analysed the data.”
“the NICE document discussing whether it’s a good idea to have Lucentis, an extremely expensive drug, costing well over £1,000 per treatment, that is injected into the eye for a condition called acute macular degeneration. As you can see, the NICE document on whether this treatment is a good idea is censored. Not only is the data on the effectiveness of the treatment blanked out by thick black rectangles, in case any doctor or patient should see it, but absurdly, even the names of some trials are missing, preventing the reader from even knowing of their existence, or cross referencing information about them. Most disturbing of all, as you can see in the last bullet point, the data on adverse events is also censored.”
The book lists all the tricks that are used by both industry and academics. Here are some of them.
- Regulatory agencies like the MHRA, the European Medicines Agency (EMA) and the US Food and Drugs Administration (FDA) set a low bar for approval of drugs.
- Companies make universities sign gagging agreements which allow unfavourable results to be suppressed, and their existence hidden.
- Accelerated approval schemes are abused to get quick approval of ineffective drugs and the promised proper tests often don’t materialise.
- Disgracefully, even when all the results have been given to the regulatory agencies (which isn’t always), the MHRA, EMA and FDA don’t make them public. We are expected to take their word.
- Although all clinical trials are meant to be registered before they start, the EMA register, unbelievably, is not public. Furthermore there is no check that the results of trials ever get published. Despite mandates that results must be published within a year of finishing the trial, many aren’t. Journals promise to check this sort of thing, but they don’t.
- When the results are published, it is not uncommon for the primary outcome, specified before it started, to have been changed to one that looks like a more favourable result. Journals are meant to check, but mostly don’t.
- Companies use scientific conferences, phony journals, make-believe “seed trials” and “continuing medical education” for surreptitious advertising.
- Companies invent new diseases, plant papers to make you think you’re abnormal, and try to sell you a “cure”. For example, female sexual dysfunction, restless legs syndrome and social anxiety disorder (i.e. shyness). This is called disease-mongering, medicalisation or over-diagnosis. It’s bad.
- Spin is rife. Companies, and authors, want to talk up their results. University PR departments want to exaggerate benefits. Journal editors want sensational papers. Read the results, not the summary. This is universal (but particularly bad in alternative medicine).
- Companies fund patient groups to lobby for pills even when the pills are known to be ineffective. The lobby that demanded that Herceptin should be available to all breast cancer patients on the NHS was organised by a PR company working for the manufacturer, Roche. But Herceptin doesn’t work at all in 80% of patients and gives you at best a few extra months of life in advanced cases.
- Ghostwriting of papers is serious corruption. A company writes the paper and senior academics appear as the authors, though they may never have seen the original data. Even in cases where academics have admitted to lying about whether they have seen the data, they go unpunished by their universities. See for example, the case of Professor Eastell.
- By encouraging the funding of “continuing medical education” by companies, the great and the good of academic medicine have let us down badly.
This last point is where the book ends, and it’s worth amplification.
“So what have the great and good of British medicine done to help patients, in the face of this endemic corruption, and these systematic flaws? In 2012, a collaborative document was produced by senior figures in medicine from across the board, called ‘Guidance on Collaboration Between Healthcare Professionals and the Pharmaceutical Industry’. This document was jointly approved by the ABPI, the Department of Health, the Royal Colleges of Physicians, Nursing, Psychiatrists, GPs, the Lancet, the British Medical Association, the NHS Confederation, and so on. ”
“It contains no recognition of the serious problems we have seen in this book. In fact, quite the opposite: it makes a series of assertions about them that are factually incorrect.”
“It states that drug reps ‘can be a useful resource for healthcare professionals’. Again, I’m not sure why the Royal Colleges, the BMA, the Department of Health and the NHS Confederation felt the need to reassert this to the doctors of the UK, on behalf of industry, when the evidence shows that drug reps actively distort prescribing practices. But that is the battle you face, trying to get these issues taken seriously by the pinnacle of the medical establishment.”
This is perhaps the most shameful betrayal of all. The organisations that should protect patients have sold them out.
You may have been sold out by your “elders and betters”, but you can do something. The “What to do” sections of the book should be produced as a set of flash cards, as a reminder that matters can be improved.
It is shameful that this book was not written by a clinical pharmacologist, or a senior doctor, or a Royal College, or a senior academic. Why has the British Pharmacological Society said nothing?
It is shameful too that this book was not written by one of the quacks who are keen to defend the $60 billion alternative medicine industry (which has cured virtually nothing) and who are strident in their criticism of the $600 billion Pharma industry. They haven’t done the work that Goldacre has to analyse the real problems. All they have done is to advocate unfair tests, because that is the only sort their treatments can pass.
It’s weird that medicine, the most caring profession, is more corrupt than any other branch of science. The reason, needless to say, is money. Well, money and vanity. The publish or perish mentality of senior academics encourages dishonesty. It is a threat to honest science.
Goldacre’s book shows the consequences: harm to patients and huge wastage of public money.
7 October, 2012, The Observer
"I think it’s really disappointing that nobody, not the Royal Colleges, the Academy of Medical Sciences, the British Pharmacological Society, the British Medical Association, none of these organisations have stood up and said: selective non-publication of unflattering trial data is research misconduct, and if you do it you will be booted out. And I think they really urgently should."
I have in the past, taken an occasional interest in the philosophy of science. But in a lifetime doing science, I have hardly ever heard a scientist mention the subject. It is, on the whole, a subject that is of interest only to philosophers.
It’s true that some philosophers have had interesting things to say about the nature of inductive inference, but during the 20th century the real advances in that area came from statisticians, not from philosophers. So I long since decided that it would be more profitable to spend my time trying to understand R.A. Fisher, rather than read even Karl Popper. It is harder work to do that, but it seemed the way to go.
This post is based on the last part of a chapter titled "In Praise of Randomisation". A talk was given at the meeting at the British Academy in December 2007, and the book will be launched on November 28th 2011 (good job it wasn’t essential for my CV with delays like that). The book is published by OUP for the British Academy, under the title Evidence, Inference and Enquiry (edited by Philip Dawid, William Twining, and Mimi Vasilaki, 504 pages, £85.00). The bulk of my contribution has already appeared here, in May 2009, under the heading Diet and health. What can you believe: or does bacon kill you?. It is one of the posts that has given me the most satisfaction, if only because Ben Goldacre seemed to like it, and he has done more than anyone to explain the critical importance of randomisation for assessing treatments and for assessing social interventions.
Having long since decided that it was Fisher, rather than philosophers, who had the answers to my questions, why bother to write about philosophers at all? It was precipitated by joining the London Evidence Group. Through that group I became aware that there is a group of philosophers of science who could, if anyone took any notice of them, do real harm to research. It seems surprising that the value of randomisation should still be disputed at this stage, and of course it is not disputed by anybody in the business. It was thoroughly established after the start of small sample statistics at the beginning of the 20th century. Fisher’s work on randomisation and the likelihood principle put inference on a firm footing by the mid-1930s. His popular book, The Design of Experiments, made the importance of randomisation clear to a wide audience, partly via his famous example of the lady tasting tea. The development of randomisation tests made it transparently clear (perhaps I should do a blog post on their beauty). By the 1950s, the message got through to medicine, in large part through Austin Bradford Hill.
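For readers who have not met a randomisation test, here is a minimal sketch of the idea. The function name and the two sets of numbers are made up for illustration. The logic is that if the treatment did nothing, the group labels are arbitrary, so we can shuffle them many times and see how often a difference as big as the observed one turns up by chance alone. The last lines do the arithmetic for the usual telling of the tea example (eight cups, four of each kind), where there are 70 equally likely ways to pick four cups.

```python
import math
import random
import statistics

def randomisation_test(group_a, group_b, n_shuffles=10000, seed=0):
    """Two-sided randomisation (permutation) test for a difference in means."""
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # re-allocate the labels at random
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        # Small tolerance so that exact ties are not lost to rounding error
        if abs(diff) >= abs(observed) - 1e-9:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_shuffles

# Made-up measurements, for illustration only
treated = [5.1, 6.3, 5.8, 7.0, 6.1, 6.7]
control = [4.9, 5.0, 5.6, 4.4, 5.2, 5.9]
observed_diff, p_value = randomisation_test(treated, control)
print(f"observed difference in means: {observed_diff:.2f}")
print(f"randomisation P value (approximate): {p_value:.3f}")

# Lady tasting tea: chance of picking all four milk-first cups by guessing
tea_chance = 1 / math.comb(8, 4)
print(f"chance of a perfect score by guessing: 1 in {math.comb(8, 4)}")
```

No assumption about normal distributions is needed. The only thing that justifies the calculation is that the allocation to groups really was random, which is why the test and the design go together.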
Despite this, there is a body of philosophers who dispute it. And of course it is disputed by almost all practitioners of alternative medicine (because their treatments usually fail the tests). Here are some examples.
“don’t believe the bad press that ‘observational studies’ or ‘historically controlled trials’ get – so long as they are properly done (that is, serious thought has gone in to the possibility of alternative explanations of the outcome), then there is no reason to think of them as any less compelling than an RCT.”
In my view this conclusion is seriously, and dangerously, wrong: it ignores the enormous difficulty of getting evidence for causality in real life, and it ignores the fact that historically controlled trials have very often given misleading results in the past, as illustrated by the diet problem. Worrall’s fellow philosopher, Nancy Cartwright (Are RCTs the Gold Standard?, 2007), has made arguments that in some ways resemble those of Worrall.
Many words are spent on defining causality but, at least in the clinical setting, the meaning is perfectly simple. If the association between eating bacon and colorectal cancer is causal then if you stop eating bacon you’ll reduce the risk of cancer. If the relationship is not causal then if you stop eating bacon it won’t help at all. No amount of Worrall’s “serious thought” will substitute for the real evidence for causality that can come only from an RCT: Worrall seems to claim that sufficient brain power can fill in missing bits of information. It can’t. I’m reminded inexorably of the definition of “Clinical experience. Making the same mistakes with increasing confidence over an impressive number of years.” in Michael O’Donnell’s A Sceptic’s Medical Dictionary.
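The difference between the two sorts of evidence can be illustrated with a toy simulation. This is a sketch under invented assumptions, not a model of any real study: a hidden "unhealthy lifestyle" factor makes people both more likely to eat bacon and more likely to become ill, while bacon itself is given no causal effect whatsoever. All the probabilities are made up.

```python
import random

def risk_difference(randomised, n=100_000, seed=42):
    """Risk of illness in bacon eaters minus risk in non-eaters.

    Bacon has no causal effect in this toy world; every number is invented.
    """
    rng = random.Random(seed)
    cases = {True: 0, False: 0}
    totals = {True: 0, False: 0}
    for _ in range(n):
        unhealthy = rng.random() < 0.5  # hidden confounder
        if randomised:
            # RCT: a coin toss decides the diet, regardless of lifestyle
            eats_bacon = rng.random() < 0.5
        else:
            # Observational: unhealthy people choose bacon more often
            eats_bacon = rng.random() < (0.8 if unhealthy else 0.2)
        # Illness depends only on the confounder, never on bacon
        ill = rng.random() < (0.10 if unhealthy else 0.05)
        totals[eats_bacon] += 1
        cases[eats_bacon] += ill
    return cases[True] / totals[True] - cases[False] / totals[False]

observational = risk_difference(randomised=False)
rct = risk_difference(randomised=True)
print(f"observational risk difference: {observational:+.3f}")
print(f"randomised risk difference:    {rct:+.3f}")
```

The observational comparison shows a clear excess risk in bacon eaters although stopping bacon would help nobody, whereas random allocation breaks the link with the hidden factor and the spurious association vanishes. In real life the confounders are not known, which is exactly why thinking hard about them is no substitute for randomising.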
At the other philosophical extreme, there are still a few remnants of post-modernist rhetoric to be found in obscure corners of the literature. Two extreme examples are the papers by Holmes et al. and by Christine Barry. Apart from the fact that they weren’t spoofs, both of these papers bear a close resemblance to Alan Sokal’s famous spoof paper, Transgressing the boundaries: towards a transformative hermeneutics of quantum gravity (Sokal, 1996). The acceptance of this spoof by a journal, Social Text, and the subsequent book, Intellectual Impostures, by Sokal & Bricmont (Sokal & Bricmont, 1998), exposed the astonishing intellectual fraud of postmodernism (for those for whom it was not already obvious). A couple of quotations will serve to give a taste of the amazing material that can appear in peer-reviewed journals. Barry (2006) wrote
“I wish to problematise the call from within biomedicine for more evidence of alternative medicine’s effectiveness via the medium of the randomised clinical trial (RCT).”
“Ethnographic research in alternative medicine is coming to be used politically as a challenge to the hegemony of a scientific biomedical construction of evidence.”
“The science of biomedicine was perceived as old fashioned and rejected in favour of the quantum and chaos theories of modern physics.”
“In this paper, I have deconstructed the powerful notion of evidence within biomedicine, . . .”
The aim of this paper, in my view, is not to obtain some subtle insight into the process of inference but to try to give some credibility to snake-oil salesmen who peddle quack cures. The latter at least make their unjustified claims in plain English.
The similar paper by Holmes, Murray, Perron & Rail (Holmes et al., 2006) is even more bizarre.
“Objective The philosophical work of Deleuze and Guattari proves to be useful in showing how health sciences are colonised (territorialised) by an all-encompassing scientific research paradigm – that of post-positivism – but also and foremost in showing the process by which a dominant ideology comes to exclude alternative forms of knowledge, therefore acting as a fascist structure.”
It uses the word fascism, or some derivative thereof, 26 times. And Murray, Holmes, Perron & Rail (Murray et al., 2007) end a similar tirade with
“We shall continue to transgress the diktats of State Science.”
It may be asked why it is even worth spending time on these remnants of the utterly discredited postmodernist movement. One reason is that rather less extreme examples of similar thinking still exist in some philosophical circles.
Take, for example, the views expressed in papers such as Miles, Polychronis and Grey (2006), Miles & Loughlin (2006), Miles, Loughlin & Polychronis (Miles et al., 2007) and Loughlin (2007). These papers form part of the authors’ campaign against evidence-based medicine, which they seem to regard as some sort of ideological crusade, or government conspiracy. Bizarrely they seem to think that evidence-based medicine has something in common with the managerial culture that has been the bane of not only medicine but of almost every occupation (and which is noted particularly for its disregard for evidence). Although couched in the sort of pretentious language favoured by postmodernists, in fact it ends up defending the most simple-minded forms of quackery. Unlike Barry (2006), they don’t mention alternative medicine explicitly, but the agenda is clear from their attacks on Ben Goldacre. For example, Miles, Loughlin & Polychronis (Miles et al., 2007) say this.
“Loughlin identifies Goldacre as a particularly luminous example of a commentator who is able not only to combine audacity with outrage, but who in a very real way succeeds in manufacturing a sense of having been personally offended by the article in question. Such moralistic posturing acts as a defence mechanism to protect cherished assumptions from rational scrutiny and indeed to enable adherents to appropriate the ‘moral high ground’, as well as the language of ‘reason’ and ‘science’ as the exclusive property of their own favoured approaches. Loughlin brings out the Orwellian nature of this manoeuvre and identifies a significant implication.”
“If Goldacre and others really are engaged in posturing then their primary offence, at least according to the Sartrean perspective adopted by Murray et al. is not primarily intellectual, but rather it is moral. Far from there being a moral requirement to ‘bend a knee’ at the EBM altar, to do so is to violate one’s primary duty as an autonomous being.”
This ferocious attack seems to have been triggered because Goldacre has explained in simple words what constitutes evidence and what doesn’t. He has explained in a simple way how to do a proper randomised controlled trial of homeopathy. And he dismantled a fraudulent Qlink pendant, purported to shield you from electromagnetic radiation but which turned out to have no functional components (Goldacre, 2007). This is described as being “Orwellian”, a description that seems to me to be downright bizarre.
In fact, when faced with real-life examples of what happens when you ignore evidence, those who write theoretical papers that are critical about evidence-based medicine may behave perfectly sensibly. Andrew Miles edits a journal (Journal of Evaluation in Clinical Practice) that has been critical of EBM for years. Yet when faced with a course in alternative medicine run by people who can only be described as quacks, he rapidly shut down the course (a full account has appeared on this blog).
It is hard to decide whether the language used in these papers is Marxist or neoconservative libertarian. Whatever it is, it clearly isn’t science. It may seem odd that postmodernists (who believe nothing) end up as allies of quacks (who’ll believe anything). The relationship has been explained with customary clarity by Alan Sokal, in his essay Pseudoscience and Postmodernism: Antagonists or Fellow-Travelers? (Sokal, 2006).
Of course RCTs are not the only way to get knowledge. Often they have not been done, and sometimes it is hard to imagine how they could be done (though not nearly as often as some people would like to say).
It is true that RCTs tell you only about an average effect in a large population. But the same is true of observational epidemiology. That limitation is nothing to do with randomisation; it is a result of the crude and inadequate way in which diseases are classified (as discussed above). It is also true that randomisation doesn’t guarantee lack of bias in an individual case, but only in the long run. But it is the best that can be done. The fact remains that randomisation is the only way to be sure of causality, and making mistakes about causality can harm patients, as it did in the case of HRT.
Raymond Tallis (1999), in his review of Sokal & Bricmont, summed it up nicely
“Academics intending to continue as postmodern theorists in the interdisciplinary humanities after S & B should first read Intellectual Impostures and ask themselves whether adding to the quantity of confusion and untruth in the world is a good use of the gift of life or an ethical way to earn a living. After S & B, they may feel less comfortable with the glamorous life that can be forged in the wake of the founding charlatans of postmodern Theory. Alternatively, they might follow my friend Roger into estate agency — though they should check out in advance that they are up to the moral rigours of such a profession.”
The conclusions that I have drawn were obvious to people in the business half a century ago. Doll & Peto (1980) said
"If we are to recognize those important yet moderate real advances in therapy which can save thousands of lives, then we need more large randomised trials than at present, not fewer. Until we have them treatment of future patients will continue to be determined by unreliable evidence."
The towering figures are R.A. Fisher and his followers, who developed the ideas of randomisation and maximum likelihood estimation. In the medical area, Bradford Hill, Archie Cochrane and Iain Chalmers had the important ideas worked out a long time ago.
In contrast, philosophers like Worrall, Cartwright, Holmes, Barry, Loughlin and Polychronis seem to me to make no contribution to the accumulation of useful knowledge, and in some cases to hinder it. It’s true that the harm they do is limited, but that is because they talk largely to each other. Very few working scientists are even aware of their existence. Perhaps that is just as well.
Cartwright N (2007). Are RCTs the Gold Standard? Biosocieties (2007), 2: 11-20
Colquhoun, D (2010) University of Buckingham does the right thing. The Faculty of Integrated Medicine has been fired. http://www.dcscience.net/?p=2881
Miles A & Loughlin M (2006). Continuing the evidence-based health care debate in 2006. The progress and price of EBM. J Eval Clin Pract 12, 385-398.
Miles A, Loughlin M, & Polychronis A (2007). Medicine and evidence: knowledge and action in clinical practice. J Eval Clin Pract 13, 481-503.
Miles A, Polychronis A, & Grey JE (2006). The evidence-based health care debate – 2006. Where are we now? J Eval Clin Pract 12, 239-247.
Murray SJ, Holmes D, Perron A, & Rail G (2007).
Holmes D, Murray SJ, Perron A, & Rail G (2006). Deconstructing the evidence-based discourse in health sciences: truth, power and fascism. Int J Evid Based Healthc 4, 180–186.
Sokal AD (1996). Transgressing the Boundaries: Towards a Transformative Hermeneutics of Quantum Gravity. Social Text 46/47, Science Wars, 217-252.
Sokal AD (2006). Pseudoscience and Postmodernism: Antagonists or Fellow-Travelers? In Archaeological Fantasies, ed. Fagan GG, Routledge, an imprint of Taylor & Francis Books Ltd.
Sokal AD & Bricmont J (1998). Intellectual Impostures, New edition, 2003, Economist Books ed. Profile Books.
Tallis R. (1999) Sokal and Bricmont: Is this the beginning of the end of the dark ages in the humanities?
Worrall J (2004). Why There’s No Cause to Randomize. Causality: Metaphysics and Methods. Technical Report 24/04.
Worrall J (2010). Evidence: philosophy of science meets medicine. J Eval Clin Pract 16, 356-362.
Iain Chalmers has drawn my attention to some really interesting papers in the James Lind Library.
An account of early trials is given by Chalmers I, Dukan E, Podolsky S, Davey Smith G (2011). The adoption of unbiased treatment allocation schedules in clinical trials during the 19th and early 20th centuries. Fisher was not the first person to propose randomised trials, but he is the person who put it on a sound mathematical basis.
Another fascinating paper is Chalmers I (2010). Why the 1948 MRC trial of streptomycin used treatment allocation based on random numbers.
The distinguished statistician David Cox contributed Cox DR (2009). Randomization for concealment.
Incidentally, if anyone still thinks there are ethical objections to random allocation, they should read the account of the retrolental fibroplasia outbreak in the 1950s, Silverman WA (2003). Personal reflections on lessons learned from randomized trials involving newborn infants, 1951 to 1967.
Chalmers also pointed out that Antony Eagle of Exeter College Oxford had written about Goldacre’s epistemology. He describes himself as a "formal epistemologist". I fear that his criticisms seem to me to be carping and trivial. Once again, a philosopher has failed to make a contribution to the progress of knowledge.
One wonders about the standards of peer review at the British Journal of General Practice. The June issue has a paper, "Acupuncture for ‘frequent attenders’ with medically unexplained symptoms: a randomised controlled trial (CACTUS study)". It has lots of numbers, but the result is very easy to see. Just look at their Figure.
There is no need to wade through all the statistics; it’s perfectly obvious at a glance that acupuncture has at best a tiny and erratic effect on any of the outcomes that were measured.
But this is not what the paper said. On the contrary, the conclusions of the paper said
The addition of 12 sessions of five-element acupuncture to usual care resulted in improved health status and wellbeing that was sustained for 12 months.
How on earth did the authors manage to reach a conclusion like that?
The first thing to note is that many of the authors are people who make their living largely from sticking needles in people, or advocating alternative medicine. The authors are Charlotte Paterson, Rod S Taylor, Peter Griffiths, Nicky Britten, Sue Rugg, Jackie Bridges, Bruce McCallum and Gerad Kite, on behalf of the CACTUS study team. The senior author, Gerad Kite MAc, is principal of the London Institute of Five-Element Acupuncture. The first author, Charlotte Paterson, is a well known advocate of acupuncture, as is Nicky Britten.
The conflicts of interest are obvious, but nonetheless one should welcome a “randomised controlled trial” done by advocates of alternative medicine. In fact the results shown in the Figure are both interesting and useful. They show that acupuncture does not even produce any substantial placebo effect. It’s the authors’ conclusions that are bizarre and partisan. Peer review is indeed a broken process.
That’s really all that needs to be said, but for nerds, here are some more details.
How was the trial done?
The description "randomised" is fair enough, but there were no proper controls and the trial was not blinded. It was what has come to be called a "pragmatic" trial, which means a trial done without proper controls. They are, of course, much loved by alternative therapists because their therapies usually fail in proper trials. It’s much easier to get an (apparently) positive result if you omit the controls. But the fascinating thing about this study is that, despite the deficiencies in design, the result is essentially negative.
The authors themselves spell out the problems.
“Group allocation was known by trial researchers, practitioners, and patients”
So everybody (apart from the statistician) knew what treatment a patient was getting. This is an arrangement that is guaranteed to maximise bias and placebo effects.
"Patients were randomised on a 1:1 basis to receive 12 sessions of acupuncture starting immediately (acupuncture group) or starting in 6 months’ time (control group), with both groups continuing to receive usual care."
So it is impossible to compare acupuncture and control groups at 12 months, contrary to what’s stated in Conclusions.
"Twelve sessions, on average 60 minutes in length, were provided over a 6-month period at approximately weekly, then fortnightly and monthly intervals"
That sounds like a pretty expensive way of getting next to no effect.
"All aspects of treatment, including discussion and advice, were individualised as per normal five-element acupuncture practice. In this approach, the acupuncturist takes an in-depth account of the patient’s current symptoms and medical history, as well as general health and lifestyle issues. The patient’s condition is explained in terms of an imbalance in one of the five elements, which then causes an imbalance in the whole person. Based on this elemental diagnosis, appropriate points are used to rebalance this element and address not only the presenting conditions, but the person as a whole".
Does this mean that the patients were told a lot of mumbo jumbo about “five elements” (fire, earth, metal, water, wood)? If so, anyone with any sense would probably have run a mile from the trial.
"Hypotheses directed at the effect of the needling component of acupuncture consultations require sham-acupuncture controls which while appropriate for formulaic needling for single well-defined conditions, have been shown to be problematic when dealing with multiple or complex conditions, because they interfere with the participative patient–therapist interaction on which the individualised treatment plan is developed. 37–39 Pragmatic trials, on the other hand, are appropriate for testing hypotheses that are directed at the effect of the complex intervention as a whole, while providing no information about the relative effect of different components."
Put simply, that means: we don’t use sham acupuncture controls so we can’t distinguish an effect of the needles from placebo effects, or get-better-anyway effects.
"Strengths and limitations: The ‘black box’ study design precludes assigning the benefits of this complex intervention to any one component of the acupuncture consultations, such as the needling or the amount of time spent with a healthcare professional."
"This design was chosen because, without a promise of accessing the acupuncture treatment, major practical and ethical problems with recruitment and retention of participants were anticipated. This is because these patients have very poor self-reported health (Table 3), have not been helped by conventional treatment, and are particularly desperate for alternative treatment options."
It’s interesting that the patients were “desperate for alternative treatment”. Again it seems that every opportunity has been given to maximise non-specific placebo, and get-well-anyway effects.
There is a lot of statistical analysis and, unsurprisingly, many of the differences don’t reach statistical significance. Some do (just) but that is really quite irrelevant. Even if some of the differences are real (not a result of random variability), a glance at the figures shows that their size is trivial.
(1) This paper, though designed to be susceptible to almost every form of bias, shows staggeringly small effects. It is the best evidence I’ve ever seen that not only are needles ineffective, but that placebo effects, if they are there at all, are trivial in size and have no useful benefit to the patient in this case.
(2) The fact that this paper was published with conclusions that appear to contradict directly what the data show, is as good an illustration as any I’ve seen that peer review is utterly ineffective as a method of guaranteeing quality. Of course the editor should have spotted this. It appears that quality control failed on all fronts.
In the first four days of this post, it got over 10,000 hits (almost 6,000 unique visitors).
Margaret McCartney has written about this too, in The British Journal of General Practice does acupuncture badly.
The Daily Mail exceeds itself in an article by Jenny Hope which says “Millions of patients with ‘unexplained symptoms’ could benefit from acupuncture on the NHS, it is claimed”. I presume she didn’t read the paper.
The Daily Telegraph scarcely did better in Acupuncture has significant impact on mystery illnesses. The author of this, very sensibly, remains anonymous.
Many “medical information” sites churn out the press release without engaging the brain, but most of the other newspapers appear, very sensibly, to have ignored the hyped-up press release. Among the worst was Pulse, an online magazine for GPs. At least they’ve published the comments that show their report was nonsense.
The Daily Mash has given this paper a well-deserved spoofing in Made-up medicine works on made-up illnesses.
“Professor Henry Brubaker, of the Institute for Studies, said: “To truly assess the efficacy of acupuncture a widespread double-blind test needs to be conducted over a series of years but to be honest it’s the equivalent of mapping the DNA of pixies or conducting a geological study of Narnia.” ”
There is no truth whatsoever in the rumour being spread on Twitter that I’m Professor Brubaker.
Euan Lawson, also known as Northern Doctor, has done another excellent job on the Paterson paper: BJGP and acupuncture – tabloid medical journalism. Most tellingly, he reproduces the press release from the editor of the BJGP, Professor Roger Jones DM, FRCP, FRCGP, FMedSci.
"Although there are countless reports of the benefits of acupuncture for a range of medical problems, there have been very few well-conducted, randomised controlled trials. Charlotte Paterson’s work considerably strengthens the evidence base for using acupuncture to help patients who are troubled by symptoms that we find difficult both to diagnose and to treat."
Oooh dear. The journal may have a new look, but it would be better if the editor read the papers before writing press releases. Tabloid journalism seems an appropriate description.
Andy Lewis at Quackometer has written about this paper too, and put it into historical context, in Of the Imagination, as a Cause and as a Cure of Disorders of the Body. “In 1800, John Haygarth warned doctors how we may succumb to belief in illusory cures. Some modern doctors have still not learnt that lesson”. It’s sad that, in 2011, a medical journal should fall into a trap that was pointed out so clearly in 1800. He also points out the disgracefully inaccurate press release issued by the Peninsula medical school.
@david_colquhoun David Colquhoun
Appalling paper in Brit J Gen Practice: Acupuncturists show that acupuncture doesn’t work, but conclude the opposite http://bit.ly/mgIQ6e
@david_colquhoun David Colquhoun,
HEHE RT @brunopichler: http://tinyurl.com/3hmvan4 Made-up medicine works on made-up illnesses
@psweetman Pauline Sweetman
Read @david_colquhoun’s take on the recent ‘acupuncture effective for unexplained symptoms’ nonsense: bit.ly/mgIQ6e
@bodyinmind Body In Mind
RT @david_colquhoun: ‘Margaret McCartney (GP) also blogged acupuncture nonsense http://bit.ly/j6yP4j My take http://bit.ly/mgIQ6e’
Br J Gen Practice puts its foot in it: RT @david_colquhoun […] appalling acupuncture nonsense http://bit.ly/j6yP4j http://bit.ly/mgIQ6e
@jodiemadden Jodie Madden
amusing!RT @david_colquhoun: paper in Brit J Gen Practice shows that acupuncture doesn’t work,but conclude the opposite http://bit.ly/mgIQ6e
@kashfarooq Kash Farooq
Unbelievable: acupuncturists show that acupuncture doesn’t work, but conclude the opposite. http://j.mp/ilUALC by @david_colquhoun
@NeilOConnell Neil O’Connell
Gobsmacking spin RT @david_colquhoun: Acupuncturists show that acupuncture doesn’t work, but conclude the opposite http://bit.ly/mgIQ6e
@euan_lawson Euan Lawson (aka Northern Doctor)
Aye too right RT @david_colquhoun @iansample @BenGoldacre Guardian should cover dreadful acupuncture paper http://bit.ly/mgIQ6e
@noahWG Noah Gray
Acupuncturists show that acupuncture doesn’t work, but conclude the opposite, from @david_colquhoun: http://bit.ly/l9KHLv
8 June 2011 I drew the attention of the editor of BJGP to the many comments that have been made on this paper. He assured me that the matter would be discussed at a meeting of the editorial board of the journal. Tonight he sent me the result of this meeting.
Dear Prof Colquhoun
We discussed your emails at yesterday’s meeting of the BJGP Editorial Board, attended by 12 Board members and the Deputy Editor
The Board was unanimous in its support for the integrity of the Journal’s peer review process for the Paterson et al paper – which was accepted after revisions were made in response to two separate rounds of comments from two reviewers and myself – and could find no reason either to retract the paper or to release the reviewers’ comments
Some Board members thought that the results were presented in an overly positive way; because the study raises questions about research methodology and the interpretation of data in pragmatic trials attempting to measure the effects of complex interventions, we will be commissioning a Debate and Analysis article on the topic.
In the meantime we would encourage you to contribute to this debate throught the usual Journal channels
Professor Roger Jones MA DM FRCP FRCGP FMedSci FHEA FRSA
It is one thing to make a mistake. It is quite another thing to refuse to admit it. This reply seems to me to be quite disgraceful.
20 July 2011. The proper version of the story got wider publicity when Margaret McCartney wrote about it in the BMJ. The first rapid response to this article was a lengthy denial by the authors of the obvious conclusion to be drawn from the paper. They merely dig themselves deeper into a hole. The second response was much shorter (and more accurate).
Thank you Dr McCartney
Richard Watson, General Practitioner
The fact that none of the authors of the paper or the editor of BJGP have bothered to try and defend themselves speaks volumes.
Like many people I glanced at the report before throwing it away with an incredulous guffaw. You bothered to look into it and refute it – in a real journal. That last comment shows part of the problem with them publishing, and promoting, such drivel. It makes you wonder whether anything they publish is any good, and that should be a worry for all GPs.
30 July 2011. The British Journal of General Practice has published nine letters that object to this study. Some of them concentrate on problems with the methods. Others point out what I believe to be the main point: there is essentially no effect there to be explained. In the public interest, I am posting the responses here [download pdf file]
There is also a response from the editor and from the authors. Both are unapologetic. It seems that the editor sees nothing wrong with the peer review process.
I don’t recall ever having come across such incompetence in a journal’s editorial process.
Here’s all he has to say.
The BJGP Editorial Board considered this correspondence recently. The Board endorsed the Journal’s peer review process and did not consider that there was a case for retraction of the paper or for releasing the peer reviews. The Board did, however, think that the results of the study were highlighted by the Journal in an overly-positive manner. However, many of the criticisms published above are addressed by the authors themselves in the full paper.
If you subscribe to the views of Paterson et al, you may want to buy a T-shirt that has a revised version of the periodic table.
5 August 2011. A meeting with the editor of BJGP
Yesterday I met a member of the editorial board of BJGP. We agreed that the data are fine and should not be retracted. It’s the conclusions that should be retracted. I was also told that the referees’ reports were "bland". In the circumstances that merely confirmed my feeling that the referees failed to do a good job.
Today I met the editor, Roger Jones, himself. He was clearly upset by my comment and I have now changed it to refer to the whole editorial process rather than to him personally. I was told, much to my surprise, that the referees were not acupuncturists but “statisticians”. That I find baffling. It soon became clear that my differences with Professor Jones turned on interpretations of statistics.
It’s true that there were a few comparisons that got below P = 0.05, but the smallest was P = 0.02. The warning signs are there in the Methods section: "all statistical tests were …. deemed to be statistically significant if P < 0.05". This is simply silly; perhaps they should have read Lectures on Biostatistics. Or, for a more recent exposition, the XKCD cartoon in which it’s proved that green jelly beans are linked to acne (P = 0.05). They make lots of comparisons but make no allowance for this in the statistics. Figure 2 alone contains 15 different comparisons: it’s not surprising that a few come out "significant", even if you don’t take into account the likelihood of systematic (non-random) errors when comparing final values with baseline values.
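For nerds, the arithmetic behind that complaint can be sketched in a few lines of Python. The numbers 15, 0.05 and 0.02 are the ones quoted above; everything else is my assumption for the sake of illustration. In particular the sketch treats the 15 comparisons as independent (those in Figure 2 surely are not), and it uses the Bonferroni correction only because it is the simplest allowance for multiple comparisons, not because the authors should necessarily have used that one.

```python
# Rough sketch: how likely is at least one "significant" result among
# 15 comparisons when nothing at all is going on?
# ASSUMPTION: the comparisons are independent. That is not true of the
# comparisons in the paper, so treat the output as a rough guide only.

def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of one or more false positives in n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-comparison threshold that keeps the overall rate at or below alpha."""
    return alpha / n_tests


ALPHA = 0.05         # threshold stated in the paper's Methods section
N_COMPARISONS = 15   # comparisons in Figure 2 alone
SMALLEST_P = 0.02    # smallest P value among the comparisons

fwer = familywise_error_rate(ALPHA, N_COMPARISONS)
threshold = bonferroni_threshold(ALPHA, N_COMPARISONS)
survives = SMALLEST_P < threshold

print(f"Chance of at least one false positive: {fwer:.2f}")
print(f"Bonferroni threshold per comparison:   {threshold:.4f}")
print(f"Does P = {SMALLEST_P} survive the correction? {survives}")
```

On these assumptions there is roughly an even chance of getting at least one "significant" difference from pure noise, and the smallest P value is nowhere near the corrected threshold. More forgiving corrections exist, but none of them rescues a result like this.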
Keen though I am on statistics, this is a case where I prefer the eyeball test. It’s so obvious from the Figure that there’s nothing worth talking about happening, it’s a waste of time and money to torture the numbers to get "significant" differences. You have to be a slavish believer in P values to treat a result like that as anything but mildly suggestive. A glance at the Figure shows the effects, if there are any at all, are trivial.
I still maintain that the results don’t come within a million miles of justifying the authors’ stated conclusion “The addition of 12 sessions of five-element acupuncture to usual care resulted in improved health status and wellbeing that was sustained for 12 months.” Therefore I still believe that a proper course would have been to issue a new and more accurate press release. A brief admission that the interpretation was “overly-positive”, in a journal that the public can’t see, simply isn’t enough.
I can’t understand, either, why the editorial board did not insist on this being done. If they had done so, it would have been temporarily embarrassing, certainly, but people make mistakes, and it would have blown over. By not making a proper correction to the public, the episode has become a cause célèbre and the reputation of the journal will suffer permanent harm. This paper is going to be cited for a long time, and not for the reasons the journal would wish.
Misinformation, like that sent to the press, has serious real-life consequences. You can be sure that the paper as it still stands, will be cited by every acupuncturist who’s trying to persuade the Department of Health that he’s a "qualified provider".
There was not much unanimity in the discussion up to this point. Things got better when we talked about what a GP should do when there are no effective options. Roger Jones seemed to think it was acceptable to refer a patient to an alternative practitioner if that patient wanted it. I maintained that it’s unethical to explain to a patient how medicine works in terms of pre-scientific myths.
I’d have loved to hear the "informed consent" during which "The patient’s condition is explained in terms of imbalance in the five elements which then causes an imbalance in the whole person". If anyone had tried to explain my conditions in terms of an imbalance in my Wood, Water, Fire, Earth and Metal, I’d think they were nuts. The last author, Gerad Kite, runs a private clinic that sells acupuncture for all manner of conditions. You can find his view of science on his web site. It’s condescending and insulting to talk to patients in these terms. It’s the ultimate sort of paternalism. And paternalism is something that’s supposed to be vanishing in medicine. I maintained that this was ethically unacceptable, and that led to a more amicable discussion about the possibility of more honest placebos.
It was good of the editor to meet me in the circumstances. I don’t cast doubt on the honesty of his opinions. I simply disagree with them, both at the statistical level and the ethical level.
30 March 2014
I only just noticed that one of the authors of the paper, Bruce McCallum (who worked as an acupuncturist at Kite’s clinic), appeared in a 2007 Channel 4 News piece. It was a report on the pressure to save money by stopping NHS funding for “unproven and disproved treatments”. McCallum said that scientific evidence was needed to show that acupuncture really worked. Clearly he failed, but to admit that would have affected his income.
Watch the video (McCallum appears near the end).
I’m bored stiff with that barmiest of all the widespread forms of alternative medicine, homeopathy. It is clearly heading back to where it was in 1960, a small lunatic fringe of medicine. Nevertheless it’s worth looking at a recent development.
A paper has appeared by that arch defender of all things alternative, George Lewith.
The paper in question is “Homeopathy has clinical benefits in rheumatoid arthritis patients that are attributable to the consultation process but not the homeopathic remedy: a randomized controlled clinical trial”, Sarah Brien, Laurie Lachance, Phil Prescott, Clare McDermott and George Lewith. [read it here]. It was published in Rheumatology.
Conclusion. Homeopathic consultations but not homeopathic remedies are associated with clinically relevant benefits for patients with active but relatively stable RA.
So yet another case where the homeopathic pills turn out the same as placebos. Hardly surprising since the pills are the same as the placebos, but it’s always good to hear it from someone whose private practice sells homeopathy for money.
The conclusion isn’t actually very novel, because Fisher & Scott (2001) had already found nine years ago that homeopathy was ineffective in reducing the symptoms of joint inflammation in RA. That is Peter Fisher, the Queen’s homeopathic physician, and Clinical Director of the Royal Hospital for Integrated Medicine (recently renamed to remove ‘homeopathy’ from its title). That paper ends with the remarkable statement [download the paper]
- "Over these years we have come to believe that conventional RCTs [randomised controlled trials] are unlikely to capture the possible benefits of homeopathy . . . . It seems more important to define if homeopathists can genuinely control patients’ symptoms and less relevant to have concerns about whether this is due to a ‘genuine’ effect or to influencing the placebo response."
That seemed to me at the time to amount to an admission that it was all ‘placebo effect’, though Fisher continues to deny that this is the case.
"Homeopathy has clinical benefits in rheumatoid arthritis patients", the title says. But does it?
In fact this is mere spin. What the paper actually shows is that an empathetic consultation has some benefit (and even this is inconsistent). This is hardly surprising, but there is really not the slightest reason to suppose that the benefit, such as it is, has anything whatsoever to do with homeopathy.
Homeopathy, non-specific effects and good medicine is the title of an excellent editorial, in the same issue of Rheumatology, by Edzard Ernst. He points out that "The recognition of the therapeutic value of an empathetic consultation is by no means a new insight". Any therapy that provides only non-specific effects is unacceptable. Any good doctor provides that and, when it exists, real effective treatments too.
Lewith’s private clinic
The Centre for Complementary and Integrated Medicine is run by Drs Nick Avery and George Lewith. It is always a bit galling to real scientists, who often work 60 hours a week or more to get results, that people like Lewith get a professorial salary (in his case from the University of Southampton) but still have time to make more money by doing another job at the same time.
Avery is a homeopath. I wonder whether we can now look forward to the web site being changed in the near future so that there is a clear statement that the pills have no effect?
There is, at least, now no longer any mention of the Vega test on Lewith’s site. That is a test for food allergy that has been shown again and again to be fraudulent. The Environmental medicine page is brief, and avoids making any claims at all. It now contains the somewhat coy statement
“Specific food avoidance regimes are a controversial area and one in which there may be conflict between conventionally trained allergists and CAM practitioners.”
The front page of their web site boasts that "Dr George Lewith is now one of The Lifestyle 50!". "The Times, in an article on September 6th 2008, included George Lewith in The Lifestyle 50, this newspaper’s listing of the “top 50 people who influence the way we eat, exercise and think about ourselves”. Dr Lewith is included in the Alternatives category". It doesn’t mention that this is an honour he shares with such medical luminaries as Gillian ("I’m not a doctor") McKeith, Jennifer Aniston and the Pope.
But let’s end this on a happier note. There is one thing that I agree with wholeheartedly. Lewith says
"The use of bottled water seems to me to be a multi-billion pound industry, based on some of the cleverest marketing that I have ever encountered. There is absolutely no evidence that bottled water is any safer, better, or more “energising” than the water you get from the tap."
No connection of course with the multi-million pound industry of selling homeopathic water by clever marketing.
Some limitations of the paper by Brien et al.
Like any good trial, this one defined in advance a primary and secondary outcome.
The primary outcome was ACR20, which means the proportion of patients that showed an improvement of at least 20% of the number of tender and swollen joint counts and 20% improvement in 3 of the 5 remaining ACR core set measures (see Felsen 1995). Although it isn’t stressed in the paper, there was no detectable difference between consultation vs no consultation for this primary outcome.
The secondary outcome was 28-joint DAS (DAS-28), tender and swollen joint count, disease severity, pain, weekly patient and physician GA and pain, and inflammatory markers (see, for example, Stucki, 1996). It was only on this outcome that an effect was seen between consultation and no consultation. The "effect size" (standardized mean score differences, adjusted for baseline differences) was an improvement of 0.7 in DAS-28 score, which runs on a scale from 0 – 10. Although this improvement is probably real (statistically significant), it is barely bigger than the improvement of 0.6 which is said to be the smallest change that is clinically significant (Stucki, 1996).
Not only is the improvement by the consultation small in clinical terms. It is also rather inconsistent. For example, Table 6 shows that the consultation seemed to result in a detectable effect on swollen joint count, but not on tender joint count. Neither was there any significant effect of the consultation on the response to “considering all the ways your arthritis affects you, please make a vertical line to show how well you are now”. There appeared to be an improvement on “negative mood score”, but not on “positive mood score”. Effects of the consultation on pain scores were marginal at best.
It seems to me that the conclusion that the consultation process helps patients, though not entirely implausible, gets marginal support from this study. It may be real, but if so it isn’t a large effect.
Like most alternative medicine advocates, the authors of this paper make the mistake of confusing caring and curing. Caring is good if there is nothing else that can be done (as is only too often the case). But what patients really want is cures and they’ll never get that from an empathetic consultation.
The problem of Human Resources
What does all this mean for alternative medicine on the NHS? Nobody denies the desirability of empathy. In fact it is now talked about so much that there is a danger that scientific medical education will be marginalised. My own experience of the NHS is that most doctors are quite good at empathy, without any need to resort to hocus pocus like homeopathy and all the myriad forms of mythical medicine.
It must be said that Drs Avery and Lewith have had proper medical training. Their views on alternative medicine seem bizarre to me, but at least they should do no great harm. Sadly, the same can’t be said for the majority of homeopaths who have no medical training and who continue to endanger the public by recommending sugar pills for anything from malaria to Dengue fever. People like that have no place in the NHS. Indeed some are in jail.
Not long ago, I was invited to tour the oncology wards at UCL hospital with their chief spiritual healer, Angie Buxton-King. Although in her private practice she offers some pretty bizarre services like healing humans and animals at a distance, I had the impression that on the wards she did a pretty good job holding hands with people who were nervous about injections and chatting to people in for their third lot of chemotherapy. I asked if she would object to being called a "supportive health care worker" rather than a spiritual healer. Somewhat reluctantly she said that she wouldn’t mind that. But it can’t be done because of the absurd box-ticking mentality of HR departments. There is no job-description for someone who holds hands with patients, and no formal qualifications. On the other hand, if you are sufficiently brainless, you can tick a box for a healer. Once again I wish that HR departments would not hinder academic integrity.
Steven Novella, at Science-Based medicine, has also written about this paper.
This post recounts a complicated story that started in January 2009, but has recently come to what looks like a happy ending. The story involves over a year’s writing of letters and meetings, but for those not interested in the details, I’ll start with a synopsis.
Synopsis of the synopsis
In January 2009, a course in "integrated medicine" was announced that, it was said, would be accredited by the University of Buckingham. The course was to be led by Drs Rosy Daniel and Mark Atkinson. So I sent an assessment of Rosy Daniel’s claims to "heal" cancer to Buckingham’s VC (president), Terence Kealey. After meeting Karol Sikora and Rosy Daniel, I sent an analysis of the course tutors to Kealey, who promptly demoted Daniel, and put Prof Andrew Miles in charge of the course. The course went ahead in September 2009. Despite Miles’ efforts, the content was found to be altogether too alternative. The University of Buckingham has now terminated its contract with the "Faculty of Integrated Medicine", and the course will close. Well done, Buckingham.
- January 2009. I saw an announcement of a Diploma in Integrated Medicine, to be accredited by the University of Buckingham (UB). The course was to be run by Drs Rosy Daniel and Mark Atkinson of the College of Integrated Medicine, under the nominal directorship of Karol Sikora (UB’s Dean of Medicine). I wrote to Buckingham’s vice-chancellor (president), Terence Kealey, and attached a reprint of Ernst’s paper on carctol, a herbal cancer ‘remedy’ favoured by Daniel.
- Unlike most vice-chancellors, Kealey replied at once and asked me to meet Sikora and Daniel. I met first Sikora alone, and then, on March 19 2009, both together. Rosy Daniel gave me a complete list of the speakers she’d chosen. Most were well-known alternative people, some, in my view, the worst sort of quack. After discovering who was to teach on the proposed course, I wrote a long document about the proposed speakers and sent it to the vice-chancellor of the University of Buckingham, Terence Kealey, on March 23rd 2009. Unlike most VCs, he took it seriously. At the end of this meeting I asked Sikora, who was in nominal charge of the course, how many of the proposed tutors he’d heard of. The answer was "none of them".
- Shortly before this meeting, I submitted a complaint to Trading Standards about Rosy Daniel’s commercial site, HealthCreation, for what seemed to me to be breaches of the Cancer Act 1939, by claims made for Carctol. Read the complaint.
- On 27th April 2009, I heard from Kealey that he’d demoted Rosy Daniel from being in charge of the Diploma and appointed Andrew Miles, who had recently been appointed as Buckingham’s Professor of Public Health Education and Policy & Associate Dean of Medicine (Public Health). Terence Kealey said "You’ve done us a good turn, and I’m grateful". Much appreciated. Miles said the course “needs in my view a fundamental reform of content. . . ”
- Although Rosy Daniel had been demoted, she was still in charge of delivering the course at what had, by this time, changed its name to the Faculty of Integrated Medicine which, despite its name, is not part of the university.
- Throughout the summer I met Miles (of whom more below) several times and exchanged countless emails, but still didn’t get the revised list of speakers. He also talked with Michael Baum and Edzard Ernst. The course went ahead on 30 September 2009.
- By January 2010, Miles came to accept that the course was too high on quackery to be a credit to the university, and simply fired The Faculty of Integrated Medicine. Their contract was not renewed. Inspection of the speakers, even after revision of the course, shows why.
- As a consequence, it is rumoured that Daniel is trying to sell the course to someone else. The University of Middlesex, and unbelievably, the University of Bristol, have been mentioned, as well as Thames Valley University, the University of Westminster, the University of Southampton and the University of East London. Will the VCs of these institutions not learn something from Buckingham’s experience? It is to be hoped that they would at the very least approach Buckingham to ask pertinent questions. But perhaps a more likely contender for an organisation with sufficient gullibility is the Prince of Wales’ newly announced College of Integrated Medicine. [but see stop press]
The details of the story
The University of Buckingham (UB) is the only private university in the UK. Recently it announced its intention to start a school of medicine (the undergraduate component is due to start in September 2011). The dean of the new school is Karol Sikora.
Karol Sikora shot to fame after he appeared in a commercial in the USA. The TV commercial was sponsored by a far-right Republican campaign group, “Conservatives for Patients’ Rights”. It was designed to prevent the election of Barack Obama, by pouring scorn on the National Health Service. A very curious performance. Very curious indeed. And then there was a bit of disagreement about the titles that he claimed to have.
As well as being dean of medicine at UB, Karol Sikora is also medical research director of CancerPartnersUK, a private cancer treatment company. He must be a very busy man.
Karol Sikora’s attitude to quackery is a mystery wrapped in an enigma. As well as being a regular oncologist, he is also a Foundation Fellow of that well known source of unreliable information, The Prince of Wales Foundation for Integrated Health. He spoke at their 2009 conference.
In the light of that, perhaps it is not, after all, so surprising that the first action of UB’s medical school was to accredit a course, a Diploma in Integrated Medicine. This course has been through two incarnations. The first prospectus (created 21 January 2009) advertised the course as being run by the British College of Integrated Medicine. But by the time that UB issued a press release in July 2009, the accredited outfit had changed its name to the Faculty of Integrated Medicine. That grand title makes it sound like part of a university. It isn’t.
Rosy Daniel runs a company, Health Creation, which, among other things, recommended a herbal concoction, Carctol, to "heal" cancer. I wrote to Buckingham’s vice-chancellor (president), Terence Kealey, and attached a reprint of Ernst’s paper on Carctol. Unlike most university vice-chancellors, he took it seriously. He asked me to meet Karol Sikora and Rosy Daniel to discuss it. After discovering who was teaching on this course, I wrote a document about their backgrounds and sent it to Terence Kealey. The outcome was that he removed Rosy Daniel as course director and appointed in her place Andrew Miles, with a brief to reorganise the course. A new prospectus, dated 4 September 2009, appeared. The course is not changed as much as I’d have hoped, although Miles assures me that while the lecture titles themselves may not have changed, he had ordered fundamental revisions to the teaching content and the teaching emphases.
In the new prospectus the British College of Integrated Medicine has been renamed as the Faculty of Integrated Medicine, but it appears to be otherwise unchanged. That’s a smart bit of PR. The word “Faculty” makes it sound as though the college is part of a university. It isn’t. The "Faculty" occupies some space in the Apthorp Centre in Bath, which houses, among other things, Chiropract, Craniopathy (!) and a holistic vet.
The prospectus now starts thus.
The Advisory Board consists largely of well-known advocates of alternative medicine (more information about them below).
Most of these advisory board members are the usual promoters of magic medicine. But three of them seem quite surprising: Stafford Lightman, Nigel Sparrow and Nigel Mathers.
Stafford Lightman? Well actually I mentioned to him in April that his name was there and he asked for it to be removed, on the grounds that he’d had nothing to do with the course. It wasn’t removed for quite a while, but the current advisory board has none of these people. Nigel Sparrow and Nigel Mathers, as well as Lightman, sent letters of formal complaint to Miles and Terence Kealey, the VC of Buckingham, to complain that their involvement in Rosy Daniel’s set-up had been fundamentally misrepresented by Daniel. With these good scientists having extricated themselves from Daniel’s organisation, the FIM has only people who are firmly in the alternative camp (or quackery, as I’d prefer to call it). For example, people like Andrew Weil and George Lewith.
Andrew Weil, for example, while giving his address as the University of Arizona, is primarily a supplement salesman. He was recently reprimanded by the US Food and Drug Administration:
“Advertising on the site, the agencies said in the Oct. 15 letter, says “Dr. Weil’s Immune Support Formula can help maintain a strong defense against the flu” and claims it has “demonstrated both antiviral and immune-boosting effects in scientific investigation.”
The claims are not true, the letter said, noting the “product has not been approved, cleared, or otherwise authorized by FDA for use in the diagnosis, mitigation, prevention, treatment, or cure of the H1N1 flu virus.”
This isn’t the first time I’ve come across people’s names being used to support alternative medicine without the consent of the alleged supporter. There was, for example, the strange case of Dr John Marks and Patrick Holford.
Misrepresentation of this nature seems to be the order of the day. Could it be that people like Rosy Daniel are so insecure or, indeed, so unimportant within the Academy in real terms (where is there evidence of her objective scholarly or clinical stature?), that they seek to attach themselves, rather like limpets to fishing boats, to people of real stature and reputation, in order to boost their own or others’ view of themselves by a manner of proxy?
When the course was originally proposed, a brochure appeared. It said accreditation by the University of Buckingham was expected soon.
Not much detail appeared in the brochure. Fine words are easy to write but what matters is who is doing the teaching. So I wrote to the vice-chancellor of Buckingham, Terence Kealey. I attached a reprint of Ernst’s paper on carctol, a herbal cancer ‘remedy’ favoured by Daniel (download the cached version of her claims, now deleted).
Kealey is regarded in much of academia as a far-right maverick, because he advocates ideas such as that science research should get no public funding, and that universities should charge full whack for student fees. He has, in fact, publicly welcomed the horrific cuts being imposed on the Academy by Lord Mandelson. His piece in The Times started
“Wonderful news. The Government yesterday cut half a billion pounds from the money it gives to universities”
though the first comment on it starts
"Considerable accomplishment: to pack all these logical fallacies and bad metaphors in only 400 words"
He and I are probably at opposite ends of the political spectrum. Yet he is the only VC who has been willing to talk about questions like this. Normally letters to vice-chancellors about junk degrees go unanswered. Not so with Kealey. I may disagree with a lot of his ideas, but he is certainly someone you can do business with.
Kealey responded quickly to my letter, sent in January 2009, pointing out that Rosy Daniel’s claims about Carctol could not be supported and were possibly illegal. He asked me to meet Sikora and Daniel. I met first Sikora alone, and then, on March 19 2009, both together. Rosy Daniel gave me a complete list of the speakers she’d chosen to teach on this new Diploma on IM.
After discovering who was to teach on the proposed course, I wrote a long document about the proposed speakers and sent it to Terence Kealey on March 23rd 2009. It contained many names that will be familiar to anyone who has taken an interest in crackpot medicine, combined with a surprisingly large element of vested financial interests. Unlike most VCs, Kealey took it seriously.
The remarkable thing about this meeting was that I asked Sikora how many names were familiar to him on the list of people who had been chosen by Rosy Daniel to teach on the course. His answer was "none of them". Since his name and picture feature in all the course descriptions, this seemed like dereliction of duty to me.
After seeing my analysis of the speakers, Terence Kealey reacted with admirable speed. He withdrew the original brochure, demoted Rosy Daniel (in principle anyway) and brought in Prof Andrew Miles to take responsibility for the course. This meant that he had to investigate the multiple conflicts of interests of the various speakers and to establish some sort of way forward in the ‘mess’ of what had been agreed before Miles’ appointment to Buckingham.
Miles is an interesting character, a postdoctoral neuroendocrinologist, turned public health scientist. I’d come across him before as editor-in-chief of the Journal of Evaluation in Clinical Practice. This is a curious journal that is devoted mainly to condemning Evidence Based Medicine. Much of its content seems to be in a style that I can only describe as post-modernist-influenced libertarian.
The argument turns on what you mean by ‘evidence’ and, in my opinion, Miles underestimates greatly the crucial problem of causality, a problem that can be solved only by randomisation. His recent views on the topic can be read here.
An article in Miles’ journal gives its flavour: "Andrew Miles, Michael Loughlin and Andreas Polychronis, Medicine and evidence: knowledge and action in clinical practice". Journal of Evaluation in Clinical Practice 2007, 13, 481–503 [download pdf]. This paper launches an attack on Ben Goldacre, in the following passage.
“Loughlin identifies Goldacre  as a particularly luminous example of a commentator who is able not only to combine audacity with outrage, but who in a very real way succeeds in manufacturing a sense of having been personally offended by the article in question. Such moralistic posturing acts as a defence mechanism to protect cherished assumptions from rational scrutiny and indeed to enable adherents to appropriate the ‘moral high ground’, as well as the language of ‘reason’ and ‘science’ as the exclusive property of their own favoured approaches. Loughlin brings out the Orwellian nature of this manoeuvre and identifies a significant implication.”
"If Goldacre and others really are engaged in posturing then their primary offence, at least according to the Sartrean perspective adopted by Murray et al. is not primarily intellectual, but rather it is moral. Far from there being a moral requirement to ‘bend a knee’ at the EBM altar, to do so is to violate one’s primary duty as an autonomous being.”
This attack on one of my heroes was occasioned because he featured one of the most absurd pieces of post-modernist bollocks ever, in his Guardian column in 2006. I had a go at the same paper on this blog, as well as an earlier one by Christine Barry, along the same lines. There was some hilarious follow-up on badscience.net. After this, it is understandable that I had not conceived a high opinion of Andrew Miles. I feared that Kealey might have been jumping out of the frying pan into the fire.
After closer acquaintance I have changed my mind. In the present saga Andrew Miles has done an excellent job. He started off by sending me links to heaven knows how many papers on medical epistemology, to Papal Encyclicals on the proposed relationship between Faith and Reason and, on more than one occasion, articles from the Catholic Herald (yes, I did read it). This is not entirely surprising, as Miles is a Catholic priest as well as a public health academic, so has two axes to grind. But after six months of talking, he now sends me links to junk science sites of the sort that I might get from, ahem, Ben Goldacre.
Teachers on the course
Despite Andrew Miles’ best efforts, he came in too late to prevent much of the teaching being done in the parallel universe of alternative medicine. The University of Buckingham had a pre-Miles, legally-binding contract (now terminated) with the Faculty of Integrated Medicine, and the latter is run by Dr Rosy Daniel and Dr Mark Atkinson. Let’s take a look at their record.
Rosy Daniel BSc, MBBCh
Dr Rosy Daniel first came to my attention through her commercial web site, Health Creation. This site, among other things, promoted an untested herbal concoction, Carctol, for "healing" cancer.
Carctol: Profit before Patients? is a review by Edzard Ernst of the literature, such as it is, and concludes
Carctol and the media hype surrounding it must have given many cancer patients hope. The question is whether this is a good or a bad thing. On the one hand, all good clinicians should inspire their patients with hope. On the other hand, giving hope on false pretences is cruel and unethical. Rosy Daniel rightly points out that all science begins with observations. But all science then swiftly moves on and tests hypotheses. In the case of Carctol, over 20 years of experience in India and almost one decade of experience in the UK should be ample time to do this. Yet, we still have no data. Even the small number of apparently spectacular cases observed by Dr. Daniel have not been published in the medical literature.
On this basis I referred Health Creation to a Trading Standards officer for a prima facie breach of the Cancer Act 1939. [Download the complaint document]. Although no prosecution was brought by Trading Standards, they did request changes in the claims that were being made. Here is an example.
A Google search of the Health Creation site for “Carctol” gives a link
Dr Daniel has prescribed Carctol for years and now feels she is seeing a breakthrough. Dr Daniel now wants scientists to research the new herbal medicine
But going to the link produces
You are not authorized to access this page.
You can download the cached version of this page, which shows the sort of claims that were being made before Trading Standards Officers stepped in. There are now only a few oblique references to Carctol on the Health Creation site, e.g. here.
Both Rosy Daniel and Karol Sikora were speakers at the 2009 Prince’s Foundation Conference, in some odd company.
Mark Atkinson MBBS BSc (Hons) FRIPH
Dr Mark Atkinson is co-leader of the FiM course. He is also a supplement salesman, and he has promoted the Q-link pendant. The Q-link pendant is a simple and obvious fraud designed to exploit paranoia about WiFi killing you. When Ben Goldacre bought one and opened it, he found
“No microchip. A coil connected to nothing. And a zero-ohm resistor, which costs half a penny, and is connected to nothing.”
Nevertheless, Mark Atkinson has waxed lyrical about this component-free device.
“As someone who used to get tired sitting in front of computers and used to worry about the detrimental effects of external EMF’s, particularly as an avid user of mobile phones, I decided to research the various devices and technologies on the market that claim to strengthen the body’s subtle energy fields. It was Q Link that came out top. As a Q link wearer, I no longer get tired whilst at my computer, plus I’m enjoying noticeably higher energy levels and improved mental performance as a result of wearing my Q Link. I highly recommend it.” Dr Mark Atkinson, Holistic Medical Physician
Mark Atkinson is also a fan of Emo-trance. He wrote, in Now Magazine,
"I wanted you to know that of all the therapies I’ve trained in and approaches that I have used (and there’s been a lot) none have excited me and touched me so deeply than Emotrance."
"Silvia Hartmann’s technique is based on focusing your thoughts on parts of your body and guiding energy. It can be used for everything from insomnia to stress. The good news is that EmoTrance shows you how to free yourself from these stuck emotions and release the considerable amounts of energy that are lost to them."
Aha, so this particular form of psychobabble is the invention of Silvia Hartmann. Silvia Hartmann came to my attention because her works feature heavily in one of the University of Westminster’s barmier “BSc” degrees, in ‘naturopaths’, described here. She is famous, apart from Emo-trance, for her book Magic, Spells and Potions.
“Dr Hartmann has created techniques that will finally make magic work for you in ways you never believed to be possible.”
Times Higher Education printed a piece with the title ‘Energy therapy’ project in school denounced as ‘psychobabble’. They’d phoned me a couple of days earlier to see whether I had an opinion about “Emotrance”. As it happens, I knew a bit about it because it had cropped up in a course given at, guess where, the University of Westminster. It seems that a secondary school had bought this extreme form of psychobabble. The comments on the Times Higher piece were unusually long and interesting.
It turned out not only that the inventor of “Emotrance”, Dr Silvia Hartmann PhD., wrote books about magic spells and potions, but also that her much vaunted doctorate had been bought from the Universal Life Church, current cost $29.99.
The rest of the teachers
The rest of the teachers on the course, despite valiant attempts at vetting by Andrew Miles, include many names only too well-known to anybody who has taken an interest in pseudo-scientific medicine. Here are some of them.
Damien Downing: even the Daily Mail sees through him. Enough said.
About Kim A. Jobst
Consultant, Wholystic Care Physician [sic!] , Medical Homoeopath, Specialist in Neurodegeneration and Dementia, using food state nutrition, diet and lifestyle to facilitate Healing and Growth;
Catherine Zollman, well known ally of HRH and purveyor of woo.
Harald Walach, another homeopath, fond of talking nonsense about "quantum effects".
Nicola Hembry, a make-believe nutritionist and advocate of vitamin C and laetrile for cancer
Simon Mills, a herbalist who is inclined to diagnoses like “hot damp”, to be treated with herbs that tend to “cool and dry.”
David Peters, of the University of Westminster. Enough said.
Nicola Robinson of Thames Valley University. Advocate of unevidenced treatments.
Michael Dixon, of whom more here.
And last but not least,
The University of Buckingham removes accreditation of the Faculty of Integrated Medicine
The correspondence has been long and, at times, quite blunt. Here are a few quotations from it. The University of Buckingham, being private, is exempt from the Freedom of Information Act (2000) but nevertheless they have allowed me to reproduce the whole of the correspondence. The University, through its VC, Terence Kealey, has been far more open than places that are in principle subject to FOIA, but which, in practice, always try to conceal material. I may post the lot, as time permits, but meanwhile here are some extracts. They make uncomfortable reading for advocates of magic medicine.
Miles to Daniel, 8 Dec 2009
” . . . now that the University has taken his [Sikora’s] initial advice in trialing the DipSIM and has found it cost-ineffective, the way forward is therefore to alter that equation through more realistic financial contribution from IHT/FIM at Bath or to view the DipSIM as an experiment that has failed and which must give way to other more viable initiatives."
"The University is also able to confirm that we hold no interest in jointly developing any higher degrees on the study of IM with IHT/FIM at Bath. This is primarily because we are developing our own Master’s degree in Medicine of the Person in collaboration with various leading international societies and scholars including the WHO and which is based on a different school of thought. "
Miles to Daniel 15 Dec 2009
It appears that you have not fully assimilated the content of my earlier e-mails and so I will reiterate the points I have already made to you and add to them.
The DipSIM is an external activity – in fact, it is an external collaboration and nothing more. It is not an internal activity and neither is it in any way part of the medical school and neither will it become so and so the ‘normal rules’ of academic engagement and scholarly interchange do not apply. Your status is one of external collaborator and not one of internal or even visiting academic colleague. There is no “joint pursuit” of an academically rigorous study of IM by UB and IHT/FIM beyond the DipSIM and there are no plans, and never have been, for the “joint definition of research priorities” in IM. The DipSIM has been instituted on a trial basis and this has so far shown the DipSIM to be profoundly cost-ineffective for the University. You appear to misunderstand this – deliberately or otherwise."
Daniel to Miles 13 Jan 2010
"However, I am aware that weather permitting you and Karol will be off to the Fellows meeting for the newly forming National College (for which role I nominated you to Dr Michael Dixon and Prof David Peters.)
I have been in dialogue with Michael and Boo Armstrong from FIH and they are strongly in favour of forming a partnership with FIM so that we effectively become one of many new faculties within the College (which is why we change our name to FIM some months ago).
I have told Michael about the difficulties we are having and he sincerely hopes that we can resolve them so that we can all move forward as one. "
Miles to Daniel 20 Jan 2010
"Congratulations on the likely integration of your organisation into the new College of Integrative Health which will develop out of the Prince’s Foundation for Integrated Health. This will make an entirely appropriate home for you for the longer term.
Your image of David Colquhoun "alive and kicking" as the Inquisitor General, radiating old persecutory energy and believing "priestess healers" (such as you describe yourself) to be best "tortured, drowned and even burnt alive", will remain with me, I suspect, for many years to come (!). But then, as the Inquisitor General did say, ‘better to burn in this life than in the next’ (!). Overall, then, I reject your conclusion on the nature of the basis of my decision making and playfully suggest that it might form part of the next edition of Frankfurt’s recent volume ["On Bullshit"] http://press.princeton.edu/titles/7929.html I hope you will forgive my injection of a little academic humour in an otherwise formal and entirely serious communication.
The nature of IM, with its foundational philosophy so vigorously opposed by mainstream medicine and the continuing national and international controversies which engulf homeopaths, acupuncturists, herbalists, naturopaths, transcendental meditators, therapeutic touchers, massagers, reflexologists, chiropractors, hypnotists, crystal users, yoga practitioners, aromatherapists, energy channelers, chinese medicine practitioners et al, can only bring the University difficulties as we seek to establish a formal and internationally recognised School of Medicine and School of Nursing.
I do not believe my comments in relation to governance at Bath are "offensive". They are, on the contrary, entirely accurate and of concern to the University. There have been resignations at senior level from your Board due to misrepresentation of your position and there has been a Trading Standards Authority investigation into further instances of misrepresentation. I am advised that an audit is underway of your compliance with the Authority’s instructions. You have therefore not dealt with my concerns, you have merely described them as "offensive".
I note from your e-mail that you are now in discussions with other universities and given the specific concerns of the University of Buckingham which I have dealt with exhaustively in this and other correspondences and the incompatibility of the developments at UB with the DipSIM and your own personal ambitions, etc., I believe you to have taken a very wise course and I wish you well in your negotiations. In these circumstances I feel it appropriate to enhance those negotiations by confirming that the University of Buckingham will not authorise the intake of a second cohort of students and that the relationship between IHT and the University will cease following the graduation of those members of the current course that are successful in their studies – the end of February 2011."
From Miles 2 Feb 2010
"Here is the list of teachers – you can subtract me (I withdrew from teaching when the antics ay Bath started) and also Professor John Cox (Former President of The Royal College of Psychiatrists and Former Secretary General of the World Psychiatric Association) who withdrew when he learned of some of the stuff going on…. Karol Sikora continues to teach. Michael Loughlin and Carmel Martin are both good colleagues of mine and, I can assure you – taught the students solid stuff! Michael taught medical epistemology and Carmel the emerging field of systems complexity in health services (Both of them have now withdrawn from teaching commitments).
The tutors shown are described by Rosy as the finest minds in IM teaching in the country. I interviewed them all personally on (a) the basis of an updated CV & (b) via a 30 min telephone interview with me personally. Some were excluded from teaching because they were not qualified to do so academically (e.g. Boo Armstrong, Richard Falmer, not even a first degree, etc, etc., but gave a short presentation in a session presided over by an approved teacher) and others were approved because of their academic qualifications, PhD, MD, FRCP etc etc etc) and activity within the IM field. Each approved teacher was issued with highly specific teaching guidance from me (no bias, reference to opposing schools of thought, etc etc) and each teacher was required to complete and sign a Conflicts of Interest form. All of these documentations are with me here. Short of all this governance it’s impossible to bar them from teaching because who else would then do it?! Anyway, the end is in sight – Hallelujah! "
From Miles 19 Feb 2010
Just got back to the office after an excellent planning meeting for the new Master’s Degree in Person-centred Medicine and a hearty (+ alcoholic) lunch at the Ath! Since I shall never be a FRS, the Ath seems to me the next best ‘club’ (!). Michael Baum is part of the steering committee and you might like to take his thoughts on the direction of the programme. Our plans may even find their way into your Blog as an example of how to do things (vs how not to do things, i.e. CAM, IM, etc!). This new degree will sit well alongside the new degrees in Public Health – i.e. the population/utilitarian outlook of PH versus the individual person-centred approach., etc. "
And an email from a senior UB spokesperson
"Rumour has it that now that Buckingham has dismissed the ‘priestess healer of Bath’, RD [Rosy Daniel] , explorations are taking place with other universities, most of which are subject to FoI request from DC at the time of writing. Will these institutions have to make the same mistakes Buckingham did before taking the same action? Rumour also has it that RD changed the name of her institution to FIM in order to fit neatly into the Prince’s FIH, a way, no doubt, of achieving ‘protection’ and ‘accreditation’ in parallel with particularly lucrative IM ‘education’ (At £9,000 a student and with RD’s initial course attracting 20 mainly GPs, that’s £180,00 – not bad business…. And Buckingham’s ‘share of this? £12,000!”
The final bombshell; even the Prince of Wales’ FIH rejects Daniel and Atkinson?
Only today (31 March) I was sent, from a source that I can’t reveal, an email which comes from someone who "represent the College and FIH . . . ". This makes it clear that the letter comes from the Prince of Wales’ Foundation for Integrated Health.
Dr Rosy Daniel BSc MBBCh
Director of the Faculty of Integrated Medicine
Medical Director Health Creation
30th March 2010
RE: Your discussion paper and recent correspondence
Thank you for meeting with [XXXXXX] and myself this evening to discuss your proposals concerning a future relationship between your Faculty of Integrated Medicine and the new College. As you know, he and I have been asked to represent the College and FIH in this matter.
We are aware of difficulties facing your organisations and the FIM DipSIM course. As a consequence of these, it is not possible for the College to enter into an association with you, any of your organisations nor the DipSIM course at the present time. It would, therefore, be wrong to represent to others that any such association has been agreed.
You will appreciate that, in these circumstances, you will not receive an invitation to the meeting of 15th April 2010 nor to other planned events.
I am sorry to disappoint you in this matter.
I’ll confess to feeling almost a little guilty for having appeared to persecute the particular individuals involved in this episode. But patients are involved and so is the law, and both of these are more important than individuals. The only unfair aspect is that, while it seems that even the Prince of Wales’ Foundation for Integrated Health has rejected Daniel and Atkinson, that Foundation embraces plenty of people who are just as deluded, and potentially dangerous, as those two. The answer to that problem is for the Prince to stop endorsing treatments that don’t work.
As for the University of Buckingham. Well, despite the ‘right wing maverick’ Kealey and the ‘anti-evidence’ Miles, I really think they’ve done the right thing. They’ve listened, they’ve maintained academic rigour and they’ve released all information for which I asked and a lot more. Good for them, I say.
15 April 2010. This story was reported by Times Higher Education, under the title “It’s terminal for integrated medicine diploma”. That report didn’t attract comments. But on 25th April Dr Rosy Daniel replied with “‘Terminal’? We’ve only just begun”. This time there were some feisty responses. Dr Daniel really should check her facts before getting into print.
3 March 2011. Unsurprisingly, Dr Daniel is up and running again, under the name of the British College of Integrated Medicine. The only change seems to be that Mark Atkinson has jumped ship altogether, and, of course, she is now unable to claim endorsement by Buckingham, or any other university. Sadly, though, Karol Sikora seems to have learned nothing from the saga related above. He is still there as chair of the Medical Advisory Board, along with the usual suspects mentioned above.
This article has been reposted on The Winnower, and now has a digital object identifier DOI: 10.15200/winn.142934.47856
This post is not about quackery, nor university politics. It is about inference. How do we know what we should eat? The question interests everyone, but what do we actually know? Not as much as you might think from the number of column-inches devoted to the topic. The discussion below is a synopsis of parts of an article called “In praise of randomisation”, written as a contribution to a forthcoming book, Evidence, Inference and Enquiry.
About a year ago just about every newspaper carried a story much like this one in the Daily Telegraph,
Sausage a day can increase bowel cancer risk
By Rebecca Smith, Medical Editor Last Updated: 1:55AM BST 31/03/2008
What, I wondered, was the evidence behind these dire warnings? They did not come from a lifestyle guru, a diet faddist or a supplement salesman. This is nothing to do with quackery. The numbers come from the 2007 report of the World Cancer Research Fund and American Institute for Cancer Research, with the title ‘Food, Nutrition, Physical Activity, and the Prevention of Cancer: a Global Perspective’. This is a 537 page report with over 4,400 references. Its panel was chaired by Professor Sir Michael Marmot, UCL’s professor of Epidemiology and Public Health. He is a distinguished epidemiologist, renowned for his work on the relation between poverty and health.
Nevertheless there has never been a randomised trial to test the carcinogenicity of bacon, so it seems reasonable to ask how strong is the evidence that you shouldn’t eat it? It turns out to be surprisingly flimsy.
In praise of randomisation
Everyone knows about the problem of causality in principle. Post hoc ergo propter hoc; confusion of sequence and consequence; confusion of correlation and cause. This is not a trivial problem. It is probably the main reason why ineffective treatments often appear to work. It is traded on by the vast and unscrupulous alternative medicine industry. It is, very probably, the reason why we are bombarded every day with conflicting advice on what to eat. This is a bad thing, for two reasons. First, we end up confused about what we should eat. But worse still, the conflicting nature of the advice gives science as a whole a bad reputation. Every time a white-coated scientist appears in the media to tell us that a glass of wine per day is good/bad for us (delete according to the phase of the moon) the general public just laugh.
In the case of sausages and bacon, suppose that there is a correlation between eating them and developing colorectal cancer. How do we know that it was eating the bacon that caused the cancer – that the relationship is causal? The answer is that there is no way to be sure if we have simply observed the association. It could always be that the sort of people who eat bacon are also the sort of people who get colorectal cancer. But the question of causality is absolutely crucial, because if it is not causal, then stopping eating bacon won’t reduce your risk of cancer. The recommendation to avoid all processed meat in the WCRF report (2007) is sensible only if the relationship is causal. Barker Bausell said:
[Page39] “But why should nonscientists care one iota about something as esoteric as causal inference? I believe that the answer to this question is because the making of causal inferences is part of our job description as Homo Sapiens.”
That should be the mantra of every health journalist, and every newspaper reader.
The essential basis for causal inference was established over 70 years ago by that giant of statistics Ronald Fisher, and that basis is randomisation. Its first popular exposition was in Fisher’s famous book, The Design of Experiments (1935). The Lady Tasting Tea has become the classical example of how to design an experiment.
Briefly, a lady claims to be able to tell whether the milk was put in the cup before or after the tea was poured. Fisher points out that to test this you need to present the lady with an equal number of cups that are ‘milk first’ or ‘tea first’ (but otherwise indistinguishable) in random order, and count how many she gets right. There is a beautiful analysis of it in Stephen Senn’s book, Dicing with Death: Chance, Risk and Health. As it happens, Google books has the whole of the relevant section, Fisher’s tea test (geddit?), but buy the book anyway. Such is the fame of this example that it was used as the title of a book: The Lady Tasting Tea, by David Salsburg (my review of it is here).
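To make the logic of the tea test concrete, here is a minimal sketch of the arithmetic. It assumes Fisher's classic design of eight cups, four of each kind, with the lady told to pick out the four that were 'milk first'. The function name and defaults are my own, chosen purely for illustration.

```python
from math import comb

def prob_exactly(k_correct, n_cups=8, n_milk_first=4):
    """Chance, under pure guessing, that exactly k_correct of the cups
    the lady labels 'milk first' really are milk first.

    She picks n_milk_first cups out of n_cups, so the number she gets
    right follows a hypergeometric distribution.
    """
    n_tea_first = n_cups - n_milk_first
    favourable = comb(n_milk_first, k_correct) * comb(n_tea_first, n_milk_first - k_correct)
    return favourable / comb(n_cups, n_milk_first)

# Probability of a perfect score by luck alone: 1 way out of C(8,4) = 70.
p_perfect = prob_exactly(4)

# Probability of getting at least three of the four right by luck alone.
p_three_or_more = prob_exactly(3) + prob_exactly(4)

print(f"all four right by chance:    {p_perfect:.4f}")
print(f"three or more right by chance: {p_three_or_more:.4f}")
```

With eight cups, only a perfect score is rare enough under guessing (1 in 70) to be at all convincing. Three out of four happens by luck alone about a quarter of the time, which is why the number of cups matters.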
Most studies of diet and health fall into one of three types: case-control studies, cohort (or prospective) studies, or randomised controlled trials (RCTs). Case-control studies are the least satisfactory: they look at people who already have the disease and look back to see how they differ from similar people who don’t have the disease. They are retrospective. Cohort studies are better because they are prospective: a large group of people is followed for a long period and their health and diet are recorded and later their disease and death are recorded. But in both sorts of studies, each person decides for him/herself what to eat or what drugs to take. Such studies can never demonstrate causality, though if the effect is really big (like cigarette-smoking and lung cancer) they can give a very good indication. The difference in an RCT is that each person does not choose what to eat, but their diet is allocated randomly to them by someone else. This means that, on average, all other factors that might influence the response are balanced equally between the two groups. Only RCTs can demonstrate causality.
Randomisation is a rather beautiful idea. It allows one to remove, in a statistical sense, bias that might result from all the sources that you hadn’t realised were there. If you are aware of a source of bias, then measure it. The danger arises from the things you don’t know about, or can’t measure (Senn, 2004; Senn, 2003). Although it guarantees freedom from bias only in a long run statistical sense, that is the best that can be done. Everything else is worse.
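A small simulation makes the point concrete. It is a sketch under invented assumptions: a hidden ‘healthy’ trait halves the risk of disease, the treatment itself does nothing at all, and healthy people are more likely to choose the treatment. All the probabilities and names are made up for illustration.

```python
import random

def simulate_relative_risks(n=100_000, seed=1):
    """Relative risk (treated / untreated) for a treatment with NO real effect."""
    rng = random.Random(seed)
    results = {}
    for design in ("self-selected", "randomised"):
        cases = {True: 0, False: 0}
        counts = {True: 0, False: 0}
        for _ in range(n):
            healthy = rng.random() < 0.5  # hidden confounder, never measured
            if design == "self-selected":
                # Healthy people are more inclined to take the treatment.
                treated = rng.random() < (0.7 if healthy else 0.3)
            else:
                # A coin toss decides, regardless of health.
                treated = rng.random() < 0.5
            # Disease depends only on the hidden trait, not on treatment.
            diseased = rng.random() < (0.10 if healthy else 0.20)
            counts[treated] += 1
            cases[treated] += diseased
        risk_treated = cases[True] / counts[True]
        risk_untreated = cases[False] / counts[False]
        results[design] = risk_treated / risk_untreated
    return results

relative_risks = simulate_relative_risks()
for design, rr in relative_risks.items():
    print(f"{design:14s} relative risk = {rr:.2f}")
```

With these invented numbers the self-selected comparison shows an apparent protective effect (a relative risk well below 1) for a treatment that does nothing, while the randomised comparison comes out close to 1.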
Ben Goldacre has referred memorably to the newspapers’ ongoing “Sisyphean task of dividing all the inanimate objects in the world into the ones that either cause or cure cancer” (Goldacre, 2008). This has even given rise to a blog, “The Daily Mail Oncological Ontology Project”. The problem arises in assessing causality.
It wouldn’t be so bad if the problem were restricted to the media. It is much more worrying that the problem of establishing causality often seems to be underestimated by the authors of papers themselves. It is a matter of speculation why this happens. Part of the reason is, no doubt, a genuine wish to discover something that will benefit mankind. But it is hard not to think that hubris and self-promotion may also play a role. Anything whatsoever that purports to relate diet to health is guaranteed to get uncritical newspaper headlines.
At the heart of the problem lies the great difficulty in doing randomised studies of the effect of diet on health. There can be no better illustration of the vital importance of randomisation than in this field. And, notwithstanding the generally uncritical reporting of stories about diet and health, one of the best accounts of the need for randomisation was written by a journalist, Gary Taubes, and it appeared in the New York Times (Taubes, 2007).
The case of hormone replacement therapy
In the 1990s hormone replacement therapy (HRT) was recommended not only to relieve the unpleasant symptoms of the menopause, but also because cohort studies suggested that HRT would reduce heart disease and osteoporosis in older women. For these reasons, by 2001, 15 million US women (perhaps 5 million older women) were taking HRT (Taubes, 2007). These recommendations were based largely on the Harvard Nurses’ Study. This was a prospective cohort study in which 122,000 nurses were followed over time, starting in 1976 (these are the ones who responded out of the 170,000 requests sent out). In 1994, it was said (Manson, 1994) that nearly all of the more than 30 observational studies suggested a reduced risk of coronary heart disease (CHD) among women receiving oestrogen therapy. A meta-analysis gave an estimated 44% reduction of CHD. Although warnings were given about the lack of randomised studies, the results were nevertheless acted upon as though they were true. But they were wrong. When proper randomised studies were done, not only did it turn out that CHD was not reduced: it was actually increased.
The Women’s Health Initiative Study (Rossouw et al., 2002) was a randomized double blind trial on 16,608 postmenopausal women aged 50-79 years and its results contradicted the conclusions from all the earlier cohort studies. HRT increased risks of heart disease, stroke, blood clots and breast cancer (though possibly helped with osteoporosis and perhaps colorectal cancer). After an average 5.2 years of follow-up, the trial was stopped because of the apparent increase in breast cancer in the HRT group. The relative risk (HRT relative to placebo) of CHD was 1.29 (95% confidence interval 1.02 to 1.63) (286 cases altogether) and for breast cancer 1.26 (1.00 to 1.59) (290 cases). Rather than there being a 44% reduction of risk, it seems that there was actually a 30% increase in risk. Notice that these are actually quite small risks, and on the margin of statistical significance. For the purposes of communicating the nature of the risk to an individual person it is usually better to specify the absolute risk rather than relative risk. The absolute number of CHD cases per 10,000 person-years is about 29 on placebo and 36 on HRT, so the increased risk of any individual is quite small. Multiplied over the whole population though, the number is no longer small.
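The difference between the individual and the population view can be put into numbers. This is a rough sketch using the rounded rates quoted above. The population figure is an assumption: it takes the “perhaps 5 million older women” mentioned earlier and supposes, purely for illustration, that the trial rates apply to all of them.

```python
# CHD rates per person-year, from the rounded figures quoted in the text.
placebo_rate = 29 / 10_000
hrt_rate = 36 / 10_000

# Crude ratio of the rounded rates. It need not match the published
# relative risk of 1.29, which comes from the trial's own analysis.
crude_rate_ratio = hrt_rate / placebo_rate               # about 1.24

# Absolute view: extra cases per 10,000 women per year.
excess_per_10k = (hrt_rate - placebo_rate) * 10_000      # about 7

# Person-years of HRT use per one extra CHD case.
person_years_per_extra_case = 1 / (hrt_rate - placebo_rate)  # about 1430

# ASSUMPTION for illustration: 5 million older women on HRT,
# all experiencing the trial rates.
women_on_hrt = 5_000_000
extra_cases_per_year = women_on_hrt * (hrt_rate - placebo_rate)  # about 3500

print(f"Crude rate ratio:             {crude_rate_ratio:.2f}")
print(f"Excess cases per 10,000/year: {excess_per_10k:.0f}")
print(f"Person-years per extra case:  {person_years_per_extra_case:.0f}")
print(f"Extra cases per year:         {extra_cases_per_year:.0f}")
```

Seven extra cases per 10,000 women per year is a small risk for any one woman, but on these assumptions it amounts to thousands of extra cases a year across the population.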
Several plausible reasons for these contradictory results are discussed by Taubes (2007): it seems that women who choose to take HRT are healthier than those who don’t. In fact the story has become a bit more complicated since then: the effect of HRT depends on when it is started and on how long it is taken (Vandenbroucke, 2009).
This is perhaps one of the most dramatic illustrations of the value of randomised controlled trials (RCTs). Reliance on observations of correlations suggested a 44% reduction in CHD, the randomised trial gave a 30% increase in CHD. Insistence on randomisation is not just pedantry. It is essential if you want to get the right answer.
Having dealt with the cautionary tale of HRT, we can now get back to the ‘Sisyphean task of dividing all the inanimate objects in the world into the ones that either cause or cure cancer’.
The case of processed meat
The WCRF report (2007) makes some pretty firm recommendations.
- Don’t get overweight
- Be moderately physically active, equivalent to brisk walking for at least 30 minutes every day
- Consume energy-dense foods sparingly. Avoid sugary drinks. Consume ‘fast foods’ sparingly, if at all
- Eat at least five portions/servings (at least 400 g or 14 oz) of a variety of non-starchy vegetables and of fruits every day. Eat relatively unprocessed cereals (grains) and/or pulses (legumes) with every meal. Limit refined starchy foods
- People who eat red meat to consume less than 500 g (18 oz) a week, very little if any to be processed.
- If alcoholic drinks are consumed, limit consumption to no more than two drinks a day for men and one drink a day for women.
- Avoid salt-preserved, salted, or salty foods; preserve foods without using salt. Limit consumption of processed foods with added salt to ensure an intake of less than 6 g (2.4 g sodium) a day.
- Dietary supplements are not recommended for cancer prevention.
These all sound pretty sensible but they are very prescriptive. And of course the recommendations make sense only insofar as the various dietary factors cause cancer. If the association is not causal, changing your diet won’t help. Note that dietary supplements are NOT recommended. I’ll concentrate on the evidence that lies behind “People who . . . very little if any to be processed.”
The problem of establishing causality is discussed in the report in detail. In section 3.4 the report says
” . . . causal relationships between food and nutrition, and physical activity can be confidently inferred when epidemiological evidence, and experimental and other biological findings, are consistent, unbiased, strong, graded, coherent, repeated, and plausible.”
The case of processed meat is dealt with in chapter 4.3 (p. 148) of the report.
“Processed meats” include sausages and frankfurters, and ‘hot dogs’, to which nitrates/nitrites or other preservatives are added, are also processed meats. Minced meats sometimes, but not always, fall inside this definition if they are preserved chemically. The same point applies to ‘hamburgers’.
The evidence for harmfulness of processed meat was described as “convincing”, and this is the highest level of confidence in the report, though this conclusion has been challenged (Truswell, 2009).
How well does the evidence obey the criteria for the relationship being causal?
Twelve prospective cohort studies showed increased risk for the highest intake group when compared to the lowest, though this was statistically significant in only three of them. One study reported non-significant decreased risk and one study reported that there was no effect on risk. These results are summarised in this forest plot (see also Lewis & Clark, 2001)
Each line represents a separate study. The size of the square represents the precision (weight) for each. The horizontal bars show the 95% confidence intervals. If it were possible to repeat the observations many times on the same population, the 95% confidence limits would be different on each repeat experiment, but 19 out of 20 (95%) of the intervals would contain the true value (and 1 in 20 would not contain the true value). If the bar does not overlap the vertical line at relative risk = 1 (i.e. no effect) this is equivalent to saying that there is a statistically significant difference from 1 with P < 0.05. That means, very roughly, that there is a 1 in 20 chance of making a fool of yourself if you claim that the association is real, rather than being a result of chance (more detail here).
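The ‘19 out of 20’ property of confidence intervals is easy to check by simulation. The sketch below makes simplifying assumptions that are not in the report: the observations are drawn from a normal distribution with a known true mean, and the interval is the usual normal approximation (mean plus or minus 1.96 standard errors). The true mean, sample size and number of repeats are arbitrary.

```python
import random
import statistics

def interval_coverage(true_mean=0.2, sd=1.0, n=100, repeats=2000, seed=42):
    """Fraction of repeated 95% intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        std_err = statistics.stdev(sample) / n ** 0.5
        # Normal-approximation 95% limits for this one 'experiment'.
        lower, upper = mean - 1.96 * std_err, mean + 1.96 * std_err
        hits += lower <= true_mean <= upper
    return hits / repeats

coverage = interval_coverage()
print(f"Intervals containing the true value: {coverage:.1%}")
```

The limits differ on every repeat, but the proportion that contain the true value should come out close to 95%.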
There is certainly a tendency for the relative risks to be above one, though not by much. Pooling the results sounds like a good idea. The method for doing this is called meta-analysis.
Meta-analysis was possible on five studies, shown below. The outcome is shown by the red diamond at the bottom, labelled “summary effect”, and the width of the diamond indicates the 95% confidence interval. In this case the final result for association between processed meat intake and colorectal cancer was a relative risk of 1.21 (95% CI 1.04–1.42) per 50 g/day. This is presumably where the headline value of a 20% increase in risk came from.
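For readers who have not met it, the mechanics of pooling can be sketched in a few lines. This is a minimal fixed-effect, inverse-variance calculation on the log relative risk scale. It is not necessarily the method used in the report, and the five input studies are invented for illustration: they are NOT the studies in the forest plot.

```python
from math import exp, log, sqrt

def pool_fixed_effect(studies):
    """Inverse-variance pooling of (relative_risk, ci_low, ci_high) tuples.

    Assumes each 95% interval is symmetric on the log scale, so the
    standard error of log(RR) is the log-width divided by 2 * 1.96.
    """
    weights = []
    weighted_logs = []
    for rr, low, high in studies:
        std_err = (log(high) - log(low)) / (2 * 1.96)
        weight = 1 / std_err ** 2          # precise studies count for more
        weights.append(weight)
        weighted_logs.append(weight * log(rr))
    pooled_log = sum(weighted_logs) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return (
        exp(pooled_log),
        exp(pooled_log - 1.96 * pooled_se),
        exp(pooled_log + 1.96 * pooled_se),
    )

# HYPOTHETICAL studies, made up for illustration only.
made_up_studies = [
    (1.10, 0.80, 1.51),
    (1.35, 0.95, 1.92),
    (1.20, 0.90, 1.60),
    (0.95, 0.65, 1.39),
    (1.40, 1.00, 1.96),
]

pooled_rr, pooled_low, pooled_high = pool_fixed_effect(made_up_studies)
print(f"Pooled RR = {pooled_rr:.2f} (95% CI {pooled_low:.2f} to {pooled_high:.2f})")
```

The pooled interval is narrower than that of any single study, which is the attraction of the method. Pooling does nothing, however, about biases that are shared by all the studies.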
Support came from a meta-analysis of 14 cohort studies, which reported a relative risk for processed meat of 1.09 (95% CI 1.05 – 1.13) per 30 g/day (Larsson & Wolk, 2006). Since then another study has come up with similar numbers (Sinha etal. , 2009). This consistency suggests a real association, but it cannot be taken as evidence for causality. Observational studies on HRT were just as consistent, but they were wrong.
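The two pooled estimates are quoted for different amounts of meat (per 50 g/day and per 30 g/day), so they are not directly comparable as they stand. A rough conversion is possible if one assumes a log-linear relationship between dose and risk. That assumption is mine, made only for this illustration, and the function name is hypothetical.

```python
from math import exp, log

def rescale_relative_risk(rr, from_dose, to_dose):
    """Convert a relative risk per from_dose into one per to_dose.

    Assumes log(RR) rises in proportion to dose (log-linear model).
    """
    return exp(log(rr) * to_dose / from_dose)

# Larsson & Wolk (2006): 1.09 per 30 g/day, re-expressed per 50 g/day.
larsson_per_50g = rescale_relative_risk(1.09, from_dose=30, to_dose=50)
print(f"1.09 per 30 g/day is about {larsson_per_50g:.2f} per 50 g/day")
```

On that assumption the Larsson & Wolk figure corresponds to about 1.15 per 50 g/day, somewhat smaller than the 1.21 in the report.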
The accompanying editorial (Popkin, 2009) points out that there are rather more important reasons to limit meat consumption, like the environmental footprint of most meat production, water supply, deforestation and so on.
So the outcome from vast numbers of observations is an association that only just reaches the P = 0.05 level of statistical significance. But even if the association is real, not a result of chance sampling error, that doesn’t help in the least in establishing causality.
There are two more criteria that might help, a good relationship between dose and response, and a plausible mechanism.
Dose – response relationship
It is quite possible to observe a very convincing relationship between dose and response in epidemiological studies. The relationship between number of cigarettes smoked per day and the incidence of lung cancer is one example. Indeed it is almost the only example.
Doll & Peto, 1978
There have been six studies that relate consumption of processed meat to incidence of colorectal cancer. All six dose-response relationships are shown in the WCRF report. Here they are.
This Figure was later revised to
This is the point where my credulity begins to get strained. Dose – response curves are part of the stock in trade of pharmacologists. The technical description of these six curves is, roughly, ‘bloody horizontal’. The report says “A dose-response relationship was also apparent from cohort studies that measured consumption in times/day”. I simply cannot agree that any relationship whatsoever is “apparent”.
They are certainly the least convincing dose-response relationships I have ever seen. Nevertheless a meta-analysis came up with a slope for the response curve that just reached the 5% level of statistical significance.
The conclusion of the report for processed meat and colorectal cancer was as follows.
“There is a substantial amount of evidence, with a dose-response relationship apparent from cohort studies. There is strong evidence for plausible mechanisms operating in humans. Processed meat is a convincing cause of colorectal cancer.”
But the dose-response curves look appalling, and it is reasonable to ask whether public policy should be based on a 1 in 20 chance of being quite wrong (1 in 20 at best; see Senn, 2008). I certainly wouldn’t want to risk my reputation on odds like that, never mind use it as a basis for public policy.
So we are left with plausibility as the remaining bit of evidence for causality. Anyone who has done much experimental work knows that it is possible to dream up a plausible explanation of any result whatsoever. Most are wrong and so plausibility is a pretty weak argument. Much play is made of the fact that cured meats contain nitrates and nitrites, but there is no real evidence that the amount they contain is harmful.
The main source of nitrates in the diet is not from meat but from vegetables (especially green leafy vegetables like lettuce and spinach) which contribute 70 – 90% of total intake. The maximum legal content in processed meat is 10 – 25 mg/100g, but lettuce contains around 100 – 400 mg/100g with a legal limit of 200 – 400 mg/100g. Dietary nitrate intake was not associated with risk for colorectal cancer in two cohort studies (Food Standards Agency, 2004; International Agency for Research on Cancer, 2006).
To add further to the confusion, another cohort study on over 60,000 people compared vegetarians and meat-eaters. Mortality from circulatory diseases and mortality from all causes were not detectably different between vegetarians and meat eaters (Key et al., 2009a). Still more confusingly, although the incidence of all cancers combined was lower among vegetarians than among meat eaters, the exception was colorectal cancer which had a higher incidence in vegetarians than in meat eaters (Key et al., 2009b).
Mente et al. (2009) compared cohort studies and RCTs for effects of diet on risk of coronary heart disease. “Strong evidence” for protective effects was found for intake of vegetables, nuts, and “Mediterranean diet”, and harmful effects of intake of trans–fatty acids and foods with a high glycaemic index. There was also a bit less strong evidence for effects of mono-unsaturated fatty acids and for intake of fish, marine ω-3 fatty acids, folate, whole grains, dietary vitamins E and C, beta carotene, alcohol, fruit, and fibre. But RCTs showed evidence only for “Mediterranean diet”, and for none of the others.
As a final nail in the coffin of case control studies, consider pizza. According to La Vecchia & Bosetti (2006), data from a series of case control studies in northern Italy led to this conclusion: “An inverse association was found between regular pizza consumption (at least one portion of pizza per week) and the risk of cancers of the digestive tract, with relative risks of 0.66 for oral and pharyngeal cancers, 0.41 for oesophageal, 0.82 for laryngeal, 0.74 for colon and 0.93 for rectal cancers.”
What on earth is one meant to make of this? Pizza should be prescribable on the National Health Service to produce a 60% reduction in oesophageal cancer? As the authors say “pizza may simply represent a general and aspecific indicator of a favourable Mediterranean diet.” It is observations like this that seem to make a mockery of making causal inferences from non-randomised studies. They are simply uninterpretable.
Is the observed association even real?
The most noticeable thing about the effects of red meat and processed meat is not only that they are small but also that they only just reach the 5 percent level of statistical significance. It has been explained clearly why, in these circumstances, real associations are likely to be exaggerated in size (Ioannidis, 2008a; Ioannidis, 2008b; Senn, 2008). Worse still, there are some good reasons to think that many (perhaps even most) of the effects that are claimed in this sort of study are not real anyway (Ioannidis, 2005). The inflation of the strength of associations is expected to be bigger in small studies, so it is noteworthy that the large meta-analysis by Larsson & Wolk (2006) comments “In the present meta-analysis, the magnitude of the relationship of processed meat consumption with colorectal cancer risk was weaker than in the earlier meta-analyses”.
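The exaggeration of effects that only just reach significance can be shown with a toy simulation. Everything in it is an assumption chosen for illustration: the true relative risk is 1.1, every study estimates log(RR) with the same standard error of 0.08, and only studies showing a significant increase (estimate more than 1.96 standard errors above zero) get noticed.

```python
import random
import statistics
from math import exp, log

def significance_filter(true_rr=1.1, std_err=0.08, n_studies=20_000, seed=7):
    """Compare the typical RR in all studies with that in 'significant' ones."""
    rng = random.Random(seed)
    true_log_rr = log(true_rr)
    estimates = [rng.gauss(true_log_rr, std_err) for _ in range(n_studies)]
    # Keep only studies that show a statistically significant increase.
    significant = [e for e in estimates if e / std_err > 1.96]
    return (
        exp(statistics.fmean(estimates)),    # geometric mean RR, all studies
        exp(statistics.fmean(significant)),  # geometric mean RR, significant only
        len(significant) / n_studies,        # fraction reaching significance
    )

rr_all, rr_significant, fraction_significant = significance_filter()
print(f"Typical RR over all studies:       {rr_all:.2f}")
print(f"Typical RR in significant studies: {rr_significant:.2f}")
print(f"Fraction reaching significance:    {fraction_significant:.0%}")
```

With these made-up settings only a minority of studies reach significance, and those that do report a relative risk of roughly 1.2 when the truth is 1.1. The association is real, but its size is overstated by the act of selecting on significance.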
This is all consistent with the well known tendency of randomized clinical trials to show a good effect of treatment initially, while subsequent trials tend to show smaller effects. The reasons, and the cures, for this worrying problem are discussed by Chalmers (Chalmers, 2006; Chalmers & Matthews, 2006; Garattini & Chalmers, 2009).
What do randomized studies tell us?
The only form of reliable evidence for causality comes from randomised controlled trials. The difficulties in allocating people to diets over long periods of time are obvious and that is no doubt one reason why there are far fewer RCTs than there are observational studies. But when they have been done the results often contradict those from cohort studies. The RCTs of hormone replacement therapy mentioned above contradicted the cohort studies and reversed the advice given to women about HRT.
Three more illustrations of how plausible suggestions about diet can be refuted by RCTs concern nutritional supplements and weight-loss diets.
Many RCTs have shown that various forms of nutritional supplement do no good and may even do harm (see Cochrane reviews). At least we now know that anti-oxidants per se do you no good. The idea that anti-oxidants might be good for you was never more than a plausible hypothesis, and like so many plausible hypotheses it has turned out to be a myth. The word anti-oxidant is now no more than a marketing term, though it remains very profitable for unscrupulous salesmen.
The randomised Women’s Health Initiative Dietary Modification Trial (Prentice et al., 2007; Prentice, 2007) showed minimal effects of dietary fat on cancer, though the conclusion has been challenged on the basis of the possible inaccuracy of reported diet (Yngve et al., 2006).
Contrary to much dogma about weight loss, Sacks et al. (2009) found no differences in weight loss over two years between four very different diets. They assigned randomly 811 overweight adults to one of four diets. The percentages of energy derived from fat, protein, and carbohydrates in the four diets were 20, 15, and 65%; 20, 25, and 55%; 40, 15, and 45%; and 40, 25, and 35%. No difference could be detected between the different diets: all that mattered for weight loss was the total number of calories. It should be added, though, that there were some reasons to think that the participants may not have stuck to their diets very well (Katan, 2009).
The impression one gets from RCTs is that the details of diet are not anything like as important as has been inferred from non-randomised observational studies.
So does processed meat give you cancer?
After all this, we can return to the original question. Do sausages or bacon give you colorectal cancer? The answer, sadly, is that nobody really knows. I do know that, on the basis of the evidence, it seems to me to be an exaggeration to assert that “The evidence is convincing that processed meat is a cause of bowel cancer”.
In the UK there were around 5 cases of colorectal cancer per 10,000 population in 2005, so a 20% increase, even if it were real, and genuinely causative, would result in 6 rather than 5 cases per 10,000 people, annually. That makes the risk sound trivial for any individual. On the other hand there were 36,766 cases of colorectal cancer in the UK in 2005. A 20% increase would mean, if the association were causal, about 7000 extra cases as a result of eating processed meat, but no extra cases if the association were not causal.
For the purposes of public health policy about diet, the question of causality is crucial. One has sympathy for the difficult decisions that policy makers have to make, because they are forced to decide on the basis of inadequate evidence.
If it were not already obvious, the examples discussed above make it very clear that the only sound guide to causality is a properly randomised trial. The only exceptions to that are when effects are really big. The relative risk of lung cancer for a heavy cigarette smoker is 20 times that of a non-smoker and there is a very clear relationship between dose (cigarettes per day) and response (lung cancer incidence), as shown above. That is an increase in risk of about 1900%, very different from the 20% found for processed meat (and many other dietary effects). Nobody could doubt seriously the causality in that case.
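The arithmetic behind these comparisons is simple enough to write down. The conversion from a relative risk to a percentage increase is (RR − 1) × 100. The sketch below applies it to the two relative risks discussed above, and repeats the individual and population calculations for colorectal cancer using the 2005 UK figures quoted earlier. The variable names are mine.

```python
def percent_increase(relative_risk):
    """Percentage increase in risk implied by a relative risk."""
    return (relative_risk - 1) * 100

processed_meat_increase = percent_increase(1.2)  # about 20
heavy_smoking_increase = percent_increase(20)    # 1900

# Individual view: annual cases per 10,000 people, before and after.
baseline_per_10k = 5
with_processed_meat_per_10k = baseline_per_10k * 1.2   # about 6

# Population view: 20% of the 36,766 UK cases in 2005,
# which applies only IF the association is causal.
uk_cases_2005 = 36_766
extra_cases_if_causal = uk_cases_2005 * 0.20           # about 7350

print(f"Processed meat: {processed_meat_increase:.0f}% increase")
print(f"Heavy smoking:  {heavy_smoking_increase:.0f}% increase")
print(f"Cases per 10,000: {baseline_per_10k} -> {with_processed_meat_per_10k:.0f}")
print(f"Extra UK cases if causal: {extra_cases_if_causal:.0f}")
```

The same 20% looks negligible as one extra case per 10,000 people and substantial as several thousand cases a year, which is why the question of causality matters so much for policy.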
The decision about whether to eat bacon and sausages has to be a personal one. It depends on your attitude to the precautionary principle. The observations do not, in my view, constitute strong evidence for causality, but they are certainly compatible with causality. It could be true so if you want to be on the safe side then avoid bacon. Of course life would not be much fun if your actions were based on things that just could be true.
My own inclination would be to ignore any relative risk based on observational data if it was less than about 2. The National Cancer Institute (Nelson, 2002) advises that relative risks less than 2 should be “viewed with caution”, but fails to explain what “viewing with caution” means in practice, so the advice isn’t very useful.
In fact hardly any of the relative risks reported in the WCRF report (2007) reach this level. Almost all relative risks are less than 1.3 (or greater than 0.7 for alleged protective effects). Perhaps it is best to stop worrying and get on with your life. At some point it becomes counterproductive to try to micromanage people’s diet on the basis of dubious data. There is a price to pay for being too precautionary. It runs the risk of making people ignore information that has got a sound basis. It runs the risk of excessive medicalisation of everyday life. And it brings science itself into disrepute when people laugh at the contradictory findings of observational epidemiology.
The question of how diet and other ‘lifestyle interventions’ affect health is fascinating to everyone. There is compelling reason to think that it matters. For example one study demonstrated that breast cancer incidence increased almost threefold in first-generation Japanese women who migrated to Hawaii, and up to fivefold in the second generation (Kolonel, 1980). Since then enormous effort has been put into finding out why. The first great success was cigarette smoking but that is almost the only major success. Very few similar magic bullets have come to light after decades of searching (asbestos and mesothelioma, or UV radiation and skin cancer count as successes).
The WCRF report (2007) has 537 pages and over 4400 references and we still don’t know.
Sometimes I think we should say “I don’t know” rather more often.
Risk: The Science and Politics of Fear, Dan Gardner. Virgin
Some bookmarks about diet and supplements
Dan Gardner, the author of Risk, seems to like the last line at least, according to his blog.
Report of the update, 2010
The 2007 report has been updated in the WCRF/AICR Systematic Literature Review Continuous Update Project Report [big pdf file]. This includes studies up to May/June 2010.
The result of addition of the new data was to reduce slightly the apparent risk from eating processed meat from 1.21 (95% CI = 1.04-1.42) in the original study to 1.18 (95% CI = 1.10-1.28) in the update. The change is too small to mean much, though it is in the direction expected for false correlations. More importantly, the new data confirm that the dose-response curves are pathetic. The evidence for causality is weakened somewhat by addition of the new data.
Dose-response graph of processed meat and colorectal cancer