Discussion by: Nunbeliever

Last year known skeptic Ben Goldacre talked about medical science on TED and how negative results are often not reported in scientific journals (http://www.youtube.com/watch?v=RKmxL8VYy0M).

This talk made me think about medical science in general and the use of meta-analysis, which seems to be the cornerstone of modern medicine. I'm not a natural scientist and as such not an expert on statistics, but the whole idea of meta-analysis strikes me as unscientific. I understand that it might be the only method available at times, but it seems very strange that medical science has adopted meta-analysis as a basic concept. To me it sounds like medical researchers have adopted an attitude of "fix it later". In other words, they don't even try to conduct studies that are reliable on their own. I can't help thinking that corporate interests might have something to do with this. Have we created a system where economic interests are more important than reliable results?

It would be interesting to hear from people with experience in this field what sample sizes, margins of error, confidence levels and other statistical properties are deemed acceptable within medical science. In my opinion the strong emphasis on meta-analysis seems to indicate that medical science is not characterized by scientific rigour.

It’s an interesting problem. There is a perception, I don’t know how much truth there is to it, that the pharmaceuticals industry is willing to trade scientific thoroughness for monetary gain. My guess is that the few cases of this that happen get a lot more exposure than the majority of times where the research is done properly.

I also would be interested to hear views from those with experience in the medical field about the levels of rigour involved in medical research and meta analyses.

In reply to #1 by Stuart Coyle: Believe it or not, the vast majority of pharma research is like this, in the sense that it aims only for marketable results, not significant ones. This is why the cure for AIDS is shot down as dangerous and risky, while the temporary and expensive treatments that do not cure but simply subdue AIDS are held in high hopes.

Or, better yet, take the refusal of drug companies to allow cheap versions of their drugs to be sold in third world countries. They literally facilitate the evolution of things like TB into super diseases that are now beginning to infect people in Western countries, because we no longer have usable cures, simply because these drug companies refused to sell cheap versions in other countries out of fear that “people in the US might demand to have access to the drugs at the same cheap prices”. The same is true of malaria, HIV, dengue and other tropical diseases. These drug companies are sitting on cures for many diseases, and if they cooperated with the UN it would be possible to completely eradicate entire diseases from the face of the earth, and yet they do not, out of fear of losing profits.

Lawrence Lessig writes a very compelling piece on this in his book Free Culture, where he lays out exactly the sort of monetary mindset that goes into this sort of abuse of copyrights.

It only takes a few minutes on any given TV channel before you see an advertisement for some sort of drug. Many of them are prescription drugs you can only get from your doctor, and yet these drug companies have no problem telling you that you need them. So you go to your doctor, describe your symptoms based on the TV ad you saw, and ask for the drug that was on TV. Why? Because these drug companies explicitly misinform, so that whether you need their drug or not you will buy it because you feel like you need it. This kind of advertising is actually illegal in many European countries for exactly this reason. To see what greater harm it causes, just look at meth: it is growing in the US and yet it has barely even touched Europe. Why? Because they do not go around telling people to buy these drugs that contain such potent and harmful chemicals.

Medical experiments are expensive. This puts a severe constraint on the sample size.

If an undersized study suggests a positive result, then another group can do a similar study. A different group will remove any hidden biases of the first group that were not part of the published protocol. Combining the data from both studies gives higher confidence in the results. It is perfectly valid. There might be problems if the studies vary in unknown ways, and it may be better if one highly competent team carried out a double-size study in the first place, but meta-analysis is a pragmatic solution to the cost and time constraints.

It doesn’t matter if it is corporations or govt. organisations like the NHS – medical experiments are still very expensive.

NHS clinical trials units have (had?) achieved the highest standards. They are statistically robust, and test on hundreds to thousands of subjects/patients as necessary. Such drug trials cost multiple millions, and are commissioned by drug companies/medical device manufacturers. The trials have to be successfully completed for uptake by the NHS. That news is about 10-15 years old, from after the first few successes. I have no idea where the program is now as my source retired. From the odd comment from the Tory Govt. (ignoring NICE, the National Institute for Clinical Excellence, over cancer drugs, and continuing funding for homoeopathy) I suspect they are taking it apart.

I have less faith in in-house drug company trials. There have been a few cases in the news. It’s like asking a naughty school child to mark their own homework.

In reply to #2 by God fearing Atheist: Thank you. That certainly explains a few things with regard to medical research. Still, with regard to confidence levels I really can’t see how two studies are better than one. For example, say that you conduct two studies and get a certain result with a 95% confidence level in both studies. In each study there is a 5% chance of the results being due to random chance. If you study them both together in a meta-analysis, isn’t there still a 5% chance of the results being due to random chance? If that is true, then really you have gained nothing by putting them together, unless you actually combine the data from both these studies and do the analysis all over again, which would result in a higher confidence level. Am I wrong about this?

In reply to #5 by Nunbeliever: The error is the likelihood that this particular study is wrong out of a total number of studies one could do. A 5% error margin means that, for every twenty studies using these parameters, one of them is likely to provide erroneous results. So long as the studies don’t differ too much in outline, and are sound to begin with, then an increasing number of studies reduces the chances that the results are total piffle. Like this:

There are two studies with error margins of 5% each. The chance that they’re both delivering good results is the multiplied probability of confidence of both studies, in this case 95% times 95%, which gives 0.9025. The chance that just one of them has erroneous results is 95% multiplied by 5% multiplied by two (because there’s one probability for Study 1 being the wrong one, and one probability for Study 2 being the wrong one), which gives 0.095. Lastly, the chance that both of them have erroneous results is 5% times 5%, which gives 0.0025. So the odds that they’re both right are 90.25%, the odds that exactly one of them is wrong are 9.5%, and the odds that both of them are wrong are 0.25%. Thus, combining studies decreases the odds that they’re all perfectly correct (which is understandable), but also decreases the odds of total disaster: that they’re all perfectly worthless because they’re all riddled with errors.

Knowing that, I think you can see what will happen if more studies are added. That’s why replicating studies is so important.
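
The arithmetic above generalises to any number of studies. Here is a minimal Python sketch under the same simplifying assumptions as the comment: each study is treated as independently "wrong" with probability 5%, which is a loose reading of a 95% confidence level and not how a real meta-analysis is computed. The function name `prob_k_wrong` is made up for illustration.

```python
from math import comb

def prob_k_wrong(n_studies, k_wrong, p_wrong=0.05):
    """Binomial probability that exactly k_wrong of n_studies are wrong,
    assuming each study is independently wrong with probability p_wrong."""
    p_right = 1 - p_wrong
    return comb(n_studies, k_wrong) * p_wrong**k_wrong * p_right**(n_studies - k_wrong)

# The two-study case worked through in the comment above
both_right = prob_k_wrong(2, 0)    # 0.95 * 0.95
exactly_one = prob_k_wrong(2, 1)   # 2 * 0.95 * 0.05
both_wrong = prob_k_wrong(2, 2)    # 0.05 * 0.05
print(both_right, exactly_one, both_wrong)

# The chance that every single study is wrong shrinks fast as studies are added
for n in (1, 2, 3, 5):
    print(n, prob_k_wrong(n, n))
```

Note that "at least one wrong" is 1 minus `both_right`, so it is slightly larger than `exactly_one`.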

In reply to #6 by Zeuglodon: Thank you so much. That was a very good explanation which really cleared up some things I’ve been pondering.

In reply to #6 by Zeuglodon: I have a request for you, since you seem to have knowledge of the statistical methods used in meta-analysis. In my former case I talked about two studies with positive results being compared or added to a meta-study. Often though the results are much less conclusive.

Say for example that you have ten methodologically sound studies with a confidence level of 95%. Seven of them show a positive result and three of them a negative result. A meta-study like this will most likely be portrayed in popular science media as evidence that the positive result is real, often presented in very simple terms like: “Nearly three-quarters of all studies show a positive effect”. This leaves the reader with the impression that the positive result really is true. I mean, way more than half the studies show a positive result. Still, this is of course a vast simplification. For example, in the former case with only two studies we could have concluded that 100% of all studies show a positive result. But when you actually take into account confidence levels and other statistical properties (as you showed) the result is really a much more nuanced one. I guess my question is how you would conduct a similar calculation with regard to confidence levels when we are talking about a meta-study of 10 single studies where seven show a positive result and three a negative. And, perhaps more importantly, how different would that be from the simplistic way of simply saying that almost three-quarters of all studies show a positive result?

In reply to #8 by Nunbeliever: On the contrary, I’m afraid: most of my knowledge is actually of A-level mathematics that touched upon statistics as just one topic among several, and that was so many years ago that I’d be surprised if I could remember everything from that course. I certainly doubt it covered meta-analysis. Thus, I can’t honestly give an answer to your question without in some sense deceiving you about my expertise. The cumulative probability issue I just explained to you is small potatoes compared with the advanced techniques of meta-analysis.

Maybe someone else on this site could answer it – Jos Gibbons, maybe – but I simply don’t know how. Sorry, I should have indicated my limitations earlier.

In reply to #9 by Zeuglodon: No worries, you clearly were able to clear a few things up for me. The reason I presented this question was that I would honestly like to know whether this common simple way of presenting results of meta-studies makes any sense from a statistical point of view.

Consider a very simple example. You repeat the same study (with a confidence level of 95%) twenty times. Fifteen of these show a positive result and five a negative. You could present the findings of this meta-study by saying that three-quarters of all the studies confirm the positive result. That sounds quite impressive, right? Surely, these kinds of statements are quite common in popular science media. On the other hand, with regard to the confidence level we should expect only one of these studies to show a negative result; nonetheless we have five. Instead of the expected 5% margin of error, this meta-analysis seems to show that the real margin of error for the initial study was 25%. Isn’t this a huge problem? Or perhaps I’m messing things up again 😀
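
One way to put a rough number on how surprising that outcome would be: suppose, as a deliberate simplification, that each of the twenty studies is independently wrong with probability 5%. The sketch below asks how often five or more would then come out wrong. The helper name `prob_at_least` is hypothetical, and the independence assumption is mine, not something a real set of trials would guarantee.

```python
from math import comb

def prob_at_least(n, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

n_studies = 20
p_wrong = 0.05

expected_wrong = n_studies * p_wrong               # one study, on average
p_five_or_more = prob_at_least(n_studies, 5, p_wrong)

print(expected_wrong)
print(p_five_or_more)
```

Under these assumptions the chance of five or more wrong results is well under one percent, which supports the intuition that five negatives out of twenty points to something other than bad luck.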

In reply to #10 by Nunbeliever: I say this without any relevant educated background, in fact probably less so than Zeuglodon, and I don’t know if the nuances of each study could explain the inconsistency.

However reading your comment, it appears to me that the problem is being created in this hypothetical scenario, which you’ve then addressed by saying that the true confidence level must be lower than 95%.

A scenario in which the confidence level of a type of study is 95% doesn’t seem compatible with a scenario in which three out of the ten studies result in completely opposite findings.

Something in this scenario isn’t correct. It’s a bit like saying you have ten weighted dice, which are all weighted to roll a six; however, when you roll, seven dice roll sixes and three of them roll ones. The weights might not make them consistently roll sixes, but a one is the least likely result. Either they didn’t actually roll ones (the results aren’t actually the opposite), or they’re not correctly weighted (the confidence level is far less than 95%).

In reply to #14 by Seraphor: Assuming the dice are tampered with so that the probability of 6 is higher but the probability of each “not 6” is the same (i.e., 1 is no more probable than 2), then the probability of rolling three 1’s is the same as rolling any other three “not 6’s”. I.e., on one roll you can’t say three 1s are any less probable than 3, 4, 5.

In reply to #15 by Red Dog: Well, I’ve never heard of a weighted die that works in that way. One is directly opposite the six, so any other number would be equally likely, but six would be more likely and one less likely. But that’s not really relevant to my point.

In reply to #14 by Seraphor: Well, I think this could be possible if there are, for example, unknown variables that give you a false impression of causality or an impression that certain correlations are stronger than they really are. This would mean that the studies are methodologically sound but we just don’t have enough knowledge to find all possible unknown variables that might be giving us a false sense of correlation or causality. Isn’t this quite a probable scenario with regard to medical research? I mean, often we really don’t understand what is causing a certain disease or health problem. We just observe correlations. One example (although not a perfect analogy) would be the debate with regard to cholesterol and heart disease. There is obviously a correlation, but some researchers claim there is no causal relation. Hence, I think it’s not that hard to imagine that this problem might be quite common with regard to medical research and that we might have cases where unknown variables are distorting our results.

A way to find out is of course to do a meta-analysis of several similar studies. If the correlation is merely an illusion then more studies will probably reveal this. Hence my point was that, in the hypothetical example I presented, a meta-study reveals something is fishy. We get more negative results than the initial study seems to suggest, even though the studies are all methodologically sound. That should be an alarm bell right there. But if we only present the percentage of studies that showed a positive result in comparison to those that showed a negative one, we get the impression that this meta-analysis confirms the initial study. Three-quarters of all studies showed a positive result. This was of course a very simple example just to make a point. In reality most studies are not identical and there are way more factors to take into account, which of course makes the process even more complex. And as such more prone to these kinds of errors? I really don’t know whether my point makes sense or not. I might very well be messing it all up. Statistics is, as you have most likely realized, not my strongest subject 😉

In reply to #19 by Nunbeliever: Yes, which is why researchers try very hard to control for differences in those studied and match control groups as similarly as possible. To have a (decent) trial, you have to have a plausible mechanism to explain your hypothesis and you won’t make generalisations outside of the study group (even though the marketing dept. of your sponsoring pharma company might want you to).

Yes, absolutely. And that is the danger of avoiding full disclosure of all trials. To a certain extent this is mitigated by big (so statistically powerful) trials having people with a vested interest in publishing results whatever they show (e.g. university-based academic doctors who need to bring in funding and develop their professional reputation) and also by meta-analysis writers routinely asking for unpublished data to include. If (truly) negative results of a methodologically sound study are buried, a meta-analysis of the positive ones might still show that the positive studies are in some way sloppy, have high p-values or are not generalisable enough to the population to be useful (i.e. that there’s a ‘gap’ in the evidence). Whether such meta-analyses get done is a different question…

In reply to #5 by Nunbeliever: Sorry, I confused you with the word “confidence”. Suppose you have two studies of 25 control and 25 diseased patients. Each study can produce a 95% confidence interval for the effect of the drug. If you aggregate both studies and do the statistics on 50+50 patients the 95% confidence interval will get narrower.

Suppose both studies showed the drug worked but the 95% confidence interval overlapped 0 (no measurable drug effect), then it is possible that the drug doesn’t actually work, but the trials got lucky. Combining the data might produce a confidence interval that is all in the positive range, so now we have more “confidence” that the drug actually works.
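
To make that concrete, here is a small sketch with invented numbers. It assumes a difference in means between two equal-sized groups, a known common standard deviation, and a normal approximation (z = 1.96) instead of a t-distribution. The effect size of 5 units, the standard deviation of 10 and the function name `ci_for_difference` are all hypothetical.

```python
from math import sqrt

Z_95 = 1.96  # normal approximation for a two-sided 95% interval

def ci_for_difference(effect, sd, n_per_group):
    """Approximate 95% CI for a difference in means between two equal-sized
    groups that share a known standard deviation."""
    standard_error = sd * sqrt(2 / n_per_group)
    return effect - Z_95 * standard_error, effect + Z_95 * standard_error

# One study: 25 control + 25 treated patients
single = ci_for_difference(effect=5.0, sd=10.0, n_per_group=25)

# Two such studies aggregated: 50 control + 50 treated patients
pooled = ci_for_difference(effect=5.0, sd=10.0, n_per_group=50)

print(single)  # interval straddles 0
print(pooled)  # interval lies entirely above 0
```

With the same observed effect, doubling the number of patients shrinks the interval by a factor of about 1.4 (the square root of 2), which is enough here to move the lower bound above zero.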

Actually, it is possibly confusing to even talk about confidence intervals at all. Suppose I toss a fair coin 100 times and record the result. I score 1 for a head and 0 for a tail, and do classic mean and standard deviation statistics on them. I can draw a Gaussian distribution for the average score, on a scale running from 0 to 1. This is classic central limit theorem stuff. Now my mate does the same. We will get Gaussian distributions of about the same size. Now we combine results and do the stats again for 200 tosses. The scaled standard deviation of the new Gaussian will be less, i.e. the scaled distribution will be tighter. If I go on adding studies, or just adding to the coin tosses, the Gaussian distribution will get tighter and tighter until it looks like a spike, not a bell-shaped curve. The standard deviation of the average decreases in proportion to the inverse square root of the sample size.

The 95% confidence interval is of course just drawn from the Gaussian – clip 2.5% of the area off each tail.

Put another way, if you toss a coin 5 times you have a 1/32 chance of tossing 5 heads and reaching the edge of the graph (average score of 1.0). If you toss 10 times you only have 1/1024 chance of maximum average score of 1.0. If you toss 20 times the probability is 1/1,048,576.
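
For anyone who wants to check the coin-toss numbers, here is a short sketch. It uses the standard result that the standard error of the average of n fair tosses is sqrt(0.25 / n), so quadrupling the number of tosses halves the spread. The function names are made up for illustration.

```python
from fractions import Fraction
from math import sqrt

def standard_error_of_mean(n_tosses, p_heads=0.5):
    """Standard deviation of the average score (heads = 1, tails = 0)."""
    return sqrt(p_heads * (1 - p_heads) / n_tosses)

def prob_all_heads(n_tosses):
    """Exact probability that a fair coin lands heads on every toss."""
    return Fraction(1, 2) ** n_tosses

# The distribution of the average tightens as tosses are added
for n in (100, 200, 400, 1600):
    print(n, standard_error_of_mean(n))

# Chance of hitting the edge of the graph (average score of exactly 1.0)
for n in (5, 10, 20):
    print(n, prob_all_heads(n))
```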

In reply to #5 by Nunbeliever: I believe that you’ve confused a 95% confidence interval (CI) with a p-value of 5%. As God fearing Atheist implied in Comment 20, a given result (assuming a Gaussian distribution) is given a range within which the true result may fall, e.g. the percentage change in cholesterol for a given dose of statin. If the 95% CI includes 0, then the true result may be zero effect. If the 95% CI is, say, 10-20%, then there is more likely a positive effect, but still a 2.5% chance it’s lower than 10 and a 2.5% chance it’s higher than 20. This is separate from the chance that the result given is due to random chance. The 5% rule (p=0.05) is a convention – smaller is better (more certainly not due to chance). In some things, e.g. things relating to risk of death or very expensive interventions, you’d probably want to know with a bit more certainty….

A meta-analysis is more than just combining the confidence intervals and comparing the p-values and bodging an ‘average’. Its third part is checking (by various statistical methods) the heterogeneity of the outcomes, i.e. does the effect measured differ between studies by more than would be expected by chance. In this way it may not only give you a better answer (narrower 95% CI with smaller p-value) but identify how comparable the studies were in the first place. If your studies vary wildly, you might not have controlled for a variable properly or your study design might have been inappropriate in the first place.

The developed world uses modern medicine for its medical care (disregarding faith healers, homeopaths etc). The developed world has treatments and even cures for many ailments that are fatal and/or debilitating in the developing world. Corruption in modern medicine is minimal. How do I know this? Because modern medicine works. The evidence says so. Could it be better? Of course, and it will be.

What you’re basically looking at is the difference between science and epidemiology. A scientific answer to whether a medication works or not is to discover the mechanisms by which it operates, and see if those mechanisms produce the desired effect. An epidemiological answer is to give that medicine to a group of people, then see if it looks like the desired effect emerges, without knowing how.

Because human bodies are inordinately complex, scientific approaches to medicine are very difficult to achieve. We are left with epidemiological approaches much of the time, and epidemiology is pretty much all about statistics. The more samples you have, the better your drawn conclusions are likely to be. So analyzing a number of separate studies to increase sample size beyond what any single study can accomplish is often the best way to improve the quality of your answers – both on whether or not the treatment does what it’s intended to do, and whether or not it produces undesirable results that were not intended. For discovering the latter, epidemiology is more often than not the only viable method.

Medicine can be corrupt in many more ways than covered by Goldacre (statistical manipulation and selective disclosure). The modern push for all medicine to be “evidence-based” provides a good measure of control. However, probably the majority of medical interventions (mostly surgical, but many pharmacological) that go unquestioned today have not passed very high statistical hurdles. For instance, who has heard of a double-blind, placebo-controlled, randomized, multi-center trial to prove that appendectomy saves life in acute appendicitis?

In reply to #11 by catphil: Yes. I did not really think about how ethical aspects can influence medical research. But, as you point out, it would be very hard to conduct a methodologically fully sound study of whether appendectomy saves life in acute appendicitis in an ethical manner. I guess the same goes for many longitudinal health studies. It would be highly unethical to prevent a control group from getting certain forms of treatments or procedures just in order to make the study sound.

I think it’s more a case of any weakness being exploited. Meta-analysis can provide evidence that would be very expensive to trial for on its own, and can create a starting point for more targeted future trials.

I agree with BG’s campaign to force negative results to be published, however, simply because I think it’s unethical to generate scientific data that’s not made available to the community.

Pharmaceuticals are BIG business, which means there are big incentives to cheat. New drugs are expensive to develop and they have to not just be effective, but be more effective than existing drugs whose patents have expired and which are cheaper. There are many documented cases of big drug companies fudging results to make a new, patentable drug look more effective than it is. This is a real problem, but the new drugs generally are effective. The part of trials and testing that gets fudged is safety.

So, be cautious with new drugs, but you can be fairly certain a drug that has been passed is actually effective.

Meta-studies are a very valuable and valid method of science. The basic idea is to combine the data from many similar studies to try to get a better idea of the true value or truth of a hypothesis or theory. Carefully studying the results of 10 different studies that are all marginal can give insight into whether the effect being studied is real or not.
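
As a rough illustration of how ten marginal studies can add up, here is a sketch of fixed-effect (inverse-variance) pooling, together with Cochran's Q and the I² statistic as one common way of doing the heterogeneity check mentioned earlier in the thread. The ten effect sizes and standard errors are invented for the example, and `fixed_effect_pool` is a hypothetical helper, not any particular package's API. Real meta-analyses often use random-effects models and more careful methods.

```python
from math import sqrt

def fixed_effect_pool(effects, std_errors):
    """Inverse-variance pooled estimate, its standard error,
    Cochran's Q and the I^2 heterogeneity statistic."""
    weights = [1 / se**2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared

# Ten made-up studies: each one's 95% CI (effect +/- 1.96 * SE) includes zero
effects = [1.5, 1.9, 1.0, 1.8, 1.6, 0.9, 1.7, 1.4, 1.2, 1.0]
std_errors = [1.0] * 10

pooled, pooled_se, q, i_squared = fixed_effect_pool(effects, std_errors)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se

print(pooled, (low, high))  # pooled interval excludes zero
print(q, i_squared)         # Q below its degrees of freedom, so I^2 is 0
```

In this toy data set no single study is convincing, yet the pooled interval sits clearly above zero, and the low Q suggests the studies agree with each other about as well as chance would allow.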

In reply to #6 by Zeuglodon: I don’t think you can multiply the confidence intervals in that way because the studies are highly correlated. Cummings cites a paper where the probability that the mean of study 2 is within the 95% confidence interval of study 1 is 83.33%. It’s all a bit of a mind fuck.
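
A figure close to that one can be reproduced with a short calculation, though I have not checked it against the cited paper, so treat this as one plausible derivation only. The assumptions are mine: both studies estimate the same true mean, with the same known standard error and normally distributed, independent sampling errors. The difference between the two sample means then has a standard deviation of sqrt(2) standard errors, so the chance that the second mean lands inside the first study's 95% interval is 2Φ(z/√2) − 1.

```python
from math import sqrt
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Half-width of a 95% interval in units of the standard error (about 1.96)
z = standard_normal.inv_cdf(0.975)

# P(|mean_2 - mean_1| < z * SE), where mean_2 - mean_1 has SD sqrt(2) * SE
capture_probability = 2 * standard_normal.cdf(z / sqrt(2)) - 1

print(capture_probability)
```

Under these assumptions the answer comes out at roughly 83%, noticeably less than the 95% one might naively expect.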

In reply to #21 by God fearing Atheist: Correct. Bayes’ Theorem to the rescue! Almost all of medicine takes account of pre-test probability in modifying your subsequent interpretation of results (the odds of the next result in the chain). Otherwise repeatedly doing the same study with the same (genuine) 95% chance of being ‘real’ leads to the absurd situation of the probability approaching zero.
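
A minimal sketch of that kind of updating, using the odds form of Bayes' theorem for a diagnostic test. The 10% pre-test probability, 90% sensitivity and 95% specificity are invented numbers, `post_test_probability` is a hypothetical helper, and the second update assumes the two results are independent given the patient's true state.

```python
def post_test_probability(pre_test_probability, sensitivity, specificity):
    """Probability of disease after a positive result:
    post-test odds = pre-test odds * positive likelihood ratio."""
    likelihood_ratio = sensitivity / (1 - specificity)
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# First positive result, starting from a 10% pre-test probability
after_first = post_test_probability(0.10, sensitivity=0.90, specificity=0.95)

# A second positive result uses the first answer as its pre-test probability
after_second = post_test_probability(after_first, sensitivity=0.90, specificity=0.95)

print(after_first)   # about two in three
print(after_second)  # about 97%
```

Each result shifts the probability from wherever the previous evidence left it, which is the "next result in the chain" idea.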

For anyone interested, here is a little primer on Meta Analysis from Bandolier, an evidence-based medicine journal out of the University of Oxford.

And I’ll be quiet now…. 😀

A reliable study costs a lot of money, especially because results are far from clear, and who would sponsor research just for the sake of knowledge? But, as far as gathering of information goes, meta-analysis is a good solution, especially for rare problems.

But, of course, development of new medicines is another matter. I have translated documents of several clinical studies and, until reading “Bad Pharma”, was rather puzzled about the type of study in which each of the involved researchers usually gets 2-5 patients. But that is Big Pharma.

Another problem is sciency studies about things one should or shouldn’t do and/or eat to protect oneself from all kinds of maladies, from breast cancer to erectile dysfunction. Such information can be dangerous and lead, for example, to alcohol dependency due to attempts to protect one’s heart, blood vessels, liver etc. by drinking red wine, even though it is now known that the famous resveratrol is far from well soluble (and hardly any is absorbed by the human body), so at best it protects bacteria living in the sewage system.