one of the many ways…

Posted on Wednesday 15 April 2015

the psychologist
British Psychological Society
17th March 2015

A lively debate was held at London’s Senate House yesterday with panellists from neuroscience and psychology discussing the question: is science broken? If so, how can we fix it? The discussion covered the replication crisis along with areas of concern regarding statistics and larger, more general problems…

Neuroskeptic, a Neuroscience, Psychology and Psychiatry researcher and blogger, gave a personal perspective on problems with science, speaking of the events which led him to lose faith in the research in the field. He said that as undergraduate students people are taught to do statistics in a very particular way, but once a person begins PhD research things change vastly. After gathering some results for his PhD research, Neuroskeptic found he had one significant result out of seven tasks performed by his participants. He said: ‘I thought back to my undergraduate days and thought “what if you do a Bonferroni correction across all the tasks?”. I got the idea that I’d suggest this to my supervisor but don’t think I ever did, I realised that just wasn’t how it was done. I was very surprised by this. I learned as an undergraduate you do a Bonferroni correction if you have multiple tasks. I started to wonder if we aren’t doing this who else isn’t doing it? I began to lose faith in research in the field.’

Neuroskeptic said he wondered whether there was a good reason that multiple comparisons correction was not used. He added: ‘I still don’t think there’s a good reason we can’t do that. We have come to the tacit decision to accept methods which we would never teach undergraduates were a statistically good idea, but we decide that we’re happy to do them ourselves. That’s how I got on the road to blogging about these issues.’

Neuroskeptic is something of a blogger’s blogger, maintaining his anonymity on his personal blog for years, and now as a blogger for Discover Magazine. He writes about a variety of topics, and they’re usually interesting whether they’re in your field or not. His nom de plume, Neuroskeptic, was a good choice. He’s not a "neuro-cynic," but rather a person who doesn’t believe in absolute truth, just like his namesake, Pyrrho of Elis, the founder of Skepticism in ancient Greece [as opposed to Dogmatism] [see my old Greek…]. Neuroskeptic brings his skeptical attitude to everything he writes. I linked to his blog about this topic, Is Science Broken?, in case you’re interested, but I wanted to talk about the specific example he’s using here, the Bonferroni Correction, as it relates to Clinical Trials.

My own biostatistics and research experience was in another medical field over forty years ago, so when I began to look at the math of clinical trials, it was familiar but only just. Besides coursework, my only hands-on experience was using ANOVA to partition the variance of interactions of effects, so there was much to learn. But I do have a Bonferroni Correction story to tell from those days. During an Immunology fellowship, my clinical work was with a Rheumatology Section. Rheumatology is like Psychiatry in that there are many conditions where the etiology [cause] was and is unknown. In the 1960s, Rheumatologists were collecting large databases on every patient they saw to develop criteria for diagnoses [sound familiar?]. Databases were new, as were the mainframe computers that held the data, entered with punch cards and stored on tapes. Statistics were run with home-grown Fortran programs that ran overnight [if you were lucky]. Bill Gates hadn’t yet made it to high school. Excel was something you did in sports. And correcting for multiple variables was something kind of new.

One afternoon, the statistician and clinical staff blocked out a two-hour conference to show us the results from the clinical database they were collecting [with great pride]. It was one of those after-lunch conferences where the eyelids are hard to hold open. Towards the end, the statistician showed us a thick stack of computer printouts with all the significant findings – disorders across the top, parameters down the side, cells filled with probabilities. Then he said something like, "Of course we had to correct the statistics for multiple measurements." I don’t remember the term Bonferroni Correction, but I do remember what he did. He divided the significance threshold by the number of things measured, and then he showed a slide of what significance remained from that thick stack of printouts. It evaporated, leaving a table that fit on one readable slide. I was pretty impressed, but he seemed deflated watching his fine p-values go up in smoke.

The logic behind correcting for multiple variables is pretty sensible, and simple. If you do an experiment and measure one outcome variable, p<0.05 means there’s less than a 1 in 20 chance of getting a result that strong when nothing real is going on. However, if you measure 20 outcome variables, on average one will come out p<0.05 by chance alone. The Bonferroni Correction is to divide the significance threshold by 20 [the number of outcome variables] – so you’d need p<0.0025 [0.05 ÷ 20] to claim the same level of significance. With 10 outcome variables, you would need p<0.005 [0.05 ÷ 10] [a small numerical sketch appears after the quote below]. Piece of cake? Well, Neuroskeptic is absolutely right. Many [if not most] Clinical Trials just ignore this correction altogether. Others try to explain not using it, like this from Morrison et al [Cognitive therapy for people with a schizophrenia spectrum diagnosis not taking antipsychotic medication: an exploratory trial, reported here recently in slim pickings…]:
"Dependent t tests were used to analyse changes in outcome measures for the normally distributed variables; non-parametric analyses using Wilcoxon’s signed ranks test were used for skewed data. Tests of significance were two-tailed, but no correction was made for multiple comparisons given that this was a feasibility study in which we were less concerned about type 1 error."
[Note: A type I error is a false positive] 
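To make that arithmetic concrete, here’s a minimal sketch in Python – simulated noise rather than anyone’s real trial data, assuming 20 independent, normally distributed outcome variables compared with a plain two-sample t-test. The sample sizes and seed are made up purely for illustration.

```python
# A toy "trial" in which the treatment does nothing: 20 outcome variables,
# all pure noise, each compared between two groups with a two-sample t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)    # arbitrary seed, for reproducibility
n_outcomes = 20                   # number of outcome variables measured
n_per_group = 50                  # hypothetical participants per arm
alpha = 0.05

p_values = np.array([
    stats.ttest_ind(rng.normal(0, 1, n_per_group),   # "treatment" arm, no real effect
                    rng.normal(0, 1, n_per_group)).pvalue
    for _ in range(n_outcomes)
])

print("nominally significant (p < 0.05):", np.sum(p_values < alpha))
# Bonferroni shrinks the significance threshold, not the p-values themselves
print("significant after Bonferroni (p < 0.0025):",
      np.sum(p_values < alpha / n_outcomes))
```

Run it a few times with different seeds and the uncorrected line will usually report one or two "findings" even though nothing real is there; the Bonferroni line almost never does.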
Well, we have a really impressive false positive problem, that’s for sure. The Bonferroni Correction is very tough on results – a harsh test. There have been other methods developed that are gentler [Holm’s step-down procedure, for one – sketched just below], but they’re not used very much either. Another point: the method of correction, like any piece of the analysis, should be declared in the a priori protocol, and that’s rarely done. The reason is obvious. Post hoc, knowing the results, you can pick your correction method [if you even pick one] to fit how you want things to come out. So Neuroskeptic is absolutely correct, this is an almost institutionalized problem in Clinical Trials – just one of the many ways people get control of what their data says – like correction for attrition, or study design, or choice of statistical tests, etc. It’s why Data Transparency is so vital – so you can see into the places where deceitful analysis can change things yet remain hidden…
and break science…
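For the curious, one of those gentler corrections is Holm’s step-down procedure. Here’s a minimal, hand-rolled sketch of how it works – the p-values at the end are made up purely for illustration, not taken from any study.

```python
def holm_significant(p_values, alpha=0.05):
    """Holm's step-down procedure: test the smallest p-value against
    alpha/m, the next smallest against alpha/(m-1), and so on, stopping
    at the first failure. Gentler than Bonferroni, but it still controls
    the chance of any false positive across the whole family of tests."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices, smallest p first
    significant = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            significant[idx] = True
        else:
            break   # once one test fails, every larger p-value fails too
    return significant

# Made-up p-values, for illustration only
print(holm_significant([0.001, 0.011, 0.020, 0.30]))
# -> [True, True, True, False]; plain Bonferroni [threshold 0.05/4 = 0.0125]
#    would have kept only the first two
```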
  1.  
    Steve Lucas
    April 15, 2015 | 11:52 AM
     

    Mickey,

    Thanks again for walking us through the math.

    Steve Lucas

  2.  
    Bernard Carroll
    April 15, 2015 | 12:16 PM
     

    Gotta agree with your praise for Neuroskeptic. Rather than nom de plume, how about calling his moniker a nom de blog?

  3.  
    James O'Brien, M.D.
    April 15, 2015 | 12:52 PM
     

    Thanks for posting this. It goes to what I have been saying, that this is not just a psychiatry/pharma problem but a systematic problem in most academic research. Rigor is the enemy of logrolling and a padded CV. But as I have commented previously, archangels could be doing research and it wouldn’t matter if the study is based on DSM or bad statistical methodology. Look at internal medicine and all the issues we have had to revisit in terms of the treatment of lipidemia without heart disease/stroke and mild hypertension. Or something as basic as low fat/low carb in diet.

    You can blame the journal editors as much as anybody including pharma and the KOLs.

  4.  
    April 16, 2015 | 2:00 PM
     

    Depression-era ‘Superstitions in Medicine’ mural by Bernard Zakheim

    http://dahsm.ucsf.edu/superstitions-in-medicine-mural-by-bernard-zakheim-at-cole-hall/

  5.  
    April 17, 2015 | 4:29 PM
     

    Great post!

    I agree with James O’Brien’s point that “this is not just a psychiatry/pharma problem but a systematic problem in most academic research.”

    On the other hand, the problem is probably worse in the case of pharma research because the profit motive is added to the universal tendency of scientists (industry or academic) to want to confirm their hypothesis rather than refute it.

  6.  
    Catalyzt
    April 18, 2015 | 1:19 AM
     

    Financial incentive would certainly seem to supercharge confirmation bias, but I’m still kind of reeling from the implication that some disciplines are going to be more vulnerable, and the comparison between rheumatology and psychiatry.

    It’s an odd coincidence that I saw my rheumatologist yesterday– he was staring at the EMR and shaking his head, saying, “Why do you get blood clots?” I pointed out that there is sort of a vague association between a positive anti-RNP (which I have) and blood clots, but that clearly didn’t satisfy him. (All the other antibody tests are negative, anticardiolipin, double-stranded DNA, blah, blah, blah.)

    He seemed… well, almost angry, though not at me. And I thought, how frustrating it must be to practice in a specialty where the etiology is never completely clear. Who can live with that level of ambiguity, and for how long? At a certain point, the temptation to fixate on an answer even if you didn’t have one would probably become a hazard. All of us probably know a psychiatrist who just gave up at a certain point, diagnosed most of his patients with the same thing, and prescribed most of them the same meds.

    What do you guys think of Perneger’s idea that the Bonferroni correction is misleading, particularly if the multiple outcomes you’re measuring aren’t closely related? I think that’s what he’s saying, anyway. The link below is from some article he wrote in 1998. I don’t know who he is, but at first blink, he doesn’t seem to be shilling for anything.

    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1112991/

    It really sucks getting interested in all this when I have virtually no natural aptitude for math. *sigh*
