study 329 iv – some challenges…

Posted on Friday 11 September 2015

The RIAT Initiative was a bright idea. Rather than simply decrying unpublished or questionable Clinical Trials, it offers the original authors/sponsors the opportunity to set things right. If they decline, the RIAT Team will attempt to do it for them with a republication. Success depends on having access to the raw trial data and on having it accepted by a peer-reviewed journal [see “a bold remedy”…]. Both the BMJ and PLoS had responded to the RIAT article by saying they would consider RIAT articles. Paxil Study 329 had certainly been proven "questionable" in the literature and in the courts. And most of the data was already in the public domain thanks to previous legal actions. So a group of us who had independently studied this trial assembled to begin working on breathing life into the RIAT concept. Dr. Jon Jureidini and his Healthy Skepticism group in Australia had mounted the original [and many subsequent] challenges to this article. He was joined there by colleagues Melissa Raven and Catalin Tofanaru. Dr. David Healy, well-known author and SSRI expert, was joined in Wales by Joanna Le Noury. Elia Abi-Jaoude in Toronto and yours truly in the hills of Georgia, USA, rounded out the group. I was certainly honored to be included. While all of us have some institutional affiliation, this project was undertaken as an unsupported and unfunded enterprise without connection to any institution. Though my own psychiatric career was primarily as a psychotherapist, in a former incarnation I was a hard-science type with both bench and statistical training. So I gravitated to the efficacy reanalysis, and that’s the part I’ll mention here and in some remarks after the paper is published.

 

The Full Study Report Acute was a 528-page document that addressed the 8-week acute phase of Paxil Study 329. The actual raw data was in additional Appendices. On the first pass through this document, we considered a number of approaches to presenting the data. In recent years, there has been a move beyond traditional significance testing towards also considering Effect Sizes. Significance tests only tell us whether groups are different; they say nothing about the magnitude of that difference. Effect Sizes approximate the strength of that difference and have found wide acceptance, particularly in meta-analyses like those produced by the Cochrane Collaboration. But in the end, we decided that our article was about more than Study 329 alone; we wanted it to model how such a study should properly be presented. And since every Clinical Trial starts with an a priori protocol that outlines how the analysis should proceed, we decided, wherever possible, to follow the original protocol’s directives.
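
About those Effect Sizes – to make the distinction concrete, here's a minimal sketch in R with invented numbers [nothing to do with the actual Study 329 data]. The t-test answers "are these groups different?"; Cohen's d, one common Effect Size, answers "by how much?":

    ## hypothetical change scores for two groups -- illustration only
    set.seed(1)
    drug    <- rnorm(90, mean = 10.5, sd = 8)   # improvement on some rating scale
    placebo <- rnorm(90, mean =  8.0, sd = 8)

    t.test(drug, placebo)$p.value               # significance: are the groups different?

    ## Cohen's d: the difference in means scaled by the pooled standard deviation
    pooled_sd <- sqrt(((length(drug) - 1) * var(drug) +
                       (length(placebo) - 1) * var(placebo)) /
                      (length(drug) + length(placebo) - 2))
    (mean(drug) - mean(placebo)) / pooled_sd    # the standardized magnitude of the difference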

Looking over the protocol, we found it comprehensive, but two things were awry. First, the comparator group was to take Imipramine, and the dose was too high for adolescents –  1.5 times the dose used in the Paxil trials for adults. That was apparent in the high incidence of side effects in that group in the study. The second thing was a remarkable absence: there was no provision for correcting for multiple variables to avoid false positives. The more variables you look at, the more likely you are to find a significant result by chance alone. There are many different correction schemes, from the stiff Bonferroni correction to a number of more forgiving ones. This study had two primary and six secondary efficacy variables. The protocol should have specified some method for correction, but it didn’t even mention the topic. Otherwise, the protocol passed muster. It was written well before the study began, and it was clear about the statistical methods to be used on completion to pass judgement on efficacy. One other question came from the protocol: how were we going to deal with missing values? The protocol defined all of the outcome variables in terms of LOCF [last observation carried forward]. In the intervening 14 years, LOCF has largely been replaced by other methods: MMRM [Mixed Model for Repeated Measures] and Multiple Imputation. We used the protocol-directed LOCF method, but at the request of reviewers and editors, we also show the Multiple Imputation analysis for comparison.
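
Neither issue is exotic. For the curious, here's how both look in a few lines of R [the numbers are made up for illustration – not from Study 329]: with eight efficacy outcomes each tested at 0.05 and no correction, the chance of at least one false positive is about a third; p.adjust() shows what the standard corrections do to a set of raw p values; and LOCF itself fits in a three-line function – a dropout's last recorded score simply stands in for every later visit.

    ## eight outcomes tested at alpha = 0.05 with no correction:
    1 - (1 - 0.05)^8                          # ~0.34 chance of at least one false positive

    ## standard corrections applied to a vector of hypothetical raw p values
    p_raw <- c(0.004, 0.013, 0.021, 0.049, 0.080, 0.150, 0.370, 0.620)
    p.adjust(p_raw, method = "bonferroni")    # the stiff one
    p.adjust(p_raw, method = "holm")          # one of the more forgiving stepwise schemes

    ## LOCF in one small function: carry the last non-missing score forward
    locf <- function(x) {
      for (i in seq_along(x)[-1]) if (is.na(x[i])) x[i] <- x[i - 1]
      x
    }
    locf(c(26, 20, 15, NA, NA))               # drops out after visit 3 -> 26 20 15 15 15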

 

I guess the only other thing to say before the paper is published is that this was quite an undertaking. There were no precedents for any aspect of this effort. I’ve mentioned just a few of the decisions we had to make along the path, but every one of them and many others are the result of a seemingly endless stream of email and drop-box communications that regularly sped around the globe. There’s no part of this paper that doesn’t have the collective input of most of the authors. There were no technicians, statisticians, or support staff involved, so we drew our own graphs, built our own tables, ran our own numbers, and checked and revised each other’s work. As with any new thing, looking back over it, it’s easy to see how it could have been a much more streamlined process. But that’s only apparent looking through a retrospectoscope. Somewhere down the line, I hope we’ll have the energy to pass on some of the many things we learned along the way to help future RIATers have an easier passage.

So in the near future, there are going to be two studies in the medical literature that reach opposite conclusions but are derived from the self-same Clinical Trial and its data. I don’t know if there’s another instance where that’s the case. After it’s published, I want to add a short series of blog posts to describe how that came about. The goals of the paper are to set the record straight and to model how a report of a Clinical Trial should be presented. But in later blog posts, I want to add a discussion of how the original analysis was twisted to make this negative study into something that was published as positive. And I hope that future RIAT restorations will do the same. The more we learn about exactly how scientific articles can be jury-rigged to reach questionable conclusions, the closer we’ll be to expunging the widespread bias that has invaded our medical literature for much too long. In the final analysis, the ultimate goal is for physicians and patients alike to have access to a scientific medical literature that can be trusted to be accurate. After all, it’s ours…
Mickey @ 8:00 AM

study 329 iii – the path to the data…

Posted on Thursday 10 September 2015


by Keller MB, Ryan ND, Strober M, Klein RG, Kutcher SP, Birmaher B, Hagino OR, Koplewicz H, Carlson GA, Clarke GN, Emslie GJ, Feinberg D, Geller B, Kusumakar V, Papatheodorou G, Sack WH, Sweeney M, Wagner KD, Weller EB, Winters NC, Oakes R, and McCafferty JP.
Journal of the American Academy of Child and Adolescent Psychiatry, 2001, 40[7]:762–772.

Objective: To compare paroxetine with placebo and imipramine with placebo for the treatment of adolescent depression.
Conclusions: Paroxetine is generally well tolerated and effective for major depression in adolescents.

Not long after Jon Jureidini and Anne Tonkin of Healthy Skepticism questioned these results in a 2003 letter to the editor, Eliot Spitzer, then Attorney General of New York State, filed a complaint in 2004 alleging fraud. GSK settled for $2.5M with an agreement to post the data from their pediatric studies of Paxil® on a public Internet Clinical Trials Registry, but admitted no wrongdoing. This would be a good place to review the various packages referred to under the heading data:

  • PROTOCOL and SAP [Statistical Analysis Plan]: We talked about these documents in the last post – detailed ‘maps’ of how the study is to be conducted and analyzed.
  • CSR: The CLINICAL STUDY REPORT is an elaborate narrative write-up of the study; in this case, it’s filled with tables and graphs. It tells the story of the clinical trial in detail. And since this trial had two phases, there are two: the Full Study Report Acute [528 pages] and the Full Study Report Continuation [264 pages]. In this case, the raw data [Appendices] was not released initially.
  • ARTICLE: This is the published article in a journal, abstracted above.
The CSRs are what GSK had posted on their Internet Clinical Trial Registry in response to settling the suit in New York in 2004, and that’s how things remained until 2012. In August 2012, I was visiting the GSK Clinical Trial Registry for some now-long-forgotten reason, and was amazed at what I found there [see a movement…]. Here’s the visual:
 
On the left is what it had always looked like before, and on the right was how it looked on that visit. And when I opened the new files, they were filled with tables and tables of raw data – the scores for every subject on every rating scale, tables filled with the logged side effects. It had been added in the previous few weeks. Was it Christmas morning?  I started asking around, and David Healy responded. It seems that Peter Doshi, the researcher working on getting the raw trial data from Roche on Tamiflu® was extending his reach. Noting that GSK had never really posted the raw data from Study 329, he contacted the current New York Attorney General and GSK finally posted all those Appendices that contained the results I’d just stumbled across. So now we can add yet another package under the heading data:

  • IPD: The INDIVIDUAL PARTICIPANT DATA is, in this case, 150 Megabytes of raw scores and other tabulations, increasing the mass of available information 100-fold! And bringing most of the trial out of the shadows.
I jumped on this new information and did my rough analysis [see the lesson of Study 329: an unfinished symphony…], naively thinking I could just send it right over to the journal [JAACAP – the Journal of the American Academy of Child and Adolescent Psychiatry] and they’d finally retract the article. No such luck [see simply ‘fuel the fire’…]. But I was in good company. Healthy Skepticism had appealed to everyone this side of the Vatican with the same frustrating responses.

When the RIAT Initiative [restoring invisible and abandoned trials] was launched in the summer of 2013, Study 329 was a prime candidate as so much of the data was already available, and there was no question that it had been abandoned. Not long after our team had formed, GSK announced that it was establishing a data portal, available to qualified groups who wanted to access the data from a previous trial [after 2007] for some further research project – generally known as Data Sharing. Access was contingent on being accepted by an independent panel of judges. Study 329 was conducted from 1994 until 1998 and published in 2001. We did not want to do a new research project. Instead, we wanted to reanalyze the raw data and potentially republish the study with a new analysis. And, as the figure above shows, we already had access to most of the information anyway. What remained?

When one does a Clinical Trial, there’s some form to fill out for every single interaction [emphasis on every] with the subject’s ID number and the date [the treatment is obviously not there in a blinded study]. By the end of things, each subject has amassed literally volumes of forms [the actual number depends on how long they stay around, how many adverse events they report, etc]. They’re called Case Report Forms [CRF], and there are plenty of them [50,000+ in Study 329]. The IPD [Individual Participant Data] is created by transcribing the CRFs into a tabularized [and more manageable] format. Why did we want them? We were specifically interested in checking the transcription of the Adverse Events from these raw forms into the tables we already had. The CRFs are the data [or at least as close to the real data as one can get].

GSK had not offered access to the CRFs as part of their Data Sharing program; the study was well before their 2007 offer; we didn’t have a new research proposal [other than the original Study 329 protocol]. On the other hand, there was that 2004 settlement in New York in which they had agreed to make the data from their pediatric Paxil® trials available. While it’s a little bit like selling you a Bible that has only Genesis and Revelation included, for the moment I’m going to forgo all the negotiations in between [see Peter Doshi’s Putting GlaxoSmithKline to the test over paroxetine]. By the beginning of 2014, we had been given access to the electronic version of the IPD and most of the CRFs [anonymized] via the remote data portal we called "the periscope" [another story for another time]:

  • CRF: The CASE REPORT FORMS are all of the forms filled out in the study along the way. They’re the snapshots by the people in direct contact with the subjects – the closest proxy to "being there."
So, in the end, we had it all. Earlier, I said "But be careful what you ask for, because once you get it, it’s a long and winding road to know quite what to do with it." Actually, it was a "long and winding road" just to get it…
Mickey @ 8:00 AM

study 329 ii – the importance of protocol…

Posted on Wednesday 9 September 2015

I sure don’t want to become 1·terminally·boring·old·man. On the other hand, this is my only available format for communicating. I want to write about the process of evaluating Trials anyway, but I also have a practical reason. Many of us have clamored for access to the raw data from Clinical Trials, realizing that a lot of the published journal articles are riddled with subtle distortions in both the efficacy and harms analyses, particularly in psychiatry. We intuitively know that if the raw data had been available to us all along, things would be a lot different, and a lot better. But be careful what you ask for, because once you get it, it’s a long and winding road to know quite what to do with it.

There are thousands of pages in various packages generated by every Clinical Trial. So processing it all is no small task – finding those trees that matter in the forest. One thing for sure – an absolutely essential element for understanding any Clinical Trial is the a priori protocol. If you’ve done any research at all, you know that once you’ve got some data in your hands, there are a bunch of different ways to analyze it. The saying, "If you torture the data long enough, it will tell you anything you want to hear" becomes very real in practice. Under any circumstances, there’s a strong temptation to try out various analytic techniques to see if the outcome doesn’t look more like you’d hoped. But in the case of a drug trial, there’s already a lot of time and significant treasure invested, meaning that the Clinical Trial results are the difference between throwing it all away and landing on a gold mine. The temptation to do some creative data analyzing is magnified exponentially in such a high-stakes game. So it’s an absolute requirement that the outcome variables and the precise analytic methods are clearly stated before the study begins.
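
To see why that matters, here's a toy simulation in R [invented data – no real drug or trial anywhere in it]: the "drug" does nothing at all, but each simulated trial gets analyzed four different ways and only the prettiest p value gets reported. The false positive rate comes out well above the nominal 5%.

    set.seed(329)
    torture_one_trial <- function(n = 90) {
      drug    <- rnorm(n)                                      # no true effect in either arm
      placebo <- rnorm(n)
      p <- c(t.test(drug, placebo)$p.value,                    # the prespecified analysis
             wilcox.test(drug, placebo)$p.value,               # "let's try it nonparametrically"
             prop.test(c(sum(drug > 0.5), sum(placebo > 0.5)),
                       c(n, n))$p.value,                       # post hoc "responder" cutoff
             t.test(drug[seq_len(n / 2)],
                    placebo[seq_len(n / 2)])$p.value)          # post hoc "subgroup"
      min(p)                                                   # keep only the best-looking result
    }
    mean(replicate(2000, torture_one_trial()) < 0.05)          # well above 0.05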

In evaluating a journal article reporting a Clinical Trial, the a priori protocol is an invaluable tool, and the first window to pry open. In the case of Study 329, the Protocol and the SAP [Statistical Analysis Plan] were together as a single document:

With the published article in hand [left], the trial itself is only a shadow. You can’t really know if the article is presenting the trial as declared before it started, or if it has been manipulated in one way or another. With the a priori protocol [right], you can evaluate the study design itself [for bias, omissions, etc] as well as compare it to the article to look for changes. So once recruitment begins, there shouldn’t be any substantive changes to the protocol. Even minor alterations should be added as official amendments to the protocol [and approved by the Institutional Review Board]. That point can’t be emphasized enough.

It may seem downright anal to insist on following the original protocol to the absolute letter. After all, people who do Clinical Trials call themselves researchers, and isn’t research supposed to be a creative endeavor? Certainly, the researcher can do any analysis he wants to do on the data. But an industry-funded Clinical Trial is, at its core, something else besides research – it’s Product Testing [creativity not invited]. One has to assume that any deviations after the study is underway are potential attempts to bias the outcome. The acronym HARKing [Hypothesizing After the Results are Known] reminds us of this danger. Non-protocol analyses or outcome variables are called exploratory; they may be very revealing, and may even be discussed in the narrative. But they’re off limits in formulating the definitive conclusion of the study. If they’re that tempting, do another Clinical Trial with those findings in the new a priori protocol.

I was a late-comer to Study 329. By the time I got involved, it already had a literature of its own from the subpoenaed documents and settled court cases. I used a lot of that in a previous series that starts with a movement… and continues for quite a while, giving something of a historical perspective [catalogued in the lesson of Study 329: an unfinished symphony…]. It’s there for the reading so I won’t repeat all of that here. When I wrote it, I’d been looking at RCTs for a while. But re-reading that series now, I can see how naive I was about the details – a novice about how Clinical Trials actually work, and how they can be distorted. I suspect I wasn’t alone in my ignorance. I’ve learned a lot being involved in our current project, and so my focus is going to be different. Last time through, I was interested in proving to myself [and maybe you] that the analysis presented in the published paper was flawed, and did not show that Paxil® was either efficacious or safe in depressed adolescents. After this two-year stint, I’ve learned a lot more about how to actually vet a Clinical Trial when you have the kind of Data Transparency we all hope is coming in the near future for all of them – what’s important and how to go through it. I hope this partial narrative of that journey will:

  • encourage other RIAT teams to look into unpublished or questionable Clinical Trials
  • help make future enterprises less grueling
  • make a contribution to future reforms in the current system
and it all starts with the a priori protocol

    a pri·o·ri  [ä′ prë-ôr′ë]
    adj.

    1. from a general law to a particular instance;
      valid independently of observation.
    2. existing in the mind independent of experience.
    3. conceived beforehand.
Mickey @ 8:00 AM

study 329 i – setting things right…

Posted on Tuesday 8 September 2015

"Will this drug help me?" "… hurt me?" "… do nothing?" "What if I don’t take it?" Questions asked as if there’s an answer. But in every case the answer is in the form of likelihoods, not certainties. Each question has "how much?" tacked on – "how much might it help me?" "… hurt me?" Sometimes the answer depends on who you are – male/female? black/white? young/old? One can go on and on with things that might affect the answer, and still only end up with a risk benefit estimate, not the simple answer you want to hear. With interventions that have been around for a while, the doctor and sometimes even the patients have the benefit of long usage that makes things a lot easier. But every new treatment has a beginning with no clinical experience to fall back on. What then? So to the Clinical Trials.

In the laboratory, you can take two groups of genetically identical animals living under the same conditions and give one group a medication and the other something inert, then compare the results. We humans are much harder. We’re not genetic clones. We live in a wide variety of ways and places. We’re a fickle lot – sometimes we don’t consistently take the medication; sometimes we miss appointments; sometimes we drop out of studies altogether. And then there’s this placebo effect thing. For reasons known and unknown, just being in a study itself often makes us significantly better, a particularly common finding in psychiatric drug trials. Thus any Clinical Trial comes out of the gate with built in variabilities and confounding factors, no matter what is being tested. Then there’s time. Most trials are short compared to the projected drug use. So a Clinical Trial is only a rough starting point at best, picking up on harms and judging efficacies in a closely attended but brief setting – an abnormal setting.

I never paid too much attention to Clinical Trials. I think I even thought the FDA did them [that’s how little attention I paid!]. New drugs that mattered didn’t come along that often, and I [we] learned about them from other sources. I remember coming into Psychiatry from Internal Medicine, and being awed by how much we talked about them. There were only a few classes with not that many member drugs in each class. That was nothing compared to where I came from. But "Why did you pick this-azine instead of that-azine?" was a frequent question, so I got into the swing of things and learned the standard lines about what seemed minor differences. By the time the "new" drugs showed up [SSRIs, SNRIs, Atypicals, Mood Stabilizers(?), etc], I was a practicing psychotherapist and mostly kept up out of habit, with lite usage. So Randomized Clinical Trials and drug approval processes are an acquired interest.

Actually, people like me who didn’t pay attention were part of our current problems. We relied on the academic community to keep us up to date about medications with journal articles and review articles. We took our required Continuing Medical Education [C.M.E.] Courses and went to our professional organizations’ meetings. And we didn’t really notice that things were slowly changing, that the firewall between the commercial medical enterprises and academic medicine had eroded a little more with every passing year. Medicine is traditionally a self-regulating profession, and we fell down on the job. So now we’re in the position of trying to reclaim things we just took for granted, and the commercially funded Randomized Clinical Trial [RCT] is right in the center of the problem.

I’ve had the opportunity to be on a RIAT team that has spent a couple of years immersed in looking back on a single Randomized Clinical Trial [RCT] that began recruitment 21 years ago, was published 14 years ago, and is now a classic – not as a breakthrough, but as a paradigm of what needs our attention. It was the SmithKline Beecham trial of Paxil® in depressed adolescents known as Study 329. Our reanalysis will be published shortly [see A Milestone in the Battle for Truth in Drug Safety]:
by MARTIN B. KELLER, M.D., NEAL D. RYAN, M.D., MICHAEL STROBER, PH.D., RACHEL G. KLEIN, PH.D., STAN P. KUTCHER, M.D., BORIS BIRMAHER, M.D., OWEN R. HAGINO, M.D., HAROLD KOPLEWICZ, M.D., GABRIELLE A. CARLSON, M.D., GREGORY N. CLARKE, PH.D., GRAHAM J. EMSLIE, M.D., DAVID FEINBERG, M.D., BARBARA GELLER, M.D., VIVEK KUSUMAKAR, M.D., GEORGE PAPATHEODOROU, M.D., WILLIAM H. SACK, M.D., MICHAEL SWEENEY, PH.D., KAREN DINEEN WAGNER, M.D., PH.D., ELIZABETH B. WELLER, M.D., NANCY C. WINTERS, M.D., ROSEMARY OAKES, M.S., AND JAMES P. MCCAFFERTY, B.S.
Journal of the American Academy of Child and Adolescent Psychiatry, 2001, 40(7):762–772.

Objective: To compare paroxetine with placebo and imipramine with placebo for the treatment of adolescent depression.
Method: After a 7 to 14-day screening period, 275 adolescents with major depression began 8 weeks of double-blind paroxetine [20–40 mg], imipramine [gradual upward titration to 200–300 mg], or placebo. The two primary outcome measures were endpoint response [Hamilton Rating Scale for Depression [HAM-D] score <8 or >50% reduction in baseline HAM-D] and change from baseline HAM-D score. Other depression-related variables were [1] HAM-D depressed mood item; [2] depression item of the Schedule for Affective Disorders and Schizophrenia for Adolescents-Lifetime version [K-SADS-L]; [3] Clinical Global Impression [CGI] improvement scores of 1 or 2; [4] nine-item depression subscale of K-SADS-L; and [5] mean CGI improvement scores.
Results: Paroxetine demonstrated significantly greater improvement compared with placebo in HAM-D total score <8, HAM-D depressed mood item, K-SADS-L depressed mood item, and CGI score of 1 or 2. The response to imipramine was not significantly different from placebo for any measure. Neither paroxetine nor imipramine differed significantly from placebo on parent- or self-rating measures. Withdrawal rates for adverse effects were 9.7% and 6.9% for paroxetine and placebo, respectively. Of 31.5% of subjects stopping imipramine therapy because of adverse effects, nearly one third did so because of adverse cardiovascular effects.
Conclusions: Paroxetine is generally well tolerated and effective for major depression in adolescents.
When I had my first crack at this article three years ago [the lesson of Study 329: an unfinished symphony…], I didn’t know as much as I thought I did. What I’ve learned in the interim are some things about how the Clinical Trial systems work in practice, how Clinical Trial data are recorded and catalogued, how to analyze the data, and just how easy it was to turn a negative trial into a paper that was accepted by The Journal of the American Academy of Child and Adolescent Psychiatry where it remains, a stubborn reminder of a bygone era like the rebel battle flag that flew at the capitol of South Carolina until recently…
Mickey @ 7:30 AM

anything goes…

Posted on Monday 7 September 2015

"There is only one difference between a bad economist and a good one: the bad economist confines himself to the visible effect; the good economist takes into account both the effect that can be seen and those effects that must be foreseen…"
                French Journalist/Economist Frédéric Bastiat

The Law of Unintended Consequences is an all-too-frequent force in the best-laid plans of mice and men – derailing the most well-meant reforms. It’s like an unseen ghost, lurking behind every tree just waiting to pop out when you least expect it. At the risk of a mundane example, the kudzu on the sides of some of our Southern byways was actively imported by the TVA in its early days to control erosion, but lingers in perpetuity, choking everything in its path.

Back in 2001 when Paxil Study 329 was first published, it was a genuine shock to discover that it was ghost-written by professional medical writer Sally Laden, who created the first draft of that article and oversaw the subsequent revisions [and other paperwork]:
Children’s Hospital of Philadelphia [Dr. Weller]; North America Medical Affairs, GlaxoSmithKline, Collegeville, PA [Ms. Oakes, Mr. McCafferty]. This study was supported by a grant from GlaxoSmithKline, Collegeville, PA. The authors acknowledge the contributions of the following individuals: Jill M. Abbott, Ellen Basian, Ph.D., Carolyn Boulos, M.D., Elyse Dubo, M.D., Mary A. Fristad, Ph.D., Joan Hebeler, M.D., Kevin Kelly, Ph.D., Sharon Reiter, M.D., and Ronald A. Weller, M.D. Editorial assistance was provided by Sally K. Laden, M.S.
By then, the funding source and the authors who were company employees were also being regularly acknowledged, though the COIs of the authors weren’t mentioned; ClinicalTrials.gov for trial registration was in its infancy; and ProPublica/Sunshine Act declarations were just a dream in the minds of a few:
    In olden days, a glimpse of stocking
    Was looked on as something shocking.
    Now heaven knows,
    Anything goes…
    Good authors too who once knew better words
    Now only use four-letter words
    Writing prose.
    Anything goes…
                      Cole Porter 1934
But have the required declarations made a difference? I’m sure they have made some difference, but like the statistical differences in many of our Clinical Trials, is it enough of a difference to really matter? Has it had the desired effect? I’m actually beginning to think that the old Law of Unintended Consequences is operating here, and that the insistence on declaring conflicts of interest may have had a paradoxical effect and increased our tolerance for Conflicts of Interest and industry involvement in scientific/academic matters. It’s a hypothesis I don’t care much for. For example, the recent Clinical Trials of the late-coming Atypical Antipsychotic, Brexpiprazole [the spice must flow…, how many stars?…]:
by Correll CU, Skuban A, Ouyang J, Hobart M, Pfister S, McQuade RD, Nyilas M, Carson WH, Sanchez R, and Eriksson H.
American Journal of Psychiatry. 2015 172[9]:820-821.
From the Zucker Hillside Hospital, Glen Oaks, N.Y.; Otsuka Pharmaceutical Development & Commercialization, Princeton, N.J.; and H.  Lundbeck A/S, Valby, Copenhagen, Denmark.

Funded by Otsuka Pharmaceutical Development & Commercialization, Inc., and H. Lundbeck A/S. Jennifer Stewart, M.Sc. [QXV Communications, Macclesfield, U.K.] provided writing support that was funded by Otsuka Pharmaceutical Development & Commercialization, Inc., and H. Lundbeck A/S.

Dr. Correll has been a consultant and/or advisor to or has received honoraria from Actelion, Alexza, American Academy of Child and Adolescent Psychiatry, Bristol-Myers Squibb, Cephalon, Eli Lilly, Genentech, Gerson Lehrman Group, IntraCellular Therapies, Lundbeck, Medavante, Medscape, Merck, National Institute of Mental Health, Janssen/J&J, Otsuka, Pfizer, ProPhase, Roche, Sunovion, Takeda, Teva, and Vanda; he has received grant support from Bristol-Myers Squibb, Feinstein Institute for Medical Research, Janssen/J&J, National Institute of Mental Health, NARSAD, and Otsuka; and he has been a Data Safety Monitoring Board member for Cephalon, Eli Lilly, Janssen, Lundbeck, Pfizer, Takeda, and Teva. Drs. Skuban, Ouyang, Hobart, McQuade, Nyilas, Carson, and Sanchez and Ms. Pfister are employees of Otsuka Pharmaceutical Development & Commercialization, Inc. Dr. Eriksson is an employee of, and owns stock in, H.  Lundbeck A/S.
by Kane JM, Skuban, Ouyang, Hobart, Pfister, McQuade, Nyilas, Carson, Sanchez, and Eriksson.
Schizophrenia Research. 2015 164[1-3]:127-35.
Contributors: Drs Kane, Skuban, Youakim, Hobart, Pfister, McQuade, Nyilas, Carson and Sanchez designed the study and wrote the protocol. Drs Kane, Skuban, McQuade and Eriksson contributed to interpretation of the data, and Dr Ouyang performed the statistical analysis. All authors contributed to and have approved the final manuscript. Ruth Steer, PhD, [QXV Communications, Macclesfield, UK] provided writing support, which was funded by Otsuka Pharmaceutical Development & Commercialization, Inc. [Princeton, USA] and H. Lundbeck A/S [Valby, Denmark].
Conflict of interest Dr Kane has been a consultant for Amgen, Alkermes, Bristol-Meyers Squibb, Eli Lilly, EnVivo Pharmaceuticals [Forum] Genentech, H. Lundbeck. Intracellular Therapeutics, Janssen Pharmaceutica, Johnson and Johnson, Merck, Novartis, Otsuka, Pierre Fabre, Proteus, Reviva, Roche and Sunovion. Dr Kane has been on the Speakers Bureaus for Bristol-Meyers Squibb, Eli Lilly, Janssen, Genentech and Otsuka, and is a shareholder in MedAvante, Inc. Drs Skuban, Ouyang, Hobart, Pfister, McQuade, Nyilas, Carson and Sanchez are employees of Otsuka Pharmaceutical Development & Commercialization, Inc. Dr Eriksson is an employee of H. Lundbeck A/S.
And, as I mentioned in the spice must flow…, there is only one academic author for each article, and both authors are at the Feinstein Institute for Medical Research. Both articles say:
From the Zucker Hillside Hospital, Glen Oaks, N.Y.; Otsuka Pharmaceutical Development & Commercialization, Princeton, N.J.; and H. Lundbeck A/S, Valby, Copenhagen, Denmark.
So like Cole Porter said:
    Now heaven knows,
    Anything goes…
I think if we had seen this much openly declared industry imprint back in 2001 [the days of Study 329], there would have been a loud general outcry [rather than just this complaint on my little blog here on the edge of the galaxy]. These articles are openly industry productions with all but two authors employed directly by industry. Both studies used 60 [!] sites [for rapidity] all over the world. They’re both ghost-written and the sole academic authors are from the same department and themselves heavily loaded with COI. We should be up in arms that two first-line journals published such obviously tainted articles. But unless I missed it, nobody has had much to say. So, as to that Law of Unintended Consequences, I’m wondering if our insistence on demanding these disclosures hasn’t sent the message that this kind of publication is fine. And that what was intended to be a check on tainted Clinical Trials has turned into a tolerance – a permission to publish them in this form. It damn sure hasn’t put a stop to them…
Mickey @ 8:21 PM

the growing cry…

Posted on Saturday 5 September 2015

    em·bar·go
    noun

    1. an official ban on trade or other commercial activity with a particular country.
      "an embargo on grain sales"
    verb

    1. impose an official ban on (trade or a country or commodity).
      "the country has been virtually embargoed by most of the noncommunist world"
    2. seize (a ship or goods) for state service.
The video in the last post [background music…] is of a talk Dr. Healy gave exactly a year ago when we first submitted our RIAT article about Paxil Study 329 to the British Medical Journal [BMJ]. If you watched it, you know that it’s a historical review of the Clinical Trial and the article that appeared in the Journal of the American Academy of Child and Adolescent Psychiatry [JAACAP] in 2001. Notice that he doesn’t talk directly about what we said in our article, which is a second look at the data from that study. That’s because, like most academic journals, the BMJ requests an embargo on discussing an article submitted for publication until it is either rejected or actually published. I hasten to add that this kind of embargo makes perfect sense to me for any number of reasons. So I’m not complaining.

But even though I understand and even approve of the embargo, that doesn’t mean that I enjoy waiting for publication to talk about our paper. I’ve been thinking about that Keller et al article for five years now, and actively working on our RIAT paper for two years. So it’s hard to think about much else these last several weeks [as in, my recent posts are monotonous, mostly about RCTs]. But I’ll have to admit that the wait has had something of a positive effect in that it has focused my thinking onto an important topic. You guessed it – the topic is embargoes – specifically, the pharmaceutical industry’s embargo on the primary data from their Clinical Trials.

Since I was a late-comer to the ways of Clinical Trials, it took me a while to catch up. After looking at more RCTs than I’d like to admit, I realized that they were a big problem. And I was frustrated that I couldn’t get at the actual data, but I assumed that industry’s embargo was something that was backed up by some Law or Act. But that wasn’t even slightly right [repeal the proprietary data act…, except where necessary to protect the public…]. They keep the data from RCTs secret because they want to, not because they have a legal right to. It’s an embargo like our embargo on trade with Cuba or Iran, a power play designed to force a desired outcome. What desired outcome? To make us accept the version presented in some published [often deceitfully written] article in a journal. And now that I’m thinking about it, I’m not sure that embargo is the right maritime metaphor for their keeping the actual data secret. Maybe …
    block·ade
    noun

    1. an act or means of sealing off a place to prevent goods or people from entering or leaving.
      "there was a blockade of humanitarian aid"
    verb

    1. seal off (a place) to prevent goods or people from entering or leaving.
      "Blackbeard blockaded the Charleston Harbor"
Blackbeard's blockade of the Charleston Harbor
… would be a more accurate choice of terms. And what makes it worse, the regulatory agencies [FDA, EMA] have been enforcers of the blockade that keeps us from being able to examine the data for ourselves. Medicine is traditionally self-regulating. How can we do that if we can’t see the data? And so our article is about more than bringing the data from one Clinical Trial out into the daylight. It’s an example of what can be learned in general from the examination of the raw data when conducted by people who don’t work for the company [that’s us] – who don’t have the kinds of conflicts of interest that are ubiquitous in these Industry funded RCTs [that’s us too]. The goal is, of course, to add our voices to the growing cry to make the actual raw data available for every Clinical Trial…
Mickey @ 2:26 PM

background music…

Posted on Friday 4 September 2015

a little background music from David Healy…
Mickey @ 8:00 AM

how many stars?…

Posted on Thursday 3 September 2015

    pen·ance
    noun: penance; plural noun: penances

    1. voluntary self-punishment inflicted as an outward expression of repentance for having done wrong.
      "he had done public penance for those hasty words"
    2. a Christian sacrament in which a member of the Church confesses sins to a priest and is given absolution.
At the first of the month, I usually scan the major journals. I think I do it as penance for having stopped reading them altogether in the latter part of my practice years. I have the likely irrational idea that if I and the rest of psychiatry had insisted that our journals maintain some kind of standard, maybe we could have short-circuited some of what happened over the last couple of decades.

This month, the second article was an editorial about a new Atypical Antipsychotic, Brexpiprazole. I’d read about half of it when I realized I already knew it, because I’d written about it when the article in this issue was published in advance on-line back in April [the spice must flow…]. I’d forgotten the name. This editorial ends with:
…In summary, as we learned from the CATIE study, we will not know with any certainty how a new antipsychotic compares with other agents currently on the market until more comparative data are available, ideally from head-to-head randomized trials. This information is important not only to justify the higher cost compared with generic agents but also to guide shared decision making. It is an appealing notion that a clinician could select from a collection of D2 partial agonists with a range of intrinsic activities the agent most suitable for an individual patient on the basis of sensitivity to side effects and efficacy requirements. Is a patient likely to experience a Goldilocks response and find brexpiprazole to be “just right” after disappointing experiences with aripiprazole and D2 antagonists? Possibly, but it’s not clear that the space that brexpiprazole occupies between aripiprazole and D2 antagonist antipsychotics is wide enough to be clinically relevant; whether brexpiprazole will differentiate itself from other agents in head-to-head comparisons or in clinical practice remains to be seen.
Before I looked at the article about the Clinical Trial [that I’d already reported on], I tried to recall what I’d thought when I wrote about it earlier. I remembered that it was a ghost-written, industry-funded RCT with only one academic KOL author and a slew of company authors. I recalled that it was conducted across 60 sites and that the drug itself was an Abilify clone:
I remembered nothing about the drug itself:
by Correll CU, Skuban A, Ouyang J, Hobart M, Pfister S, McQuade RD, Nyilas M, Carson WH, Sanchez R, and Eriksson H.
American Journal of Psychiatry. 2015 172[9]:820-821.

That post [the spice must flow…] mentioned another companion on-line article that has since been published as well:
by Kane JM, Skuban, Ouyang, Hobart, Pfister, McQuade, Nyilas, Carson, Sanchez, and Eriksson.
Schizophrenia Research. 2015 164[1-3]:127-35.

I had found some other articles [prelapse: prequel 1…, prelapse: prequel 2…, and prelapse: but there’s more…]. They were about the Long Acting Injectable versions of the Atypical Antipsychotics coming onto the market now that the drugs themselves have gone off-patent – one of which was an Abilify Injectable.


It’s not working – my doing penance. I’ve been doing the scan-the-journals thing for five or six years, and it hasn’t changed anything. These are just drug company advertisements, the roll-out for an Abilify patent-extending campaign. The poor editorial writer could hardly find anything to say about Brexpiprazole. I can hardly think of anything to say about Brexpiprazole. These articles will make good handouts for the drug reps as they go from place to place trying to convince practitioners to prescribe Brexpiprazole [on patent] instead of Abilify [now generic]. And "yes", the RCTs are lined up for the full complement of indications [notice where the emphasis is]:


I guess there are some things that have changed. They don’t hire on as many KOLs as they used to – just one per article. The ghost writers still haven’t made it to the author byline, but we immediately know that they’re there. And, for that matter, we didn’t used to even know it was industry funded. It was pretty easy to find the Effect Sizes and compare Brexpiprazole to the other Atypicals. I guess that’s progress of sorts. But it doesn’t change the fact that these are advertisements in a first-line psychiatric journal. 

So I needed a new counter-top Microwave Oven. I moved on to amazon.com and used the "stars" from other shoppers to make my selection. Maybe the American Journal of Psychiatry could institute something like that real soon. It would be easier than making somebody have to think up things to say in an editorial…
Mickey @ 10:16 AM

originator bias?…

Posted on Tuesday 1 September 2015

In our recent project, I had to bone up on my statistics. It was actually pretty interesting in that the statistical tests themselves haven’t changed all that much since my hard science days. But it wasn’t like riding a bicycle exactly, more like going to a class reunion where there’s an awkward start, but with a little catching up, the old familiarity returns. While the statistics came back quickly, the implementation was all new. SPSS was unrecognizable. The newer SAS required SAS programming training. But then there’s R [just "R"], a free, open-source, command-line statistical package put together by the academic community that’s a thing of great beauty. But learning the various procedures, each carrying the idiosyncrasies of its individual creator, meant going through a number of tutorials along the way.

That’s a very long introduction to this – a lot of the tutorials had examples from studies done by social psychologists. After all, who teaches the Statistics courses? Often statistics professors come from that very discipline. And over and over, working through the examples, I thought about the softness of the experiments compared to medicine [even psychiatry]. I don’t mean that disparagingly. It’s the nature of their subject matter. The study examples were kind of interesting in their own right, and I think it prepared me for this report about an article in Science [Estimating the reproducibility of psychological science] that was a major undertaking – having 100 studies from their main journals repeated by other unrelated groups and comparing the outcomes. I wasn’t as surprised as the press seemed to think I ought to be at the low reproducibility figures:
New York Times
by Benedict Carey
AUG. 27, 2015

The past several years have been bruising ones for the credibility of the social sciences. A star social psychologist was caught fabricating data, leading to more than 50 retracted papers. A top journal published a study supporting the existence of ESP that was widely criticized. The journal Science pulled a political science paper on the effect of gay canvassers on voters’ behavior because of concerns about faked data.

Now, a painstaking years long effort to reproduce 100 studies published in three leading psychology journals has found that more than half of the findings did not hold up when retested. The analysis was done by research psychologists, many of whom volunteered their time to double-check what they considered important work. Their conclusions, reported Thursday in the journal Science, have confirmed the worst fears of scientists who have long worried that the field needed a strong correction…

The vetted studies were considered part of the core knowledge by which scientists understand the dynamics of personality, relationships, learning and memory. Therapists and educators rely on such findings to help guide decisions, and the fact that so many of the studies were called into question could sow doubt in the scientific underpinnings of their work.

“I think we knew or suspected that the literature had problems, but to see it so clearly, on such a large scale — it’s unprecedented,” said Jelte Wicherts, an associate professor in the department of methodology and statistics at Tilburg University in the Netherlands…
New York Times
by Benedict Carey
AUG. 28, 2015

The field of psychology sustained a damaging blow Thursday: A new analysis found that only 36 percent of findings from almost 100 studies in the top three psychology journals held up when the original experiments were rigorously redone.

After the report was published by the journal Science, commenters on Facebook wisecracked about how “social” and “science” did not belong in the same sentence.

Yet within the field, the reception was much different. Along with pockets of disgruntlement and outrage — no one likes the tired jokes, not to mention having doubt cast on their work — there was a sense of relief. One reason, many psychologists said, is that the authors of the new report were fellow researchers, not critics. It was an inside job…
by the Open Science Collaboration
Science 349,aac4716 [2015].

Abstract:
Reproducibility is a defining feature of science, but the extent to which it characterizes current research is unknown. We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline. Ninety-seven percent of original studies had statistically significant results. Thirty-six percent of replications had statistically significant results; 47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects. Correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.

Conclusion:
After this intensive effort to reproduce a sample of published psychological findings, how many of the effects have we established are true? Zero. And how many of the effects have we established are false? Zero. Is this a limitation of the project design? No. It is the reality of doing science, even if it is not appreciated in daily practice. Humans desire certainty, and science infrequently provides it. As much as we might wish it to be otherwise, a single study almost never provides definitive resolution for or against an effect and its explanation. The original studies examined here offered tentative evidence; the replications we conducted offered additional, confirmatory evidence. In some cases, the replications increase confidence in the reliability of the original results; in other cases, the replications suggest that more investigation is needed to establish the validity of the original findings. Scientific progress is a cumulative process of uncertainty reduction that can only succeed if science itself remains the greatest skeptic of its explanatory claims.

The present results suggest that there is room to improve reproducibility in psychology. Any temptation to interpret these results as a defeat for psychology, or science more generally, must contend with the fact that this project demonstrates science behaving as it should. Hypotheses abound that the present culture in science may be negatively affecting the reproducibility of findings. An ideological response would discount the arguments, discredit the sources, and proceed merrily along. The scientific process is not ideological. Science does not always provide comfort for what we wish to be; it confronts us with what is. Moreover, as illustrated by the Transparency and Openness Promotion [TOP] Guidelines [http://cos.io/top], the research community is taking action already to improve the quality and credibility of the scientific literature.

We conducted this project because we care deeply about the health of our discipline and believe in its promise for accumulating knowledge about human behavior that can advance the quality of the human condition. Reproducibility is central to that aim. Accumulating evidence is the scientific community’s method of self-correction and is the best available option for achieving that ultimate goal: truth.
I rearranged the frequency plots from the figure to clarify the central point. The effect sizes fell by half and the number that were statistically significant by two thirds. I guess they expected some fall in reproducibility, but nothing quite so dramatic. It’s a wake-up call for their field, actually for all of us – replication being the gold standard in scientific experimentation and analysis:

Reading through this paper, I don’t think there was much of the kind of problem we so often run into in the Clinical Trials of medications I follow in this blog. There was open sharing of protocols, materials, and methodology between the original investigators and the groups repeating the studies – no ghost writers or jury-rigged and obfuscated analyses. And yet the replication rate was still a lot lower than anticipated.

Even in this situation absent the on-purpose biases we deal with in many of the pharmaceutical trials, it seems like there’s an intrinsic bias present when someone conducts a study of their own design. I’ll bet the bias-ologists have a name for it. Looking only at the Effect Sizes, in a repeat study by a non-originator, the net strength of the effect generally falls, often precipitously – and it’s not just for the weaker studies, but to a lesser extent, all across the range:

I realize that I’m shamelessly co-opting this data for my own purposes, but I just thought it was striking that in this study-of-studies, which is likely not so suffused with the on-purpose biases we’re looking for in RCTs of medications, the results of testing a pet hypothesis [or drug?] tend towards inflation [even without obvious "cheating"]. This may be a well-known phenomenon that some commenter can tell us all about, but it’s not so well known by me. It’s all the more reason to be pristine in conducting a trial or experiment and in looking for independent confirmation – replication. Meta-analysis won’t correct for this kind of originator bias in that it’s usually a meta-analysis of a group of pet hypotheses…
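
For what it's worth, here's my amateur guess at the mechanism, as a toy simulation in R [invented numbers, not the Science project's actual data]: if original studies only get reported when they clear p < 0.05, the effects that make it into print are the lucky overestimates, and an honest replication simply regresses back toward the true value – no cheating required.

    set.seed(100)
    true_effect <- 0.3                          # a modest real effect
    n <- 30                                     # deliberately small samples
    run_study <- function() {
      a <- rnorm(n, mean = true_effect); b <- rnorm(n)
      list(effect = mean(a) - mean(b), p = t.test(a, b)$p.value)
    }
    one_pair <- function() {
      original <- run_study()
      if (original$p >= 0.05) return(NULL)      # non-significant originals never see print
      replication <- run_study()                # the replication gets reported either way
      c(original = original$effect, replication = replication$effect)
    }
    pairs <- Filter(Negate(is.null), replicate(5000, one_pair(), simplify = FALSE))
    colMeans(do.call(rbind, pairs))             # originals average well above 0.3; replications sit near it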

I obviously spent some time thinking about this report. The authors seemed worried that they would discredit their discipline with this low reproducibility finding. I felt the opposite, impressed that they were examining the precision of their metrics. Because of the subjectivity of the social sciences, it felt like familiar territory to my own corner of things, psychotherapy, where confirmation is so ethereal and replication is king. But I also thought that it was a humbling reminder that our scientific evidence-based tools [our graphs, and tables, and statistics, etc.] are just crude attempts to simplify and objectify the world around us – mere proxies for the infinite variability of the nature we’re trying to understand…
Mickey @ 2:59 PM

important work…

Posted on Saturday 29 August 2015

Well look here. Those two guys from in the details…, Tom Jefferson and Peter Doshi, just popped up again. I didn’t know we’d hear from them again within the week! And they brought a new friend. I know that the topic of Clinical Study Reports [CSRs] isn’t the sexiest of blog topics, but important things are like that – hiding down in the cracks or behind a bush. These Clinical Trials and their reporting are in the absolute eye of a storm that threatens to overrun medical care with the influence of commercial interests. Not that commercial forces are intrinsically evil, but they’re hardly self-regulating and need to be held in check by strong ethical and scientific watchdogs. So the work of people like Jefferson and Doshi is a vital piece of the quality of healthcare that will be delivered far downstream from their own labs and computers:
British Medical Journal
by Khaled El Emam, Tom Jefferson, Peter Doshi
27 Aug, 15

In late 2010, the European Medicines Agency [EMA] became the first regulator in history to promulgate a freedom of information policy that covered the release of manufacturer submitted clinical trial data. Under a separate, new policy [policy 070], the EMA will take an additional step and create a web based platform for sharing manufacturers’ clinical study reports [CSRs] upon a decision being made on a marketing authorization application or its withdrawal.

CSRs contain significant details that are often missing in journal publications of the same trials—for example, details pertaining to patient relevant outcomes and adverse events—and are an important new tool for those engaged in research synthesis. While the policy anticipates that the agency will require individual participant data [IPD] to also be shared, the EMA has not yet committed to a final timeline for this.

But as the EMA works towards finalizing its guidance on the anonymization of CSRs, some companies and industry initiated guidance may be promoting practices that would diminish the value of the data the regulator ultimately distributes. For example, one recent industry guidance favors the redaction and removal of significant standard content in CSRs, ostensibly in an effort to have simple rules for anonymizing these documents. This includes the removal of patient narratives [for example, of serious adverse events and patient dropouts]; line listings [tables of individual level information about participants]; and the redaction of all patient demographics, dates of birth, and other items such as event or assessment dates.

Simple rules have the advantage of being easy to understand and do not require much sophistication to implement. Unfortunately, the major disadvantage is the resulting extensive information loss across the board. CSRs are already written without the use of directly identifying personal information, and maintaining as much of the original information in the CSRs is important to be able to perform accurate analysis—for example, to evaluate the risk of bias of trials.

Thus far, the EMA’s draft guidance has erred towards less redaction of already partially anonymized CSRs, and away from blanket removal and redaction. It instead advocates a more nuanced risk analysis in compliance with recommendations from EU data protection authorities in order to maximize scientifically useful information in the CSR. The suggested approaches for further anonymization include selective masking/redaction, randomization, and generalization techniques…
When the European Medicines Agency announced that they would begin to put the Clinical Study Reports in the public domain, it was an enormous shot in the arm for Data Transparency. The pharmaceutical industry has mounted an equally enormous campaign to undermine that promise [remember the Abbvie/Intermune suits? see the timeline at ema data transparency…]. And Jefferson and Doshi are in the middle of that game – fighting to stop the EMA from backing down under pressure. They’re worrying here that if there’s not a strong demand for this data, momentum might be lost:
Today, demand for data is an important driver of investments in clinical trial data sharing infrastructure, and it is debatable whether demand is growing as rapidly as some have expected or hoped. The GSK initiated ClinicalStudyDataRequest.com portal reports 104 valid requests [as of end of June 2015]; Project Data Sphere reports 900 authorized users; and the EMA, under its 2010 freedom of information based policy, has released over 2 million pages of regulatory documents. While this access has resulted in some publications and high profile research, such as the Cochrane review of neuraminidase inhibitors, one hopes for more…
I understand their concern, having spent a couple of years working with a RIAT team on just such a  project. As I’ve said, it’s one thing to lobby for data transparency, but quite another to know what to do with it once you’ve got it.

For the moment, there’s no infrastructure or funding support for such enterprises, and it’s certainly a lot of work. We were on our own. But it was plenty rewarding and well worth the effort. It’s the perfect kind of project for graduate students and junior faculty who need a challenge that will flex a wide range of analytic skills with a definable output. I’m sure I’ve learned as much doing this as back during my jurassic age in an NIH fellowship. I can even imagine "watch-dogging" unpublished or questionable studies and reevaluating them as coming into the domain of some group like the Cochrane Collaboration or a similarly structured  independent academy.

But the point right now isn’t how to incorporate increased Data Transparency into some future formal scheme, because we don’t yet really even have that kind of access. Right now, all we can do is support the persistence of people like these "Tamiflu guys," Jefferson and Doshi, in their important work. The EMA has postponed its posting of the IPD [Individual Participant Data], which is required for a thorough vetting of trials, and we can only hope that the fight for that data will be waged with the same fervor…

ICH HARMONISED TRIPARTITE GUIDELINE
STRUCTURE AND CONTENT OF CLINICAL STUDY REPORTS – E3
30 November 1995
Mickey @ 8:00 AM