study 329 – something new…

Posted on Wednesday 28 September 2016

Well, our second Paxil Study 329 paper was published at the end of last week. I waited to mention it here until David Healy had a post about it – out today [see Study 329 Continuation Phase]. We originally submitted it to the Journal of the American Academy of Child and Adolescent Psychiatry, which turned it down [their peer review comments are on our website Restoring Study 329 – interesting in their own right]. I think what I’ll do is show a couple of graphs from that data, then reverse my usual m.o. by talking about it first and ending with the abstract:

Paxil Study 329 had a Continuation Phase where they followed the responders only, blinded on the same meds for six months. In the a priori Protocol, it was a Secondary Outcome Variable hoping to measure the relapse rate. They didn’t mention it in Keller et al. I think they must’ve looked at that upper graph of the drop-out rate and shied away from the Continuation Phase altogether. The lower graph has the Raw HAM-D scores and, as expected, they showed no differences. But we never said that this was a badly designed study. To the contrary, it’s better than most and this six month follow-up data is about the only longer term SSRI dataset around, certainly in kids – so we decided to take a look.

In our original RIAT paper [Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence], we wanted to analyze the data as it should have been analyzed in the first place – according to the a priori Protocol [see faith-based medicine…]. We were able to do that with the efficacy data. We couldn’t exactly do that with the safety analysis. For one thing, the Protocol didn’t specify a system. For another, the system used by Keller et al obscured suicidality. So we used a more modern, more appropriate system. But even with that change, we remained in the Hypothesis Testing mode. However, with this Continuation Phase, from my point of view we were doing something else – called Material-Exploration by Adriaan de Groot in his 1956 classic [The Meaning of “Significance” for Different Types of Research] [see also the hope diamond…]:
Hypothesis Testing Research vs Material-Exploration

Scientific research and reasoning continually pass through the phases of the well-known empirical-scientific cycle of thought: observation – induction – deduction – testing [observe – guess – predict – check]. The use of statistical tests is of course first and foremost suited for “testing”, i.e., the fourth phase. In this phase one assesses whether certain consequences [predictions], derived from one or more precisely postulated hypotheses, come to pass. It is essential that these hypotheses have been precisely formulated and that the details of the testing procedure [which should be as objective as possible] have been registered in advance. This style of research, characteristic for the [third and] fourth phase of the cycle, we call hypothesis testing research.

This should be distinguished from a different type of research, which is common especially in [Dutch] psychology and which sometimes also uses statistical tests, namely material-exploration. Although assumptions and hypotheses, or at least expectations about the associations that may be present in the data, play a role here as well, the material has not been obtained specifically and has not been processed specifically as concerns the testing of one or more hypotheses that have been precisely postulated in advance. Instead, the attitude of the researcher is: “This is interesting material; let us see what we can find.” With this attitude one tries to trace associations [e.g., validities]; possible differences between subgroups, and the like. The general intention, i.e. the research topic, was probably determined beforehand, but applicable processing steps are in many respects subject to ad hoc decisions. Perhaps qualitative data are judged, categorized, coded, and perhaps scaled; differences between classes are decided upon “as suitable as possible”; perhaps different scoring methods are tried along-side each other; and also the selection of the associations that are researched and tested for significance happens partly ad-hoc, depending on whether “something appears to be there”, connected to the interpretation or extension of data that have already been processed.

When we pit the two types so sharply against each other it is not difficult to see that the second type has a character completely different from the first: it does not so much serve the testing of hypotheses as it serves hypothesis-generation, perhaps theory-generation — or perhaps only the interpretation of the available material itself…

If you take only one thing away from this entire 1boringoldman blog, let this be it. What’s been wrong with the clinical trial literature is that the papers are written as if they are some kind of anything-goes, free-wheeling Material Explorations with changing outcomes, creative statistics, and speculations-presented-as-facts. That’s dead wrong. They are Hypothesis Testing enterprises that require every bit of the rigor and attention to protocol described by de Groot. Product Testing exercises, not Exploratory Research! Hypothesis Testing, not Material-Exploration! …End of Sermon…

Now back to our Paxil Study 329 Continuation Phase paper. I’m not even going to try to summarize it because fellow author David Healy has done such a good job in Study 329 Continuation Phase. He and Jo Le Noury have a collective knack for looking at adverse event data. We did find some things after all, in spite of the drop-out rate – primarily by looking closely at the timing and various states of medication use. So look over the paper and be sure to read David’s posts, the one today and the one coming next week, for the details of what we found. Some pretty interesting Material Explorations in my book. Here’s another graphic and the abstract:

by Le Noury, Joanna; Nardo, John M; Healy, David; Jureidini, Jon; Raven, Melissa; Tufanaru, Catalin; and Abi-Jaoude, Elia.
International Journal of Risk & Safety in Medicine. 2016 28[3]:143-161.

OBJECTIVE: This is an analysis of the unpublished continuation phase of Study 329, the primary objective of which was to compare the efficacy and safety of paroxetine and imipramine with placebo in the treatment of adolescents with unipolar major depression. The objectives of the continuation phase were to assess safety and relapse rates in the longer term. The objective of this publication, under the Restoring Invisible and Abandoned Trials [RIAT] initiative, was to see whether access to and analysis of the previously unpublished dataset from the continuation phase of this randomized controlled trial would have clinically relevant implications for evidence-based medicine.
METHODS: The study was an eight-week double-blind randomized placebo-controlled trial with a six month continuation phase. The setting was 12 North American academic psychiatry centres, from 20 April 1994 to 15 February 1998. 275 adolescents with major depression were originally enrolled in Study 329, with 190 completing the eight-week acute phase. Of these, 119 patients [43%] entered the six-month continuation phase [paroxetine n=49; imipramine n=39; placebo n=31], in which participants were continued on their current treatment, blinded. As per the protocol, we have looked at rates of relapse [based on Hamilton Depression Scale scores] across both acute and continuation phases, and generated a safety profile for paroxetine and imipramine compared with placebo for up to six months. ANOVA testing [generalized linear model] using a model including effects of site, treatment and site x treatment interaction was applied. Otherwise we used only descriptive statistics.
RESULTS: Of patients entering the continuation phase, 15 of 49 for paroxetine [31%], 12 of 39 for imipramine [31%] and 12 of 31 for placebo [39%] completed as responders. Across the study, 25 patients on paroxetine relapsed [41% of those showing an initial response], 15 on imipramine [26%], and 10 on placebo [21%]. In the continuation and taper phases combined there were 211 adverse events in the paroxetine group, 147 on imipramine and 100 on placebo. The taper phase had a higher proportion of severe adverse events per week of exposure than the acute phase, with the continuation phase having the fewest events.
CONCLUSIONS: The continuation phase did not offer support for longer-term efficacy of either paroxetine or imipramine. Relapse and adverse events on both active drugs open up the risks of a prescribing cascade. The previously largely unrecognised hazards of the taper phase have implications for prescribing practice and need further exploration.
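
For anyone who wants to check the arithmetic, the bracketed percentages in the abstract can be recomputed from the counts it reports. This is only a sketch in Python: the helper name `pct` and the dictionary layout are my own assumptions, not anything from the paper, and the relapse percentages are left out because the abstract doesn’t give their denominators.

```python
def pct(numerator, denominator):
    """Return a percentage rounded to the nearest whole number.

    Hypothetical helper for illustration; not part of the paper's analysis.
    """
    return round(100 * numerator / denominator)


# Counts exactly as reported in the abstract above.
enrolled = 275
entered_continuation = {"paroxetine": 49, "imipramine": 39, "placebo": 31}
completed_as_responders = {"paroxetine": 15, "imipramine": 12, "placebo": 12}

# Share of all enrolled patients who entered the continuation phase.
total_entered = sum(entered_continuation.values())
share_entering = pct(total_entered, enrolled)

# Per-arm share of continuation-phase entrants who completed as responders.
completion_rates = {
    arm: pct(completed_as_responders[arm], n_entered)
    for arm, n_entered in entered_continuation.items()
}

if __name__ == "__main__":
    print(f"entered continuation: {total_entered} of {enrolled} [{share_entering}%]")
    for arm, rate in completion_rates.items():
        print(f"{arm}: completed as responders [{rate}%]")
```

The three arm sizes add up to the 119 entrants given in METHODS, and the rounded rates agree with the figures quoted in RESULTS.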
    September 28, 2016 | 11:53 PM

    Did you really classify “abnormal dreams” and “depersonalization” as potential suicidal events?

    September 29, 2016 | 12:59 AM


    I think you’re referring to Table 13. The graphic posted at the end here clarifies that table. The events you mention are what I called “AGITA” or “Agitation” in Healy’s post. The concrete SUICIDAL ACTS and SUICIDAL IDEATION are clearly separated out. The explanation of the “why” of Table 13 is in the second paragraph under 2.7 Analysis of safety data. Obviously, the hypothesis here is that an internal state of agitation lies on a continuum with the more ominous suicidal ideation and behavior, and we were casting a broad net to collect that information. Looking at the graphic, the three things do appear to “run together.” We weren’t trying to pad the numbers. We were intending to explore that association…

    September 29, 2016 | 2:15 AM

    I was cueing off Reviewer #4’s comment (paragraph 5). I should look at paragraph 2 of Analysis of Safety Data. Thanks.

    September 29, 2016 | 8:03 AM

    I had earlier picked that term AGITA based on my own clinical experience, which colors my response to your question. When Prozac® first came out, I prescribed it some, often at the request of the patients [back then, it was written up everywhere]. And I had a few patients return saying “I can’t take that!” describing some kind of agitated state. Actually, I heard about it more from new patients who had been prescribed it elsewhere and had decided to seek psychotherapy. At first, I thought it was a Prozac®-only thing, but I also heard about it with some of the other SSRIs as well.

    It was much later [after retiring] when I saw adolescents and tried SSRIs that I saw it more frequently. Then I had a case with the Full Monty with suicidal/homicidal thoughts. That’s when I went to the literature and found the term Akathisia used in this way. In fact, it was that specific experience that originated this blog’s focus. So my anecdotal self thinks that some kind of AGITA and the more ominous suicidality/homicidality are flavors of the same thing. It may not meet the standards of reviewer #4, but when I see those things, I stop the drug. And it’s part of my do-no-harm spiel when starting an SSRI in a new patient, “if blah-blah happens, stop the medication and call me.”

    I don’t personally think that our graphic “proves” this association, but it’s mighty suggestive. And as I say above, in my eyes, this is a Material-Exploration paper of the only closely observed longer-term dataset of kids on SSRIs. Everything else comes from experience-distant reviews of population-level databases which I don’t find very helpful. And our ending remarks are sincere. We’re wide open to other takes on this data…

    David Healy
    September 29, 2016 | 10:27 AM

    In Table 13 we have tried to capture the full gamut of behavioral activation and disinhibition events. In trial settings there is no exploration of these events. The person on treatment may hint at a disconnection, may hint at altered dreaming. Pretty often if explored these turn out to be dreams of intense violence toward self and others. Among these drugs Cymbalta is particularly noted for this. Hints that may end up being coded as depersonalization will often refer to someone feeling they are considerably more likely to do something violent toward themselves or others and have to inhibit themselves in a much more deliberate or cognitive way than normal.

    What we present in Table 13 is all events that are explicitly suicide-related, homicide-related, or other significant behavioural events, together with all events that should at the time have been explored further – events where we cannot and should not rule out the possibility that these may have been just as significant.

    Our invitation to anyone interested in these issues to go into the data, look closely at it, and see if they can find any other pointers that would help toward a definitive coding still stands. We make no claims to offering the correct or definitive coding of any events. But if it’s not possible to eliminate some of these events, then based on what happens clinically and in clinical trials of these drugs, we should take into account the possibility that all the events included in the table here are significant behavioural events.

    The data of course is also randomized and we have included many events under the heading of placebo that no-one has thought to include. There is though a much greater number of these events on active treatment and the profile of the difference between active treatment and placebo remains pretty much the same whether you use loose or tight criteria, and whether you look at acute or continuation phase and perhaps also taper phase.
