study 329 vii – variable variables?…

Posted on Thursday 17 September 2015

WARNING: Sometimes the devil is in the details. Here, we need to get pretty far into the details…

RATING SCALES


HAM-D Hamilton Depression Rating Scale
K-SADS-L Kiddie Schedule for Affective Disorders – Lifetime
CGI Clinical Global Impressions Scale
AFC Autonomous Function Checklist
SPP Self-Perception Profile
SIP Sickness Impact Profile

It is essential in Clinical Trials to declare the specific outcome variables and the statistical analytic methodology in the a priori protocol [before the study begins]. That’s the only way to ensure that the methodology hasn’t been adjusted or jury-rigged to make the data look a certain way. The standard is to not even allow for the possibility of mistrust. Any changes to the a priori protocol need to be added as an amendment or modification, approved by the certifying agency, i.e., the Institutional Review Board [IRB]. In a recent study of Clinical Trials in Psychiatry [Is Mandatory Prospective Trial Registration Working to Prevent Publication of Unregistered Trials and Selective Outcome Reporting?], only one third of the studies from the five journals with the highest impact rating in Psychiatry abided by this rule, and in the end, only 14% of the total cohort carried out the analysis precisely as described in the a priori protocol – even though it was a requirement for publication in all five journals.

The protocol for Paxil Study 329 was written well before the actual Clinical Trial began and was quite clear in reference to the outcome variables. They are unambiguously declared in the a priori protocol:
Primary Efficacy Variables
• Change in total HAMD score from beginning of treatment phase to the endpoint of the acute phase.
• The proportion of responders at the end of the eight week acute treatment phase. Responders are defined as 50% or greater reduction in the HAM-D or a HAM-D score equal to or less than 8.
Secondary Efficacy Variables
• Change from baseline to endpoint (acute phase) in the depression items of the K-SADS-L, global impressions, autonomic function checklist, self perception profile and sickness impact scale.
• The number of patients who relapse during the maintenance phase.
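The responder definition in the protocol excerpt above is simple enough to write down as a predicate. What follows is only a minimal sketch of that rule as quoted, not the study's actual analysis code. The function names and the example scores are hypothetical, made up purely for illustration.

```python
from typing import Iterable, Tuple


def is_responder(baseline_hamd: float, endpoint_hamd: float) -> bool:
    """Responder rule as quoted from the protocol:
    a 50% or greater reduction in HAM-D, or a HAM-D score of 8 or less.
    """
    if baseline_hamd < 0 or endpoint_hamd < 0:
        raise ValueError("HAM-D scores cannot be negative")
    # "50% or greater reduction" means the endpoint is at most half the baseline.
    halved = endpoint_hamd <= 0.5 * baseline_hamd
    # "equal to or less than 8" at endpoint.
    low_score = endpoint_hamd <= 8
    return halved or low_score


def responder_proportion(scores: Iterable[Tuple[float, float]]) -> float:
    """Proportion of responders among (baseline, endpoint) HAM-D pairs."""
    pairs = list(scores)
    if not pairs:
        raise ValueError("no subjects supplied")
    return sum(is_responder(b, e) for b, e in pairs) / len(pairs)


# Hypothetical (baseline, endpoint) pairs, not data from Study 329.
example = [(20, 10), (20, 11), (30, 8), (12, 9)]
print(responder_proportion(example))
```

Note that both comparisons are inclusive. A subject who lands exactly on 8, or exactly on half the baseline score, counts as a responder under the protocol wording, so swapping in strict inequalities would change who qualifies.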
Likewise, the requirements for changing the protocol are explicit:
PROTOCOL AMENDMENTS
No changes to the study protocol will be allowed unless discussed in detail with the SmithKline Beecham (SB) Medical Monitor and filed as an amendment/modification to this protocol. Any amendment/modification to the protocol will be adhered to by the participating centre (or all participating centres) and will apply to all subjects following approval as appropriate by the Ethical Review Committee or Institutional Review Board.
Yet in the published 2001 paper, we find that some of the outcome variables have been changed [in the Efficacy and Safety Evaluation Section]:
"The protocol described two primary outcome measures: (1) response, which was defined as a HAM-D score of <8 or a >50% reduction in baseline HAM-D score at the end of treatment; and (2) change from baseline in HAM-D total score. Five other depression-related variables were declared a priori: (1) change in the depressed mood item of the HAM-D; (2) change in the depression item of the K-SADS-L; (3) Clinical Global Impression (CGI) improvement scores of 1 (very much improved) or 2 (much improved); (4) change in the nine-item depression subscale of the K-SADS-L; and (5) mean CGI improvement scores."
Notice that the two primary outcome measures are identified as protocol described, whereas the remaining five are not labeled as coming from the protocol, nor are they called secondary outcome variables. Instead, they’re identified as depression-related variables and declared a priori. Then later in the article, there’s yet a different version in the Efficacy Results section text and Table 2 [page 766], adding in response:
Of the depression-related variables, paroxetine separated statistically from placebo at endpoint among four of the parameters: response (i.e., primary outcome measure), HAM-D depressed mood item, K-SADS-L depressed mood item, and CGI score of 1 (very much improved) or 2 (much improved) and trended toward statistical significance on two measures (K-SADS-L nine-item depression subscore and mean CGI score).

[Image: non-protocol variables in blue]

I’ve colored the non-protocol variables blue. Those differences are the very devil in the details, and they go unexplained in the article itself. In fact, without having the protocol in hand, one would not likely notice them [and very few did notice them]. Here’s a summary of the various versions:

This discrepancy is what caught the attention of a few early nay-sayers about this article. All four of the significant variables in Table 2 are nowhere mentioned in the protocol, so the designation, a priori, doesn’t really make a bit of sense. For that matter, why use the term depression-related rather than Secondary? And where did response [HAM-D <8] come from? In order to explain these changes, you have to have the 500+ page CSR [Clinical Study Report] available and a keen eye. And recall that the CSR only became available as part of the 2004 settlement between GSK and the State of New York.

So far, only questions, but some answers are just around the corner…
  1.  
    September 17, 2015 | 9:33 AM
     

    Mikey, if it’s not too late, I just want to thank you and the rest of the RIAT team for setting the record straight.

    Revisiting this whole sordid affair, for me at least, made me incredibly sad at the needless loss of life this one drug has caused.

    You and the rest of the RIAT team should take a bow.

  2.  
    September 17, 2015 | 9:59 AM
     

    Thanks Fid,

    You’ve been a valued resource all along the way. We need to give the BMJ credit too for being willing to publish it…

  3.  
    September 17, 2015 | 10:11 AM
     

    Yeh, I gave them credit in my latest blog post.

  4.  
    James O'Brien, M.D.
    September 17, 2015 | 10:43 AM
     

    Excellent analysis and we appreciate the effort involved. One down. Now onto the “childhood bipolar” studies…

  5.  
    Mark Hochhauser
    September 17, 2015 | 10:48 AM
     

    Their use of “trending” toward significance is not statistically supported.

    https://mchankins.wordpress.com/2013/04/21/still-not-significant-2/

    Non-significant results do not trend, and even if they did, they could be trending away from significance.
