the appearance of conflict of interest…

Posted on Saturday 9 April 2011

I’m still on STAR*D, and I put this first part in a box for two reasons – it’s convoluted, in spite of my attempts to make it clear. You might even skip it if you have an aversion to initials and technical talk. My second reason is so you can find it when you get to the end and decide you want to look it over again [or for the first time]:

In looking at the STAR*D study [recalculating…], there was a very confusing part about changing the primary outcome measure along the way – from the standard HRSD [Hamilton Rating Scale for Depression] to the QIDS-SR [Quick Inventory of Depressive Symptomatology–Self-Report]. If you don’t remember it, you might want to look back at my post about it. It turned out that the change was never approved by the NIMH DSMB [Data Safety and Monitoring Board]. And this is not just some picky point, because the new outcome measure was not blinded during the study and was available to the treating clinicians along the way. What made it worse, in the paper, the authors said, "the QIDS-SR was not used to make treatment decisions, which minimizes the potential for clinician bias." I claimed that was a lie. But looking back at the exact wording in the paper,
    The clinical research coordinators also completed the 16-item, clinician-rated Quick Inventory of Depressive Symptomatology [QIDS-C16] at each clinic visit to assess symptoms over the prior week… Patients also completed the 16-item Quick Inventory of Depressive Symptomatology-Self-Report [QIDS-SR16] and the Frequency, Intensity, and Burden of Side Effects Rating at each clinic visit.
maybe they’ve got me on a technicality. The QIDS-C and the QIDS-SR are identical. Maybe they are trying to sell us that the clinician-rated version was open, and the self-report was blinded. Since they’re identical and were done in the same visit, their statement is certainly not the truth. And then they used the QIDS-IVR [Quick Inventory of Depressive Symptomatology–Interactive Voice Response], which is a call-in telephone system version, to follow subjects for a year to detect relapses. Here’s how they put it:
    We used the Quick Inventory of Depressive Symptomatology–Self-Report (QIDS-SR) as the primary measure to define outcomes for acute and follow-up phases because:
      1. QIDS-SR ratings were available for all participants at each acute treatment clinic visit
      2. QIDS-SR and HRSD outcomes are highly related
      3. the QIDS-SR was not used to make treatment decisions, which minimizes the potential for clinician bias
      4. the QIDS-SR scores obtained from the interactive voice response system, the main follow-up outcome measure, and the paper-and-pencil QIDS-SR16 are virtually interchangeable, which allows us to use a similar metric to summarize the acute and follow-up phase results.
    Response was defined as at least a 50% reduction from treatment step entry in QIDS-SR16 score. Remission was defined as a QIDS-SR16 score ≤5 [corresponding to an HRSD17 score of ≤7]. Relapse was declared when the QIDS-SR16 score collected by the interactive voice response system during the followup phase was ≥11 [corresponding to an HRSD17 ≥14].

None of this makes one iota of sense. Why change from a tried and true standard like the HRSD [Hamilton Rating Scale for Depression] to the QIDS-SR [Quick Inventory of Depressive Symptomatology–Self-Report] which none of us have ever heard of? Why say that the change was approved by the NIMH DSMB when it wasn’t? Why would you count on a telephone call-in system as your main follow-up measure? It sure didn’t work – in fact it may have made the outcome virtually unusable:

How many relapsed and how many dropped out? Who knows? I can’t figure it out or find anyone else who can. This table by Pigott et al is as good as it gets. But it suggests that the call-in system didn’t exactly pull in the data. These numbers either indict the treatment or the call-in data collection [or both].

So far, we’ve got a confusing box and an inconclusive table. It seems only fitting to follow that up with unprovable suspicions.

The QIDS-SR was introduced in 2003 by Rush, Trivedi, et al [accepted for publication Nov 2002]. There are nine articles in PubMed with "QIDS" in the title [six from their group], all validating the scale. So it seems to have been developed for or around the time of STAR*D. I can only find one study comparing the QIDS-C, QIDS-SR, QIDS-IVR, and the HRSD, written by the STAR*D group and published a few months before the STAR*D report in 2006. The correlation looks okay to me:

When I was looking over Dr. Pigott’s studies, he and I had some correspondence, and among the things he sent was a piece of his sleuthing he didn’t publish because it was conjecture. I thought it was worth saying and with his permission, I’ll add it here.

Dr. Rush was closely involved with developing both the IDS [Inventory of Depressive Symptomatology] and the QIDS [Quick Inventory of Depressive Symptomatology]. Dr. Pigott noticed something about their availability [from]:

    The 30 item Inventory of Depressive Symptomatology (IDS) (Rush et al. 1986, 1996) and the 16 item Quick Inventory of Depressive Symptomatology (QIDS) (Rush et al. 2003) are designed to assess the severity of depressive symptoms. Both the IDS and the QIDS are available in the clinician (IDS-C30 and QIDS-C16) and self-rated versions (IDS-SR30 and QIDS-SR16). The IDS and QIDS assess all the criterion symptom domains designated by the American Psychiatry Association Diagnostic and Statistical Manual of Mental Disorders – 4th edition (DSM-IV) (APA 1994) to diagnose a major depressive episode…

    Current translations of the pencil and paper versions of the IDS and QIDS are available at no cost to clinicians and researchers. Copies may be downloaded from this site and used without permission. The IDS and QIDS are available in an automated telephone-administered format (IVR) exclusively licensed to Health Technology Systems. Those wishing to consider the IVR versions or other electronic versions should contact: Healthcare Technology Systems, Inc.
And Healthcare Technology Systems was in fact the provider of STAR*D’s telephonic IVR system, which was used to capture 8 of its 11 pre-specified research outcome measures, including the IVR-administered version of the QIDS. HTS is mentioned all through the STAR*D procedures manual. And if we look at the disclosures at the end of the STAR*D article, we see:
    Dr. Rush has served as an advisor, consultant, or speaker for or received research support from … Healthcare Technology Systems, Inc…
    He has equity holdings in Pfizer Inc and receives royalty/patent income from Guilford Publications and Healthcare Technology Systems, Inc.
Dr. Pigott also found that in another STAR*D paper [New England Journal of Medicine. 2006 Mar 23;354(12):1231-42.], Dr. Rush reported that he also received consulting fees from and had served on Healthcare Technology Systems’ advisory board.

I guess I can see why Dr. Pigott didn’t include any of this in his publications or communications. It’s all suggestive but circumstantial, and he’s being careful not to say anything he can’t prove. I started to pass over it myself, but then I thought about what conflict of interest is really supposed to mean – anything that gives the appearance of bias. The standards for a scientist or a physician are different from those of a criminal court, or at least they ought to be.

They apparently decided to use the QIDS as the primary outcome measure late in the game. The definitive study comparing the QIDS-C, QIDS-SR, QIDS-IVR, and the HRSD wasn’t even done until the STAR*D data was available. How do we know that, other than its publication date being only a few months before the study itself? We know it because the study data itself was used to derive those correlations shown above.

But we don’t really know if the decision to drop the IDS-C or to not use the HRSD and substitute the not so widely known QIDS[s] had to do with the fact that Dr. Rush developed the scales, or because they made the data look better [which they apparently did], or because they had so many drop-outs that they were scrambling for numbers, or who-knows-what. And we don’t know if it had something to do with Dr. Rush’s connections with HTS, or if the royalty/payments were related to the QIDS-IVR licensing [which would be a hard-core financial conflict of interest], or why they’ve never published the before and after HRSD or IDS at the different levels like they said they would. And we don’t have any way to know if using the QIDS-IVR was connected to their perseveration about measurement-based care
    ‘Finally, high quality of care was delivered (measurement-based care) with additional support from the clinical research coordinator. Consequently, the outcomes in this report may exceed those that are presently obtained in daily practice wherein neither symptoms nor side-effects are consistently measured and wherein practitioners vary greatly in the timing and level of dosing’
representing some fantasy that practitioners would be using the QIDS-IVR in their offices to follow their patients’ responses.

I can’t prove that any of these possibilities are true. But that’s the whole point of scientific research. I shouldn’t even be in a position of wondering about any of them. That’s why we have the standard we have – nothing that gives the appearance of bias. And by that standard, the STAR*D paper should never have even been published.

Update: If you question who created the QIDS, check it out on the Southwestern web site. Take it yourself, read the disclaimers, and look at the Copyright near the bottom of the page.
    Bernard Carroll
    April 9, 2011 | 8:45 PM

    In the annals of clinical trials for depression there is no precedent for relying on patients’ self report as the primary outcome measure. Yet that is what NIMH allowed to happen here with STAR*D when they decided to run with the QIDS-SR16. What were they thinking – especially in a trial of this importance? I and others have long argued for more attention to patient self reports in clinical trials, but clinician rated measures remain essential.

    Moreover, in at least one of the early publications that aims to make a case for the validity of the QIDS-SR16 instrument, the QIDS-SR16 scores were simply extracted from the longer IDS-SR30 ratings. In other words, the QIDS-SR16 was not evaluated as an independent instrument. This sort of short-cutting sends shivers down the back of any self respecting psychometrician. [See MH Trivedi et al Psychological Medicine 2004; 34: 73-82.]

    Nancy Wilson
    April 9, 2011 | 11:59 PM

    On March 1, 2011, the Texas Department of State Health Services (DSHS) replaced TIMA with the DSHS Psychotropic Treatment Recommendations. The new recommendations included “…Stop requiring rating scales for the psychotropic treatment recommendations. These scales will remain part of the Uniform Assessment…”

    DSHS also replaced the TMAP algorithms for MDD with APA’s guidelines. See

    April 10, 2011 | 7:57 PM

    Thanks Nancy.

    And as for the QIDS, they’re less enthusiastic on the Southwestern website [see Update]:

    Keep in mind that your depression rating does not represent a formal diagnosis of depression. Instead, your rating indicates that you have some of the common symptoms associated with depression and, therefore, may have the illness. If you have answered all the questions as honestly as possible and you feel that the results of the test are accurate, you should consult a health care professional to obtain a formal diagnosis of depression if so indicated.

    Note: The above cutoff points are based largely on clinical judgment rather than on empirical data.

    Copyright 2000. A. John Rush, M.D. Quick Inventory of Depressive Symptomatology (Self-Report) (QIDS-SR). Used with permission.

    Nancy Wilson
    April 10, 2011 | 11:48 PM

    Mickey, I noted as well that the Southwesterners remain optimistic about the success rate of current treatments, according to their website: “Despite the recurrent nature of depression, depression is a highly treatable illness. The available therapeutic options are successful in 60-80% of all patients, so patients who receive treatment have a good chance of achieving remission.” See
