a road to nowhere…

Posted on Monday 12 August 2013


by Robert D. Gibbons, David J. Weiss, Paul A. Pilkonis, Ellen Frank, Tara Moore, Jong Bae Kim, and David J. Kupfer.
American Journal of Psychiatry. Published online August 9, 2013

Objective: The authors developed a computerized adaptive test for anxiety that decreases patient and clinician burden and increases measurement precision.
Method: A total of 1,614 individuals with and without generalized anxiety disorder from a psychiatric clinic and community mental health center were recruited. The focus of the present study was the development of the Computerized Adaptive Testing–Anxiety Inventory [CAT-ANX]. The Structured Clinical Interview for DSM-IV was used to obtain diagnostic classifications of generalized anxiety disorder and major depressive disorder.
Results: An average of 12 items per subject was required to achieve a 0.3 standard error in the anxiety severity estimate and maintain a correlation of 0.94 with the total 431-item test score. CAT-ANX scores were strongly related to the probability of a generalized anxiety disorder diagnosis. Using both the Computerized Adaptive Testing–Depression Inventory and the CAT-ANX, comorbid major depressive disorder and generalized anxiety disorder can be accurately predicted.
Conclusions: Traditional measurement fixes the number of items but allows measurement uncertainty to vary. Computerized adaptive testing fixes measurement uncertainty and allows the number and content of items to vary, leading to a dramatic decrease in the number of items required for a fixed level of measurement uncertainty. Potential applications for inexpensive, efficient, and accurate screening of anxiety in primary care settings, clinical trials, psychiatric epidemiology, molecular genetics, children, and other cultures are discussed.
The Computerized Adaptive Diagnostic Test for Major Depressive Disorder [CAD-MDD]:
A Screening Tool for Depression
by Robert D. Gibbons, Giles Hooker, Matthew D. Finkelman, David J. Weiss, Paul A. Pilkonis, Ellen Frank, Tara Moore, and David J. Kupfer.
Journal of Clinical Psychiatry. 2013 74[7]:669–674.

Objective: To develop a computerized adaptive diagnostic screening tool for depression that decreases patient and clinician burden and increases sensitivity and specificity for clinician-based DSM-IV diagnosis of major depressive disorder [MDD].
Method: 656 individuals with and without minor and major depression were recruited from a psychiatric clinic and a community mental health center and through public announcements [controls without depression]. The focus of the study was the development of the Computerized Adaptive Diagnostic Test for Major Depressive Disorder [CAD-MDD] diagnostic screening tool based on a decision-theoretical approach [random forests and decision trees]. The item bank consisted of 88 depression scale items drawn from 73 depression measures. Sensitivity and specificity for predicting clinician-based Structured Clinical Interview for DSM-IV Axis I Disorders diagnoses of MDD were the primary outcomes. Diagnostic screening accuracy was then compared to that of the Patient Health Questionnaire-9 [PHQ-9].
Results: An average of 4 items per participant was required [maximum of 6 items]. Overall sensitivity and specificity were 0.95 and 0.87, respectively. For the PHQ-9, sensitivity was 0.70 and specificity was 0.91.
Conclusions: High sensitivity and reasonable specificity for a clinician-based DSM-IV diagnosis of depression can be obtained using an average of 4 adaptively administered self-report items in less than 1 minute. Relative to the currently used PHQ-9, the CAD-MDD dramatically increased sensitivity while maintaining similar specificity. As such, the CAD-MDD will identify more true positives [lower false-negative rate] than the PHQ-9 using half the number of items. Inexpensive [relative to clinical assessment], efficient, and accurate screening of depression in the settings of primary care, psychiatric epidemiology, molecular genetics, and global health are all direct applications of the current system.
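The sensitivity and specificity figures quoted above follow directly from a screening tool's confusion matrix. As a quick reference sketch [the counts below are illustrative only; the paper reports rates, not this table]:

```python
def sens_spec(tp, fn, tn, fp):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts chosen to reproduce the reported rates
sens, spec = sens_spec(tp=95, fn=5, tn=87, fp=13)
print(sens, spec)  # → 0.95 0.87
```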
by Gibbons RD, Weiss DJ, Pilkonis PA, Frank E, Moore T, Kim JB, and Kupfer DJ.
Archives of General Psychiatry. 2012 69[11]:1104–1112.

CONTEXT Unlike other areas of medicine, psychiatry is almost entirely dependent on patient report to assess the presence and severity of disease; therefore, it is particularly crucial that we find both more accurate and efficient means of obtaining that report.
OBJECTIVE To develop a computerized adaptive test [CAT] for depression, called the Computerized Adaptive Test-Depression Inventory [CAT-DI], that decreases patient and clinician burden and increases measurement precision.
MAIN OUTCOME MEASURES The focus of this study was the development of the CAT-DI. The 24-item Hamilton Rating Scale for Depression, Patient Health Questionnaire 9, and the Center for Epidemiologic Studies Depression Scale were used to study the convergent validity of the new measure, and the Structured Clinical Interview for DSM-IV was used to obtain diagnostic classifications of minor and major depressive disorder.
RESULTS A mean of 12 items per study participant was required to achieve a 0.3 SE in the depression severity estimate and maintain a correlation of r = 0.95 with the total 389-item test score. Using empirically derived thresholds based on a mixture of normal distributions, we found a sensitivity of 0.92 and a specificity of 0.88 for the classification of major depressive disorder in a sample consisting of depressed patients and healthy controls. Correlations on the order of r = 0.8 were found with the other clinician and self-rating scale scores. The CAT-DI provided excellent discrimination throughout the entire depressive severity continuum [minor and major depression], whereas the traditional scales did so primarily at the extremes [eg, major depression].
CONCLUSIONS Traditional measurement fixes the number of items administered and allows measurement uncertainty to vary. In contrast, a CAT fixes measurement uncertainty and allows the number of items to vary. The result is a significant reduction in the number of items needed to measure depression and increased precision of measurement.
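The adaptive logic these abstracts describe, administering items one at a time until the standard error of the severity estimate drops below a fixed threshold, can be sketched in miniature. This is emphatically not the authors' actual model: the two-parameter item bank, the item-selection rule, and the crude estimate update below are all placeholder assumptions, shown only to make the fixed-uncertainty/variable-length idea concrete:

```python
import math
import random

# Hypothetical 2PL item bank: (discrimination a, difficulty b).
# The real CAT-DI bank has hundreds of calibrated items; these are made up.
random.seed(0)
bank = [(random.uniform(0.8, 2.5), random.uniform(-2.0, 2.0)) for _ in range(200)]

def prob(a, b, theta):
    """Probability of endorsing a 2PL item at severity theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info(a, b, theta):
    """Fisher information of a 2PL item at theta."""
    p = prob(a, b, theta)
    return a * a * p * (1.0 - p)

def adaptive_test(true_theta, se_target=0.3, max_items=30):
    """Give the most informative remaining item until SE(theta) <= target."""
    theta, asked, total_info, n = 0.0, set(), 0.0, 0
    se = float("inf")
    while n < max_items and se > se_target:
        # pick the unused item with maximum information at the current estimate
        i = max((j for j in range(len(bank)) if j not in asked),
                key=lambda j: info(bank[j][0], bank[j][1], theta))
        asked.add(i)
        a, b = bank[i]
        # simulate the respondent's answer at their true severity
        response = 1 if random.random() < prob(a, b, true_theta) else 0
        total_info += info(a, b, theta)
        # crude stochastic-approximation update (stand-in for MLE/EAP scoring)
        theta += a * (response - prob(a, b, theta)) / total_info
        n += 1
        se = 1.0 / math.sqrt(total_info)
    return n, theta, se

n_items, estimate, se = adaptive_test(true_theta=1.0)
print(n_items, round(estimate, 2), round(se, 2))
```

The point of the sketch is the stopping rule: the test length varies per person, but everyone exits with roughly the same measurement uncertainty, which is the inversion of fixed-length testing described in the conclusions above.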
Ever since Dr. Gibbons published his two-part meta-analysis opposing the black box warning on antidepressants, I’ve followed his publications with skepticism. As the title of my series [an anatomy of a deceit 1…] says, I thought those publications were deliberately obfuscated and questionable. Similarly, I had plenty of complaints about his next paper in November last year on a computerized psychometric for depression [really?…]. Since then, two more have appeared. One complaint is that this is obviously a commercial product [see their company’s web site, Adaptive Testing Technologies]. So I’m not keen on the fact that its development was funded by the NIMH, or that these high-impact journals are publishing what are essentially advertisements for their new product. Nor do I particularly like the idea of using such an instrument in doctors’ waiting rooms [or anywhere else] as a tool to troll for patients to medicate. But my further complaints go beyond the ones mentioned so far.

People in physicians’ waiting rooms, particularly new patients, have reason to be either depressed or anxious. They’ve noticed something wrong, a symptom that’s frightened them enough to send them to a doctor. I worry that using some instrument to identify anxiety or depression prior to their visit creates a vulnerability. It identifies them as "mental" and may lead a busy physician to jump to conclusions, assuming the questionnaire/computer data points to a mental disorder rather than the expected emotional turmoil from a frightening symptom. We’ve all seen mental patients’ physical illnesses misdiagnosed repeatedly, and this kind of screening could create just such a problem. If a physician can’t identify clinically significant anxiety or depression without such an instrument, I think the proper course of action is for that doctor to take a refresher course in patient evaluation.

It’s not lost on me that the two diagnoses being evaluated here are Major Depressive Disorder and Generalized Anxiety Disorder, both of which flunked the DSM-5 Field Trials with kappas of 0.32 and 0.20, respectively. Dr. Kupfer, co-chair of the DSM-5 Task Force, is an author on all three papers and surely knows the Field Trial results. In the discussion of the Field Trial outcome, they explained this miserable showing away with:
Conditions that did not do well included major depressive disorder [MDD], in adults and in children, and general anxiety disorder [GAD]. According to Darrel Regier, MD, vice-chair of the DSM-5 task force, the poor scores for MDD may be attributable to "co-travelers," such as PTSD, major cognitive disorder, or even a substance use disorder, which often occur concurrently with depression. "Patients often don’t come in a single, simple diagnosis in clinical practice," Dr. Regier told Medscape Medical News.
It’s a little hard to generate any excitement for Dr. Gibbons’ correlations between his CAT-DI and CAT-ANX in the face of those results. Did they think we would forget the Field Trials? But I have to admit that my negative reaction to these instruments is largely visceral, much like Dr. Nussbaum’s comments about physicians treating surrogates in the last post – looking at the computer rather than the person. The scope and importance of a diagnostic interview is so much greater than a search for brevity or precision. It’s a getting-to-know-you step that ought not be skipped. So, for me, these tests are just a further step down a road to nowhere leading away from the person looking for help…
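For readers unfamiliar with the statistic: kappa is chance-corrected agreement between raters, of which the classic two-rater form is Cohen’s kappa [the Field Trials used a test-retest variant, but the idea is the same]. A quick sketch, with made-up ratings:

```python
def cohens_kappa(r1, r2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    assert len(r1) == len(r2)
    n = len(r1)
    cats = set(r1) | set(r2)
    po = sum(a == b for a, b in zip(r1, r2)) / n          # observed agreement
    pe = sum((r1.count(c) / n) * (r2.count(c) / n)        # agreement expected
             for c in cats)                               # by chance alone
    return (po - pe) / (1 - pe)

# Two hypothetical clinicians diagnosing 10 patients [1 = MDD, 0 = no MDD]
rater1 = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]
rater2 = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
print(round(cohens_kappa(rater1, rater2), 2))  # → 0.4
```

A kappa of 1.0 is perfect agreement and 0 is no better than chance; values in the 0.2–0.3 range, like those the Field Trials reported for MDD and GAD, mean two clinicians looking at the same patients barely agree beyond chance.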
  1.  
    Arby
    August 12, 2013 | 9:08 AM
     

It appears the stage is being set for this. An article regarding global mental health check-ups was recently posted here: http://www.psychiatrictimes.com/blogs/time-mental-health-check-ups [note: site registration is required to view it].

My concerns with it dovetail with some of the ones expressed here, and enough so that I commented on it (under the name of Ruth). I was pleased they didn’t delete my comments and that the author responded to my concerns with compassion, even if in a decidedly politically correct manner.

  2.  
    Steve Lucas
    August 12, 2013 | 9:15 AM
     

… and just think, with our new EMRs these results can stay with us our entire lives.

    Steve Lucas

  3.  
    Arby
    August 12, 2013 | 9:24 AM
     

    Steve,
When I was relatively young and healthy, I was enamored of the convenience of EMRs after I worked on a project to implement a pharmacy order entry system tied into one. Yet now, for exactly the reason you said, I find them frightening.

  4.  
    Bernard Carroll
    August 12, 2013 | 9:54 AM
     

Well said, Dr. Mickey. Aside from the commercial self-interest of the developers (about which they were not candid in the CAT-DI paper last year, in violation of the journal’s disclosure policy), the chief reservation here is that these computerized instruments provide no clinical gain. I pointed that out in a letter to the journal last month and the authors responded with an unprofessional attempt to smear my motivation. In a second violation of the journal’s disclosure policy, they did not reveal the existence of their corporation while attempting to smear me. This attack was a smokescreen for their failure to respond to my substantive criticisms of their work.

    The essential flim-flam in these publications is the hand waving about speed and precision of assessment. The choice is not between just a few items with this method and hundreds of items as the alternative, but they go out of their way to give that impression for CAT-DI. If I were a patient I would be leery of a computer-administered “diagnostic screening” questionnaire that claims to label me as having major depression with high accuracy before any clinician has even spoken with me. Patients are not widgets to be sorted, boxed, labeled, and shipped with maximum speed and efficiency. Here is where the technological impulse in medicine loses its clinical grounding. Even worse, the CAT-generated diagnostic label is now in my electronic health record and will shadow me into the future, with no check on its validity.

  5.  
    Tom
    August 12, 2013 | 10:01 AM
     

I am not sure what kind of a psychologist Gibbons is. I hope he is not a clinical psychologist. In Psychological Testing 101, it is taught that tests do not make diagnoses – clinicians do. Tests and scales can provide useful collateral information only if the results are integrated into a comprehensive diagnostic interview/assessment which should include a thorough analysis of contextual (medical and psychosocial) circumstances and factors bearing on the individual’s distress. Used alone, this computerized depression and anxiety stuff is crap – and malpractice.

  6.  
    wiley
    August 12, 2013 | 10:07 AM
     

It is a cultural problem in the U.S. – thinking technocratically – that affects everything from the structure of public schools, to the decision making of judges and lawyers, to medicine.

With direct-to-consumer advertising, the colonization of minds with the bio-bio-bio hypothesis*, the ubiquity of simplistic checklists like the ones Gibbons peddles, and too many patients believing that their suffering is a disease that can be fixed with medication; psychiatrists who want to rack ’em up and collect with the least amount of effort, and without having to listen to the patient, ask questions, and think very carefully, are in the catbird seat. This system of diminution is a gold mine that doesn’t require much in the way of the emotional investment that is empathy, or billable time.

    Perhaps finding out what is making a patient anxious might bring enough truth to light to make a clinician worry about their own reality? Be emotionally disturbing? Or, perhaps, a large number of psychiatrists are well-off before they go to med school, and feel so comfortable, secure, and entitled that they simply don’t consider the problems of the poor and struggling middle class to be real or relevant to mental health and emotional stability. That “excessive bitterness” was considered as a new disorder after the biggest financial disaster of all time robbed people of their homes and pensions; and caused unemployment, underemployment, and severe job insecurity is evidence of some serious class division.

    Question: What is the difference between anxiety, fear, and agitation? How would a smart and empathetic clinician tell the difference? (Assuming that there is a distinguishable difference.)

    * I won’t even call it a “theory” because science requires much more to constitute a theory.

  7.  
    August 12, 2013 | 10:18 AM
     

One additional problem with handing a patient a piece of paper in the waiting room to obtain a diagnosis is that the purpose of the physician/patient interaction is not just diagnostic. It’s not just about determining the best mode of treatment. It’s also about establishing a relationship between doctor and patient – an alliance. This is true even in primary care.
What’s really concerning is the possibility, no, likelihood, that these instruments will be used by psychiatrists, and not just by primary care docs, who at least have the excuse of needing to treat everything that’s going on with the patient in 10 minutes.
    What would make more sense, and be less expensive than paying a high-profile shrink $500 for a 10 minute consult, would be an instrument to assess the patient, and a computer to print out the appropriate prescription. Why not cut out the middleman?

  8.  
    August 12, 2013 | 12:11 PM
     

    I’ve got a better idea, and even though the analogy is in a joke, it will happen with the way people pursue technology like it is oxygen and glucose:

    the joke:
    “A man complains to his friend “My elbow hurts — I’d better go to the doctor.”

    “Don’t do that,” his friend volunteers. “There’s a new computer at the drug store that can diagnose any problem quicker and cheaper than a doctor. All you have to do is put in a urine sample, deposit $10, and the computer will give you your diagnosis and a plan of treatment.”

The man figures he has nothing to lose, so he goes down to the drug store. Finding the machine, he pours in the urine and deposits $10. The machine begins to buzz and various lights flash on and off. After a short pause, a slip of paper pops out, which reads:

    You have tennis elbow. Soak your arm in warm water twice a day. Avoid heavy labor. Your elbow will be better in two weeks.

    That evening, after some contemplation, the man begins to suspect fraud and decides to test the machine. He mixes together some tap water, a stool sample from his dog, and urine samples from his wife and teenage daughter. To top it all off, he masturbates into the jar.

    He takes this concoction down to the drug store, pours it into the machine, and deposits $10. The machine goes through the same process, buzzing and flashing before finally printing out the following message:

    Your tap water has lead.
    Get a filter.
    Your dog has worms.
    Give him vitamins.
    Your daughter is on drugs.
    Get her in rehab.
    Your wife is pregnant.
    It’s not your baby — get a lawyer.
    And if you don’t stop jerking off, your tennis elbow will never get better.”

    So, let these computer programs tell you who is anxious, depressed, psychotic, PTSD, ADD, hell, they’ll even be able to determine if there are personality disorder factors too. And then the treatment will be spit out after the sheet telling you your diagnosis.

    Except the sheet will be more polite than the real comment should be: “You are f—-d!” As the pills will be the exclamation point to the experience.

    Enjoy!

  9.  
    a-non
    August 12, 2013 | 4:37 PM
     

    “Your tap water has lead.
    Get a filter.
    Your dog has worms.
    Give him vitamins.
    Your daughter is on drugs.
    Get her in rehab.
    Your wife is pregnant.
    It’s not your baby — get a lawyer.
    And if you don’t stop jerking off, your tennis elbow will never get better.”

    Joel Hassman, MD

    “Dave, it would appear that you’ve made some poor lifestyle choices. Would you like a prescription for antidepressants?”
    https://en.wikipedia.org/wiki/File:HAL9000.svg

  10.  
    TinCanRobot
    August 12, 2013 | 11:54 PM
     

I looked through the 2 papers that were linked; I noticed neither reported a confidence level. So I guess they didn’t really perform statistical analysis, but just made some math up?

There’s one good reason such a test would simply never work; frankly, it’s the same problem as with everything else.

Statistical analysis is a field for determining how likely it is that a sample did not occur by pure chance. It’s part of the cornerstone of the scientific method: we can’t prove ‘truth’ or otherwise; all we can really do is show with what degree of confidence our observations didn’t merely happen by chance.

The scientific method requires us to reach a 98% Confidence Level, which is 3 standard deviations. Not surprisingly, the types of subjective observations used in psychiatry cannot exceed 2 standard deviations (95% CL). Can you imagine having a 5% chance that everything merely happened by chance? Thankfully a 95% confidence level is considered unacceptable in the scientific method.

Nonetheless, I’m sure if they think they can make money they’ll apply it somewhere regardless.

Mickey, if you haven’t already posted one, could you perhaps explain what this “Kappa” thing used in the DSM Field Trials is? I couldn’t find an explanation for it anywhere, and when I tried to describe it to someone better versed in statistical analysis than I, eyebrows went up.

    I really enjoy reading this blog, it has a lot of interesting stuff!
