what price, reliability?…

Posted on Monday 21 May 2012

    An Aside: I’m starting this with a bias. In 2012, I think the diagnosis Major Depressive Disorder used in the 1980 DSM-III and since has been a maelstrom – an egregious error with multiple negative consequences. So my looking back on its roots is not just a casual jaunt into history. It’s more in the vein of "how did this happen?"
I thought the journey meant going back to Spitzer’s 1978 paper on the RDC [Research Diagnostic Criteria: Rationale and Reliability] that came out not too long before the DSM-III went to press. But we really need to include another article [A Diagnostic Interview: The Schedule for Affective Disorders and Schizophrenia] published around the same time. At the end of his 1974 article demonstrating the unreliability of the criteria of the hour [A Re-analysis of the Reliability of Psychiatric Diagnosis], he said:
Several investigators have developed structured interview schedules which an interviewer uses in his examination of the patient … These techniques provide for a standardized sequence of topics, and ensure that variability among clinicians in how they conduct their interviews and what topics they cover is kept to a minimum…

With respect to improving the nomenclature, the St. Louis group has offered a system limited to 16 diagnoses for which they believe strong validity evidence exists, and for which specified requirements are provided. Whereas in the standard system the clinician determines to which of the various diagnostic stereotypes his patient is closest, in the St. Louis system the clinician determines whether his patient satisfies explicit criteria…

So his plan had two prongs – a standardized data-gathering system [the structured interview], and a criteria-driven diagnostic system [Feighner Criteria, RDC, DSM-III]. The structured interview he was referring to was one he, himself, was working on – the SADS [Schedule for Affective Disorders and Schizophrenia]. The point of the scheduled interview is to control the interviewer’s style as a variable. It’s an interesting article, but I’ll mention only one point: of the eight major scales they scored, three were focused on depression, and one of those was specifically keyed to Endogenous Depression:
A Diagnostic Interview: The Schedule for Affective Disorders and Schizophrenia
by Jean Endicott, PhD and Robert L. Spitzer, MD
Archives of General Psychiatry. 1978 35:837-844.
[full text on-line]


There are a number of items in the current section of the SADS that are descriptive of specific dimensions of psychopathology. An initial scoring system has been developed through the assignment of items to eight larger summary dimensions. The item assignment was based on knowledge of factor analytic work of scales with similar item content, consideration of the major clinical distinctions usually made in research studies of affective and schizophrenic disorders, and a desire to have a smaller number of clinically meaningful measures on which to compare individual subjects over time or groups of subjects with each other. The names of the summary scales and the item contents are shown in Table 2. The two syndromal scales, which have considerable overlap, were developed to describe features of the depressive syndrome that are frequently associated with depressive mood. The first, Endogenous Features, is limited to those items descriptive of symptoms that have traditionally been considered characteristic of an "endogenous" depressive episode. The other scale, Depressive Associated Features, is broader and includes both the "endogenous" items and some additional items generally assumed to be part of the depressive syndrome…

There was a lot of overlap between Endogenous features and Depressive-associated features:
Which is no big surprise since the latter contain the former ["Depressive Associated Features, is broader and includes both the 'endogenous' items and some additional items"]. So, we’re far enough down the page for a reminder of why any of this ancient history matters. Somewhere around this time a decision was brewing to lump a bunch of syndromes of depressive illness into one all-encompassing category – Major Depressive Disorder – a decision that has endured for 32 years and, if there is ever a DSM-5 under the current leadership, will be there even longer. So what I’m looking for is why the distinct syndrome we called Endogenous Depression [or other synonyms people thought were more palatable] disappeared. Here it is prominently displayed in Dr. Spitzer’s SADS Structured Interview. It had fuzzy borders with another SADS category by the way things were defined. What we’re chasing here is Where did Endogenous Depression go? So we move to Spitzer’s other pre-DSM-III article about the Research Diagnostic Criteria, the precursor of the DSM-III.
Research Diagnostic Criteria: Rationale and Reliability
by Robert L. Spitzer, MD; Jean Endicott, PhD; and Eli Robins, MD
Archives of General Psychiatry. 1978 35:773-792.
[full text on-line]


A crucial problem in psychiatry, affecting clinical work as well as research, is the generally low reliability of current psychiatric diagnostic procedures. This article describes the development and initial reliability studies of a set of specific diagnostic criteria for a selected group of functional psychiatric disorders, the Research Diagnostic Criteria (RDC). The RDC are being widely used to study a variety of research issues, particularly those related to genetics, psychobiology of selected mental disorders, and treatment outcome. The data presented here indicate high reliability for diagnostic judgments made using these criteria…


The reliability of the RDC categories with psychiatric inpatients has been tested in three studies. The first two involved joint interviews whereby one rater conducted the interview and the other merely observed. Both made independent ratings. The third study involved a more rarely used procedure, whereby two independent raters interviewed the patient at different times [test-retest]. The kappa coefficients of reliability for these three studies are shown in Table 3…

Study B used the first edition of the RDC and involved pairs of raters at four facilities participating in a Pilot Study of the Psychobiology of the Depressive Disorders sponsored by the Clinical Research Branch of NIMH: the New York State Psychiatric Institute; Renard and Barnes Hospitals, Washington University School of Medicine; Iowa Psychiatric Hospital, University of Iowa Medical School; and Massachusetts General Hospital, Harvard Medical School. The subjects were newly admitted inpatients who met screening criteria for a depressive or manic syndrome. The SADS was used to interview the patients and an RDC diagnosis was made afterwards.

Relationship Among Alternative Classifications of Depressive Disorders:

One of the main purposes of the RDC approach to psychiatric diagnosis is to facilitate the comparison of alternative classification systems for depressive disorders. Table 7 gives the joint classification of diagnoses for 90 patients with a current diagnosis of major depressive disorder (study B). The table should be read across so that the frequency with which subjects given a diagnosis on the left indicates how often they were also given a diagnosis listed on the right. Some of the cell sizes are quite small; therefore, this table is presented primarily for illustrative purposes.

Frequently, there is an assumption that the more commonly used methods for classifying depressed patients are equivalent and that the results of studies using these different systems can be easily compared. For example, it is often assumed that episodes of primary depressive disorder would almost always meet the criteria for endogenous depressive disorder and rarely meet the criteria for situational [reactive] depressive disorder. However, only 64% of patients with a diagnosis of primary depressive disorder also met the criteria for endogenous phenomenology, while 51% of them met the criteria for situational depressive disorder. Similarly, it is often assumed that situational [reactive] depressive episodes would rarely meet the criteria for endogenous depressive disorder whereas they actually met those criteria 42% of the time…

[table reformatted to fit the page]

I admit to not being totally clear about how Table 7 was derived. It’s from Study B, in which one person did the interview using the SADS protocol and the other was an observer. I gather that under Major Depressive Disorder, the raters could choose multiple diagnoses.

…thus accounting for the large % numbers. But the point of the table and the last paragraph quoted from the article above is clear. The table says that the categories we thought of as separate conditions at the time did not separate well in Study B – for example Endogenous Depression and Situational Depression [overlapping 37% and 42% depending on which came first]. It’s obvious, given the etiological implications of those names, that the names were not exactly candidates for the purely descriptive DSM-III being constructed. Don Klein had even suggested a cause-neutral term "endogenomorphic" to replace "endogenous." I presume "situational" was associated with "neurotic," a causal term that had to go. But while the terms and their former implications were problems, those could’ve easily been surmounted. I suspect that the values in Table 7 suggested that the previous distinctions were not real but mythological, and so supported the "lumping" of all of the categories under the one roof – Major Depressive Disorder.
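
For readers who want the arithmetic behind a table like this, here is a minimal sketch of how a joint classification with non-exclusive labels behaves. It is only an illustration: the five patients below are invented, the label names are shorthand, and nothing here reproduces the paper's actual method or its Table 7 values. It shows two things: row percentages can add up to more than 100% when a patient may carry several labels, and the overlap between two labels differs depending on which one is used as the denominator.

```python
from itertools import product


def joint_classification(patients, labels):
    """Map (row, col) to the percent of patients labelled `row` who are also labelled `col`."""
    table = {}
    for row, col in product(labels, repeat=2):
        row_group = [p for p in patients if row in p]
        if not row_group:
            continue  # no patients carry the row label, so no percentage is defined
        both = sum(1 for p in row_group if col in p)
        table[(row, col)] = 100.0 * both / len(row_group)
    return table


# Hypothetical patients, each with a set of non-exclusive labels.
# These are made up for illustration and are NOT data from the RDC paper.
patients = [
    {"primary", "endogenous"},
    {"primary", "endogenous", "situational"},
    {"primary", "situational"},
    {"endogenous", "situational"},
    {"situational"},
]
labels = ["primary", "endogenous", "situational"]

table = joint_classification(patients, labels)

# Read across a row: of those labelled `row`, what percent also carry `col`?
for row in labels:
    cells = ", ".join(
        f"{col}: {table[(row, col)]:.0f}%" for col in labels if col != row
    )
    print(f"{row:12s} -> {cells}")
```

In this toy data the endogenous row and the situational row report different overlap figures for the same pair of labels, simply because the two groups are of different sizes. That is the same kind of asymmetry as the "depending on which came first" figures above.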

Looking over and over this latter paper, and the SADS interview used to gather this data, I couldn’t get close enough to what they actually did to assess it. The details were just too obscure and I haven’t located the criteria they used for the subclasses. But I was suspicious that those numbers in Table 7 were not an accurate picture. They don’t fit my experience or that of many others. But beyond that, there’s another consideration that seems ignored. Even if they couldn’t find a reliable separation among the varieties of depression, that means, to me, going back to the drawing board, not that those distinctions are necessarily mythic. And it for sure doesn’t mean that all significantly depressed people have the same "Disorder." Lumping, in this case, wasn’t just a passive act. It was an active declaration of unity without any backing of its own – one that had enough negative consequences to deserve the term maelstrom.

I’m going to keep looking for the details of the process discussed in this post because it nags at me, as it has for three decades. But I’m sure that it’s not definitive evidence that these traditional distinctions, now virtually lost, were simply opinions. I continue to believe that defaulting to the now sacrosanct unitary category remains the Achilles Heel of the DSM-III and beyond – one that has never really been seriously or intensively revisited. Instead, we’ve been lulled into a functional classification of depression vs treatment-resistant depression – a fractionation of depressive illness by drug treatment that moves us nowhere. And now with the DSM-5 field trials, the venerated reliability doesn’t even hold the category together. There’s plenty of new stuff in the DSM-5 to be concerned about, but some yawning old things still haunt its pages. This is one of them…
    May 21, 2012 | 12:44 PM

    Mickey, I’m a long-time reader of your blog and have a paper in draft with some of my colleagues here at UNC Chapel Hill that takes aim at the MDD diagnostic category from a somewhat unique angle. I couldn’t find an email address for you, but if you’re interested I would love to hear your thoughts on the draft, which we’re hoping to submit to a journal soon. My email address ought to come along with the comment, so please feel free to send me an email if you’d like.

    Thanks for considering it, and thanks for this blog.

    Steve Lucas
    May 21, 2012 | 3:30 PM

    I have casually followed the drug industry for a number of decades. What was made clear to me in an undergraduate class so many years ago was that pharma was all about selling drugs, and they will do anything to accomplish this goal.

    While there has been a constant onslaught on the general medical community, I feel they have taken a particular interest in psychiatry. In this subset of general medicine we have a small close-knit leadership structure combined with a small general doctor population. The ability to manipulate this small group with grants and fuzzy diagnostic standards becomes apparent.

    Something I have found in all of medicine is that old ideas are hard to kill. Thus, pharma will take time developing a concept and moving it into the general consciousness of the medical community and then capitalize on its acceptance in promoting other drugs at a later date. Doctors are then willing to accept this new diagnosis or drug based on these “facts.”

    Depression is a very real condition, but not everyone is depressed. This has not stopped pharma from moving into the general doctors’ offices and stating “you know your patients best” while trying to promote the use of their drugs, which are of questionable value.

    Today I read a post arguing that the old idea of putting statins in the water, or placing every US citizen over the age of 50 on a statin, is of value to society. Statins meet a number of pharma criteria: at a low dose they are fairly benign, and they produce a measurable change. There are many questions as to the validity of this change, but the pharma issue is that it can be identified.

    Additionally they break the patient’s resistance to taking a medication and they establish the government’s ability to medicate the entire population. This is the ultimate goal of pharma.

    This blog challenges me to go back and revisit concepts not viewed in decades. I am also acutely aware that pharma’s goals have not changed in those decades. Pharma is a master of psychology; their goal is simply to sell as many drugs to as many people as possible, needed or not.

    Fuzzy math, skewed studies, hiding information, ghost writing and even buying off big names are all part of this never ending goal of everyone being medicated, and once that happens, more is better.

    Steve Lucas

    May 21, 2012 | 3:42 PM

    One thing about being retired, one actually has time to look back without the onslaught of daily work. I would never have predicted that it would become an abiding interest, and it wasn’t for 5 years. But it’s now of great interest to look back through it and take some time looking at how the things you mention happened. I say great interest, but sometimes horror might be a better word…

    May 21, 2012 | 10:58 PM

    In the last 20 years, pharma has struck gold in vaguely defined conditions that might affect anyone, and have no recognizable resolution.

    Doctors should have seen through this a long time ago. Now it will take a great deal of time and effort to re-educate them.

    Bernard Carroll
    May 22, 2012 | 3:17 AM

    A telling omission from the final Table above is the term ‘bipolar.’ Cases of bipolar depression were treated like unipolar depression in the Research Diagnostic Criteria of the mid-1970s, in keeping with Kraepelin’s early concept of a single morbid process. It is still the case that bipolar disorder is a recognized outcome of what appears at first to be unipolar depression and that the most common psychiatric disorder in the families of bipolar probands is unipolar depression – not bipolar disorder. Moreover, lithium is effective in preventing new episodes of recurrent unipolar depression, just as in preventing new manic or depressive episodes of bipolar disorder.

    One would have hoped that the DSM-5 people would reconsider the relatedness of bipolar and recurrent unipolar disorders, but alas no. That would open the door to subtyping of the unipolar depressions and abandonment of the strategic blunder that gave us Major Depressive Disorder. Heaven forfend! So, with DSM-5 the epistemologic quagmire that DSM-III gave us will continue and research in mood disorders will continue to be crippled.

    May 22, 2012 | 4:42 PM


    Once-Banned Psychiatrist Receives First NIH Grant in 3 Years

    May 22, 2012, 3:42 pm

    A university researcher who has drawn fire for his history of accepting undisclosed payments from drug companies has received his first National Institutes of Health grant in three years, according to Science magazine. Charles B. Nemeroff, chairman of the department of psychiatry and behavioral sciences at the University of Miami, stepped down in 2008 as chair of the psychiatry department at Emory University’s medical school after Senate investigators accused him of failing to disclose money he received from pharmaceutical companies. Emory hit Dr. Nemeroff with a two-year ban on applying for NIH grants, and he moved on to a new post at Miami. Later rule changes tried to tighten the regulations regarding financial conflicts of interest in science research, but did not deal with the maneuver that allowed Dr. Nemeroff to move to a new institution and escape sanctions. Dr. Nemeroff’s two-year ban would have expired, Science noted, and his new grant will be used to study risk factors for posttraumatic stress disorder. It does not appear to involve testing drugs, according to Science.

    donald klein
    May 22, 2012 | 7:12 PM

    Re Endogenomorphic
    Worthwhile-I think-to look up original article rather than speculate about names.
    Klein DF. Endogenomorphic depression: A conceptual and terminological revision. Arch Gen Psychiatry 1974; 31: 447-451.
    The point was in this psychoanalytic hospital depressive onset after a loss etc was considered psychogenic to be rx with psychotherapy whereas those out of blue were endogenous. However we found that the endogenous did well with meds as did the precipitated = “non-endogenous” if they shared the symptoms of the “endogenous”. So precipitation was heterogeneous re response, and this could be predicted by symptom profile = endogenomorphic = like endogenous appearance. I believe Gerry Klerman had similar results.
    Don Klein
