all databases are not created equal…

Posted on Thursday 3 July 2014

The Black Box Warning added to antidepressant labeling in 2004 by the FDA created something of a cottage industry attempting either to discredit it or to reverse it. Those efforts are well documented throughout this and many other blogs. And there have been multiple attempts to claim that the net effect has been an increase in suicidality, attributed to untreated depression in children and adolescents following the fall in prescribing. This claim has been based on data from large databases from several sources.

This most recent article in the British Medical Journal [Changes in antidepressant use by young people and suicidal behavior after FDA warnings and media coverage: quasi-experimental study] relies primarily on the Mental Health Research Network [3], data from multiple large health plans [not-for-profit HMOs]. But there are other datasets involved in the analysis: the PHARMetrics Patient Centric Database [1], the BC Ministry of Health, and the Nationwide Inpatient Sample [2] – the latter two chosen because they include complete ICD-9 E-Codes [codes for external injury]. PHARMetrics is a commercial enterprise used for marketing intelligence. [I’ve extracted the descriptions of these datasets from the referenced papers at the end of this post, with links to their abstracts and full texts when available. I also added some charts of the ages reported in each.]

While I’m still grumbling about the Lu et al papers [3], my main agenda in this post is something else – the use of these large databases in research papers. The story begins with a problem – the commercial databases don’t have usable E-Codes, the ICD-9-CM codes that directly identify suicide attempts. In their Letter to the Editor [3], Lu et al show that the database they are using doesn’t record these E-Codes with any consistency across sites. So they bring up Patrick et al [2], who dealt with the missing E-Codes by looking for other parameters that could serve as a proxy for suicide attempts. Patrick et al tried a number of combinations and arrived at an algorithm combining a psychiatric diagnosis with injury to the lower arm, asphyxiation, or overdose with psychotropic medications, using two government databases that had required E-Codes as their gold standard.
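Patrick et al’s proxy is, at bottom, a boolean filter run over the diagnosis codes on each hospitalization record. A minimal sketch of the idea in Python – the field names and the specific code lists below are my own illustrative assumptions, not the published definitions [the only code the papers give explicitly is 969, poisoning by psychotropic agents]:

```python
# Illustrative sketch of a Patrick-style proxy classifier over claims rows.
# Code lists are assumptions for illustration; consult the published
# algorithm [Patrick et al 2010] for the real definitions.

PSYCH_DX = ("29", "30", "31")        # ICD-9 mental disorder chapter [290-319]
PSYCHOTROPIC_POISONING = ("969",)    # poisoning by psychotropic agents [from the papers]
FOREARM_INJURY = ("881",)            # placeholder: open wound of forearm
ASPHYXIATION = ("994.7",)            # placeholder: asphyxiation/strangulation

def looks_like_attempt(dx_codes):
    """Proxy: psychiatric diagnosis AND [forearm injury OR asphyxiation
    OR psychotropic poisoning] on the same hospitalization."""
    has_psych = any(c.startswith(PSYCH_DX) for c in dx_codes)
    has_injury = any(
        c.startswith(FOREARM_INJURY + ASPHYXIATION + PSYCHOTROPIC_POISONING)
        for c in dx_codes
    )
    return has_psych and has_injury

print(looks_like_attempt(["296.3", "969.4"]))  # depression + psychotropic poisoning -> True
print(looks_like_attempt(["969.4"]))           # poisoning alone -> False
```

The point of pairing a psychiatric diagnosis with a characteristic injury is to separate deliberate self-harm from accidental injury without any E-Code at all – which is exactly why dropping the psychiatric-diagnosis half, as Lu et al did, changes what the filter captures.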

In their Letter to the Editor [3], they essentially rewrite Patrick et al’s paper by saying:
"Patrick et al. developed and tested algorithms for identifying hospitalizations for deliberate self-harm in a population aged 10 years and over. This study used the US National Inpatient Sample data and data from British Columbia; both data sources had E-code completeness rates above 85%. The gold standard for deliberate self-harm was defined as hospitalizations with a diagnosis of E950-958. Patrick et al. found that an algorithm combining diagnoses for psychiatric disorders [including depression] and injury/poisoning can produce a positive predictive value as high as 87.8% for identifying hospitalizations for deliberate self-harm [with specificity of 99.4% and sensitivity of 57.3%]. In the context of our longitudinal study on the impact of Food and Drug Administration warnings on antidepressant use and subsequent suicidality in youth, using Patrick’s algorithm may introduce ascertainment bias because rates of depression diagnosis declined substantially after the warnings."
Speaking of bias, the references in that last sentence point to yet another set of studies [1], industry funded, from articles clearly attempting to discredit the Black Box Warning using a marketing research database. They said it again in their BMJ paper [3]:
Because previous studies showed that rates of depression diagnosis changed after the warnings and that outpatient claims are often incomplete for mental health conditions such as depression, to avoid introducing selection bias, we did not limit our cohorts to those with a coded diagnosis of depression.
And used Patrick et al’s validation numbers for the truncated algorithm as if they applied to their own data:
Non-fatal poisoning by psychotropic drugs [predominantly tranquilizers] has a positive predictive value of 79.7% for suicide attempts [sensitivity was 38.3% and specificity was 99.3%], outperforming other types of injuries or poisonings.
So Lu et al take the algorithm suggested as a suicide attempt proxy from two government datasets [2], modify it based on two other articles using data from a marketing research database [1], and then apply the modified algorithm to yet another dataset collected from an HMO consortium [3]. Further, while they validate that the E-Codes are unusable in their database, they don’t test their own database to confirm that there have been changes in the diagnosis of depression or other psychiatric conditions. Nor did they check their own database to see if the patients they identified with their jury-rigged algorithm were on antidepressants at the time they were identified. Nor did they show us the results of using Patrick et al’s full algorithm on their own data.

Even though the intervals vary, just looking at the age ranges [at the end of this post], it’s apparent that these are different datasets. Further, they were created for different reasons, from different sources, and their ontogeny can’t be vetted, at least by me. All those things heighten the need to reproduce the findings of others in one’s own database, something not very hard to do [and obvious]. I would suggest that this study is uninterpretable, and shouldn’t have been published in a mainstream journal as is, but rather sent back with a suggestion to resubmit once they’d done the necessary homework. My broader point is that population studies are hard enough to interpret when done thoroughly. The assumption that results from any given dataset can be applied freely to any other dataset is untenable [also obvious]. A simple rule for such studies:

all databases are not created equal…

[1]

Decline in treatment of pediatric depression after FDA advisory on risk of suicidality with SSRIs.
by Libby AM, Brent DA, Morrato EH, Orton HD, Allen R, Valuck RJ.
American Journal of Psychiatry. 2007 164[6]:884-891.
[full text online]
Persisting decline in depression treatment after FDA warnings.
Libby AM, Orton HD, Valuck RJ.
Archives of General Psychiatry. 2009 66[6]:633-639.
[full text online]
database
Data for this study come from the PHARMetrics Patient Centric Database, the largest national database of longitudinal integrated health care claims data commercially available from PHARMetrics, a Unit of IMS, Inc [Hagerstown, Maryland], under unrestricted license. Data came from integrated medical, specialty, facility, and pharmacy-paid claims from more than 95 managed care plans nationally, representing more than 53 million covered patients from January 1999 to December 2007. Patients were unidentified and anonymous; therefore, an expedited review was obtained from the Colorado Multiple Institutional Review Board.
Identifying a cohort of new episodes of depression was the first step in building the analytic file. The definition of a new episode of depression was based on specifications of the National Committee for Quality Assurance Health Plan and Employer Data and Information Set. The resulting time horizon that accounted for episode creation, follow-up, and seasonality spanned from July 1999 to June 2007. The total cohort of 643 313 individual patients was separated into 3 depression cohorts comprising 792 807 episodes of diagnosis and possible treatment. There were 91 748 pediatric cases [aged 5-18 years at time of diagnosis], 70 311 young adult cases [aged 19-24 years at diagnosis], and 630 748 adult cases [aged 25-89 years at diagnosis]. Average ages for each cohort were 15 years for pediatric, 21 years for young adults, and 44 years for adults. Female patients accounted for roughly 60% of pediatric cases and 70% of adult cases; 4% of cases were receiving managed Medicaid benefits at the start of the episode.

[2]

Identification of hospitalizations for intentional self-harm when E-codes are incompletely recorded
by Amanda R. Patrick, Matthew Miller, Catherine W. Barber, Philip S. Wang, Claire F. Canning and Sebastian Schneeweiss
Pharmacoepidemiology and Drug Safety. 2010 19:1263–1275.
database
Data were derived from two large population-based hospital discharge abstract databases. We chose to use two different databases in the interest of gaining some insight regarding the generalizability of our findings. These data were drawn from two countries with different suicide rates, different practice patterns, and different hospital payment schemes, which may translate into differences in coding practice. In addition, there is variation in the number of diagnosis and procedure codes recorded and the availability of patient-level linkable data on prior inpatient hospitalizations, physician visits, and prescription drug use.
Data from British Columbia [BC], Canada, were obtained from the BC Ministry of Health. The database includes records of hospitalizations for all patients in BC’s publicly funded healthcare system. Data elements include patient age and sex, up to 25 diagnosis codes including E-codes, up to five procedure codes, length of stay, and discharge disposition. An evaluation of this database found good specificity and completeness of diagnosis codes. We used data from 1999 through 2001, a period immediately prior to the transition from ICD-9-CM to ICD-10-CA diagnosis codes in BC.
Data from the United States came from the Nationwide Inpatient Sample [NIS], a publicly available dataset designed to approximate a 20% representative sample of all non-federal hospitals in the United States. The NIS is produced by the Agency for Healthcare Research and Quality [AHRQ] from hospital inpatient discharge records submitted by state health data organizations. The 2003 NIS included data from 37 states. Data elements include hospital location [state], patient age, sex, and race, up to 15 diagnosis codes, up to 15 procedure codes, up to 4 E-codes, length of stay, primary payer, and discharge disposition.

[3]

Letter to the Editor
How complete are E-codes in commercial plan claims databases?
by Christine Y. Lu, Christine Stewart, Ameena T. Ahmed, Brian K. Ahmedani, Karen Coleman, Laurel A. Copeland, Enid M. Hunkeler, Matthew D. Lakoma, Jeanne M. Madden, Robert B. Penfold, Donna Rusinak, Fang Zhang, and Stephen B. Soumerai
[full text in a madness to our method – a new introduction… ]
Changes in antidepressant use by young people and suicidal behavior after FDA warnings and media coverage: quasi-experimental study
by Christine Y Lu, Fang Zhang, Matthew D Lakoma, Jeanne M Madden, Donna Rusinak, Robert B Penfold, Gregory Simon, Brian K Ahmedani, Gregory Clarke, Enid M Hunkeler, Beth Waitzfelder, Ashli Owen-Smith, Marsha A Raebel, Rebecca Rossom, Karen J Coleman, Laurel A Copeland, Stephen B Soumerai
British Medical Journal. 2014 348:g3596.
[full text online]
database
This study included 11 geographically distributed US healthcare organizations that provide care to a diverse population of 10 million people in 12 states. All organizations are members of the Mental Health Research Network, a division of the larger HMO Research Network, an established consortium of 19 research centers affiliated with large not for profit integrated healthcare systems. Members are enrolled through employer sponsored insurance, individual insurance plans, and capitated Medicare and Medicaid programs. Members served by these systems are generally representative of each system’s geographic service area [see supplementary table A]…
To examine changes in suicide attempts after the warnings, we used the same denominator population as defined previously. While encounters for suicide attempts can be identified in administrative databases using external cause of injury codes [E-codes], they are known to be incompletely captured in commercial plan databases. Our preliminary analysis found that E-code completeness varied across study sites, treatment settings, and years. Therefore, instead of deliberate self harm E-codes, we used poisoning by psychotropic agents [international classification of diseases, ninth revision, clinical modification 969], a more reliable proxy for population level suicide attempts. Poisoning by drugs or toxic substances is the most common method of suicide attempt leading to hospital admission and emergency room treatments. Non-fatal poisoning by psychotropic drugs [predominantly tranquilizers] has a positive predictive value of 79.7% for suicide attempts [sensitivity was 38.3% and specificity was 99.3%], outperforming other types of injuries or poisonings.

  1.  
    Steve Lucas
    July 3, 2014 | 2:58 PM
     

    If I get this right; we have testing to a surrogate end point, and using incomplete or inappropriate data, both are old pharma tricks.

    Steve Lucas

  2.  
    Bernard Carroll
    July 4, 2014 | 4:49 AM
     

    Winston Churchill might have said that the whole thing is an error wrapped in a derivative inside a narrative.

  3.  
    July 4, 2014 | 11:12 AM
     

    And I would add – and enclosed in a sound bite devoid of meaning.

  4.  
    July 4, 2014 | 1:58 PM
     

    Hmmm…a proxy 2 or 3 times removed is…proxitis?
