miles to go…

Posted on Saturday 8 October 2011

I like this article in the American Journal of Psychiatry even though I don’t understand its nuances. It’s a meta-analysis of the first ten years of genetic studies in psychiatry where gene-by-environment interactions [G×E] have been the target. An example: "In 2003, Caspi and colleagues reported an increasingly positive relationship between number of self-reported stressful life events and depression risk among individuals having more short alleles at the serotonin transporter (5-HTTLPR) polymorphism." It is directed to exactly the kind of problem I’ve been fretting about – false positives.
A Critical Review of the First 10 Years of Candidate Gene-by-Environment Interaction Research in Psychiatry.
by Duncan LE and Keller MC.
American Journal of Psychiatry. 2011 168(10):1041-9.

Objective: Gene-by-environment interaction (G×E) studies in psychiatry have typically been conducted using a candidate G×E (cG×E) approach, analogous to the candidate gene association approach used to test genetic main effects. Such cG×E research has received widespread attention and acclaim, yet cG×E findings remain controversial. The authors examined whether the many positive cG×E findings reported in the psychiatric literature were robust or if, in aggregate, cG×E findings were consistent with the existence of publication bias, low statistical power, and a high false discovery rate.
Method: The authors conducted analyses on data extracted from all published studies (103 studies) from the first decade (2000-2009) of cG×E research in psychiatry.
Results: Ninety-six percent of novel cG×E studies were significant compared with 27% of replication attempts. These findings are consistent with the existence of publication bias among novel cG×E studies, making cG×E hypotheses appear more robust than they actually are. There also appears to be publication bias among replication attempts because positive replication attempts had smaller average sample sizes than negative ones. Power calculations using observed sample sizes suggest that cG×E studies are underpowered. Low power along with the likely low prior probability of a given cG×E hypothesis being true suggests that most or even all positive cG×E findings represent type I errors.
Conclusions: In this new era of big data and small effects, a recalibration of views about groundbreaking findings is necessary. Well-powered direct replications deserve more attention than novel cG×E findings and indirect replications.
[ type I error = FALSE POSITIVE ]
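The last point in the Results, that low power combined with a low prior probability makes most positive findings false positives, can be illustrated with a little arithmetic. What follows is only a minimal sketch: the prior and power values are illustrative assumptions, not figures from the paper, and the function name is made up for this example.

```python
def positive_predictive_value(prior, power, alpha=0.05):
    """Share of statistically significant results that reflect a true effect.

    prior: assumed probability that a tested hypothesis is actually true
    power: assumed probability of detecting a true effect
    alpha: type I error rate (false positive rate per test)
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * alpha
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios (illustrative values only, not taken from the paper)
scenarios = [(0.5, 0.8), (0.1, 0.8), (0.1, 0.2), (0.01, 0.1)]
for prior, power in scenarios:
    ppv = positive_predictive_value(prior, power)
    print(f"prior={prior:.2f} power={power:.2f} "
          f"true share={ppv:.2f} false positive share={1 - ppv:.2f}")
```

Under these assumed numbers, the share of significant results that are false positives climbs steeply as the prior and the power fall, which is the shape of the argument the authors are making.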
I’m going to skip the mathematical and statistical points in the full article because I don’t understand them well enough to be able to comment, but their gross findings are clear. Almost all, if not all, reported G×E findings are likely false positives that have not been replicated in subsequent studies. They go on to show that these studies are usually underpowered [n too small] and that the published replication studies are probably biased toward positive results as well. They suggest that there’s a large "publication bias" in this literature. So they urge caution in interpreting the results [which, by the way, have not been supported by the data from the genome-wide association study]:

What is a genome-wide association study? A genome-wide association study is an approach that involves rapidly scanning markers across the complete sets of DNA, or genomes, of many people to find genetic variations associated with a particular disease. Once new genetic associations are identified, researchers can use the information to develop better strategies to detect, treat and prevent the disease. Such studies are particularly useful in finding genetic variations that contribute to common, complex diseases, such as asthma, cancer, diabetes, heart disease and mental illnesses.
Their conclusion in the full text:
Despite numerous positive reports of cGxEs in the psychiatric genetics literature, our findings underscore several concerns that have been raised about the cGxE field in psychiatry. Our results suggest the existence of a strong publication bias toward positive findings that makes cGxE findings appear more robust than they actually are. Almost all novel results are positive, compared with less than one-third of replication attempts. More troubling is evidence suggesting that replication studies, generally considered the sine qua non of scientific progress, are also biased toward positive results. Furthermore, it appears that sample sizes for null replication results must be approximately six times larger than sample sizes for positive replication results in order to be deemed publishable on their own. Such a publication bias among replication attempts suggests that meta-analyses, which collapse across replication results for a given cGxE hypothesis, will also be biased toward being unrealistically positive. Although methods exist to detect publication biases (e.g., the funnel plot), they are not very sensitive, and correcting meta-analytic results for this bias is difficult. Finally, our findings suggest that meta-analyses using very liberal inclusion thresholds are virtually guaranteed to find positive results.
Here’s another study of the genetic associations reported for Major Depressive Disorder using the GWAS data:
Poor replication of candidate genes for major depressive disorder using genome-wide association data
by F J Bosker, C A Hartman, I M Nolte, B P Prins, P Terpstra, D Posthuma, T van Veen, G Willemsen, R H DeRijk, E J de Geus, W J Hoogendijk, P F Sullivan, B W Penninx, D I Boomsma, H Snieder and W A Nolen
Molecular Psychiatry. 2011 16:516-532.

Data from the Genetic Association Information Network (GAIN) genome-wide association study (GWAS) in major depressive disorder (MDD) were used to explore previously reported candidate gene and single-nucleotide polymorphism (SNP) associations in MDD. A systematic literature search of candidate genes associated with MDD in case–control studies was performed before the results of the GAIN MDD study became available. Measured and imputed candidate SNPs and genes were tested in the GAIN MDD study encompassing 1738 cases and 1802 controls. Imputation was used to increase the number of SNPs from the GWAS and to improve coverage of SNPs in the candidate genes selected. Tests were carried out for individual SNPs and the entire gene using different statistical approaches, with permutation analysis as the final arbiter. In all, 78 papers reporting on 57 genes were identified, from which 92 SNPs could be mapped. In the GAIN MDD study, two SNPs were associated with MDD: C5orf20 (rs12520799; P=0.038; odds ratio (OR) AT=1.10, 95% CI 0.95–1.29; OR TT=1.21, 95% confidence interval (CI) 1.01–1.47) and NPY (rs16139; P=0.034; OR C allele=0.73, 95% CI 0.55–0.97), constituting a direct replication of previously identified SNPs. At the gene level, TNF (rs76917; OR T=1.35, 95% CI 1.13–1.63; P=0.0034) was identified as the only gene for which the association with MDD remained significant after correction for multiple testing. For SLC6A2 (norepinephrine transporter (NET)) significantly more SNPs (19 out of 100; P=0.039) than expected were associated while accounting for the linkage disequilibrium (LD) structure. Thus, we found support for involvement in MDD for only four genes. However, given the number of candidate SNPs and genes that were tested, even those significant may well be false positives. 
The poor replication may point to publication bias and false-positive findings in previous candidate gene studies, and may also be related to heterogeneity of the MDD phenotype as well as contextual genetic or environmental factors.
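The authors' caution that "even those significant may well be false positives" follows from the number of tests alone. Here is a rough sketch of that arithmetic. It assumes the 92 SNP tests are independent and uses a simple Bonferroni threshold, neither of which is what the paper did (the authors accounted for linkage disequilibrium and used permutation analysis), so treat it as a back-of-the-envelope illustration only.

```python
n_tests = 92   # candidate SNPs that could be mapped (from the abstract)
alpha = 0.05   # conventional per-test significance threshold (assumption)

# With no true effects at all, how many "hits" would chance alone produce?
expected_false_positives = n_tests * alpha

# Probability of at least one chance hit, assuming independent tests
prob_at_least_one = 1 - (1 - alpha) ** n_tests

# Bonferroni correction: a swapped-in, simpler stand-in for the
# permutation analysis the paper actually used
bonferroni_threshold = alpha / n_tests

# SNP-level P values quoted in the abstract
reported = {"rs12520799": 0.038, "rs16139": 0.034}
survivors = [snp for snp, p in reported.items() if p < bonferroni_threshold]

print(f"expected chance hits: {expected_false_positives:.1f}")
print(f"P(at least one chance hit): {prob_at_least_one:.3f}")
print(f"Bonferroni threshold: {bonferroni_threshold:.5f}")
print(f"SNPs passing Bonferroni: {survivors}")
```

Under these assumptions, chance alone would be expected to produce about as many or more nominal hits than the two SNPs that replicated, and neither of those two P values would pass the stricter threshold.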
The sequencing of the human genome was a beginning, not an ending. We don’t even yet know what we don’t know about it. The information encoded is at the most basic of levels – proteins, various RNAs and vast regions whose functions are uncharacterized. Our technology may be up to speed for determining paternity, ethnicity, or solving forensic cases, but the time for understanding something as complex as inheritable human subjectivity or responses to medications is still over the horizon, if it’s even there to be understood. 

I read these two articles as complaints – legitimate complaints. The genomic scientists are asking us to give them something more precise to work with – something better than "self-reported stressful life events" or the current muddy category of Major Depressive Disorder. I expect they’re going to complain even louder about the questions asked by iSpot or Trivedi’s NIMH study trying to pick the right antidepressant – a question loaded with unproven and unlikely assumptions. So far, we can’t even tell them that there is a differential response to study.
  1.  
    Ivan
    October 8, 2011 | 11:47 PM
     

    So, no more pulling rabbits out of hats. It’s about time the just-so stories ended in psychiatric genetics.
