personalized medicine: the shoals of fuzzy math…

Posted on Saturday 23 April 2011

This business of "biomarkers" is a slippery slope. I think in the back of my mind, I’ve always retained an envy for my old medical specialty – rheumatology. Like Psychiatry, its disorders are heavily weighted towards ‘diseases of unknown etiology’ diagnosed with ‘criteria.’ But as rheumatologists, we had biomarkers [LE prep, ANA, RA, sed rate, serum complement]. People still argue about what they mean, but they were there. They said "disease" in neon lights. "Biomarkers" in psychiatry have always been an elusive dream. They mean more than diagnostic tools. It’s as if they would ‘legitimize’ us – give us something tangible. And it must be even stronger for the "medical model" people who built the DSMs. So this business of personalized medicine has a back story as well – an added dimension. Last night, when I read that article appended to the end of my last post, I was thinking that:
    A tentative integrative model showed that a combination of N1 amplitude at Pz and verbal memory performance accounted for the largest part of the explained variance. These markers may serve as new biomarkers suitable for the prediction of antidepressant treatment outcome.
…just wasn’t going to get it. And listening to the speakers at the Mayflower Action Group Initiative talk in definites about things that fit better in the maybe or, at best, the tentative gave me the feeling, "Uh Oh. There they go again." I was pleased to see that more rational forces were at work in our universe when SteveBMD sent us this:
Poor replication of candidate genes for major depressive disorder using genome-wide association data
by F J Bosker, C A Hartman, I M Nolte, B P Prins, P Terpstra, D Posthuma, T van Veen, G Willemsen, R H DeRijk, E J de Geus, W J Hoogendijk, P F Sullivan, B W Penninx, D I Boomsma, H Snieder and W A Nolen
Molecular Psychiatry [2011] 16, 516–532.

Abstract:
Data from the Genetic Association Information Network [GAIN] genome-wide association study [GWAS] in major depressive disorder [MDD] were used to explore previously reported candidate gene and single-nucleotide polymorphism [SNP] associations in MDD. A systematic literature search of candidate genes associated with MDD in case–control studies was performed before the results of the GAIN MDD study became available. Measured and imputed candidate SNPs and genes were tested in the GAIN MDD study encompassing 1738 cases and 1802 controls. Imputation was used to increase the number of SNPs from the GWAS and to improve coverage of SNPs in the candidate genes selected. Tests were carried out for individual SNPs and the entire gene using different statistical approaches, with permutation analysis as the final arbiter. In all, 78 papers reporting on 57 genes were identified, from which 92 SNPs could be mapped. In the GAIN MDD study, two SNPs were associated with MDD: C5orf20 [rs12520799; P=0.038; odds ratio [OR] AT=1.10, 95% CI 0.95–1.29; OR TT=1.21, 95% confidence interval [CI] 1.01–1.47] and NPY [rs16139; P=0.034; OR C allele=0.73, 95% CI 0.55–0.97], constituting a direct replication of previously identified SNPs. At the gene level, TNF [rs76917; OR T=1.35, 95% CI 1.13–1.63; P=0.0034] was identified as the only gene for which the association with MDD remained significant after correction for multiple testing. For SLC6A2 [norepinephrine transporter [NET]] significantly more SNPs [19 out of 100; P=0.039] than expected were associated while accounting for the linkage disequilibrium [LD] structure. Thus, we found support for involvement in MDD for only four genes. However, given the number of candidate SNPs and genes that were tested, even these significant findings may well be false positives. The poor replication may point to publication bias and false-positive findings in previous candidate gene studies, and may also be related to heterogeneity of the MDD phenotype as well as contextual genetic or environmental factors.
Looking at the original full text version, I was impressed that they did it right – and used honest statistics to correct for the fact that they had a lot of independent variables. The new technologies, particularly the DNA/SNP/gene technologies, offer us a fascinating field to play on, but as I’ve been writing about, they open the door to an area where the sloppy science of recent years could have a field day in the search for a way to continue Pharma’s pressure to increase sales. The possibility of expensive tests to find your genotype/phenotype in order to pick a personalized designer anti-whatever drug seems to be drawing together a coalition of formidable forces. I’m glad to see that there are watchdogs using "Data from the Genetic Association Information Network [GAIN] genome-wide association study [GWAS] in major depressive disorder [MDD]" to keep people honest.

I wanted to say something about corrections of data, because it’s a key point in these studies that survey a lot of SNPs [single-nucleotide polymorphisms] in the search for a genetic biomarker – or, for that matter, any biomarker. In applying statistics to data, the ubiquitous p value actually makes a statement that’s pretty simple to understand:
    p<0.05 says that a difference this large would occur by chance [if there were no real effect] less than one time in twenty.
    p<0.01 says that a difference this large would occur by chance less than one time in a hundred.
    p<0.001 says that a difference this large would occur by chance less than one time in a thousand.
But what if you test 50 different SNPs? Using a standard of p<0.05 for significance, you’d expect two or three of them to come out "significant" by chance alone [50 × 0.05 = 2.5]. How do you define real significance in that situation?
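To make that concrete, here’s a minimal simulation [a sketch in Python using numpy and scipy – my own illustration, not anything from the paper]. It runs 50 t-tests comparing groups drawn from the same distribution, so any "significant" result is a false positive:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_tests = 50      # e.g., 50 candidate SNPs
alpha = 0.05

false_positives = 0
for _ in range(n_tests):
    # both groups drawn from the SAME distribution -- there is no real difference
    a = rng.normal(loc=0.0, scale=1.0, size=100)
    b = rng.normal(loc=0.0, scale=1.0, size=100)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1

# expected false positives by chance alone: n_tests * alpha = 2.5
print(f"{false_positives} of {n_tests} null tests were 'significant' at p<{alpha}")
```

Run it with a few different seeds and you’ll typically see two or three false positives, just as the arithmetic predicts.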
There is a correction called the Bonferroni Correction that’s often used. Its logic is fairly straightforward:
    The Bonferroni correction is derived by observing Boole’s inequality. If you perform n tests, each at significance level β [where β is as yet unknown], then the probability that at least one of them comes out significant by chance is [by Boole’s inequality] ≤ n⋅β. We want this probability to equal α, the significance level for the entire series of tests. Solving for β gives β = α/n.
So when somebody goes on a fishing expedition and assays a bunch of SNPs looking for a biomarker, the significance threshold should be divided by the number of SNPs [or other variables] tested [n]. So:
  Bonferroni Correction
               n=1          n=10         n=20         n=50         n=100
  p<0.05       p<0.05       p<0.005      p<0.0025     p<0.001      p<0.0005
  p<0.01       p<0.01       p<0.001      p<0.0005     p<0.0002     p<0.0001
  p<0.001      p<0.001      p<0.0001     p<0.00005    p<0.00002    p<0.00001
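The table is nothing more than α/n applied mechanically. A few lines of Python [again just a sketch of the arithmetic, not code from any paper] reproduce it:

```python
# Bonferroni: divide the desired overall significance level (alpha)
# by the number of tests (n) to get the per-test threshold.
alphas = [0.05, 0.01, 0.001]
ns = [1, 10, 20, 50, 100]

print("        " + "".join(f"{'n=' + str(n):>11}" for n in ns))
for alpha in alphas:
    print(f"p<{alpha:<6}" + "".join(f"{alpha / n:>11.6g}" for n in ns))
```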

There are other ways of correcting for multiple comparisons, but the results are similar. The point is not to understand funny-named statistics or even to understand the math. The point is that there’s a wide open door to fudged results when people are screening a lot of potential biomarkers looking for the elusive pot of gold. If the correction method isn’t mentioned, be very suspicious that trickery is afoot. Here is how that important part reads in this abstract:
    At the gene level, TNF [rs76917; OR T=1.35, 95% CI 1.13–1.63; P=0.0034] was identified as the only gene for which the association with MDD remained significant after correction for multiple testing.
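For scale [my arithmetic, not a claim about the authors’ method – the abstract says permutation analysis was the final arbiter, not a simple Bonferroni division]: a naive Bonferroni threshold across the 57 candidate genes would be

```python
# Naive Bonferroni threshold for 57 gene-level tests -- for scale only;
# the paper itself used permutation analysis as the final arbiter.
alpha, n_genes = 0.05, 57
print(f"per-gene threshold: {alpha / n_genes:.5f}")   # 0.00088
```

Whether the reported P=0.0034 is the raw or the corrected value isn’t clear from the abstract alone – exactly the kind of question a careful reader should ask.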
There’s a lot of fuzzy math going around these days…
  1.  
    April 23, 2011 | 6:13 AM

    This problem is pervasive across disciplines. By creating this ‘virtual world’ of math, the delineation between reality and model (I use "model" as an all-inclusive name for anything involving math to represent reality) is getting fuzzy itself.

    At a certain point a given model gets accepted as being the real thing, and one starts to extrapolate from there, resulting not so much in fuzzy math as in an incorrect model creating a fuzzy representation of reality.

    From quantum physics to fMRI, fields depend on mathematical representations without much foothold in reality, since one can’t really verify them, or reality is just not unequivocal.

    In the end, as soon as the math involved reaches a certain level of complexity, errors aren’t merely likely but certain. These discrepancies then get stopgap-filled with yet more math, which retroactively adapts the original.

    A good example is in astrophysics. From just an expanding ball of energy from a focal point, one is now at a point where one needs at least two deus ex machinas to make the theory fit: dark matter and dark energy. Maybe correct, maybe just the result of a faulty basic model.

    At that level it doesn’t matter much one way or the other; however, in things such as genetics it matters a lot.

  2.  
    Squamous
    April 24, 2011 | 8:30 AM

    So…am I correct that for the TNF example the given significance at 0.0034 isn’t really significant, since the corrected threshold of 0.00087 for 57 genes should apply as the true standard of significance? Or did the authors correct for that already?
