A friend sent me a book recently. It’s nice to be known so well, because it was totally up my alley. Rather than simply recommending it, I’ll say that I can’t imagine not reading it. It’s one of those books about science that succeeds in covering a complex topic without getting ordinary mortals lost in the mathematics. In the course of things, it tells us something about a way we all think about and do science that is so ingrained that we don’t even know we’re doing it – we look at the little pictures. Then we extrapolate a little picture as a proxy for what really matters – the big picture.
The authors point out that our scientific methods have been designed to optimize the extrapolability [my word] of our small samples. But what if we had huge samples, closer to all the data instead of just some tiny corner of it? Then they roll out example after example where huge databases [Google, Amazon, Facebook, Tweets, machines with sensors, emails, the NSA, etc. etc.] are being analyzed for correlations. The examples are absolutely mind-blowing – worth the price of the book. There are paradigm shifts galore. Messy data with missing values? Fine. Outcomes derived by correlation rather than driven by hypotheses? Sure. Static answers? Better still, ongoing monitoring constantly updating the correlations. And I couldn’t read the book without thinking about our Clinical Drug Trials.
by Ruben G. Duijnhoven, Sabine M. J. M. Straus, June M. Raine, Anthonius de Boer, Arno W. Hoes, Marie L. De Bruin. PLoS Med. 2013;10:e1001407. doi: Epub 2013 Mar 19.
BACKGROUND: At the time of approval of a new medicine, there are few long-term data on the medicine’s benefit-risk balance. Clinical trials are designed to demonstrate efficacy, but have major limitations with regard to safety in terms of patient exposure and length of follow-up. This study of the number of patients who had been administered medicines at the time of medicine approval by the European Medicines Agency aimed to determine the total number of patients studied, as well as the number of patients studied long term for chronic medication use, compared with the International Conference on Harmonisation’s E1 guideline recommendations.
METHODS AND FINDINGS: All medicines containing new molecular entities approved between 2000 and 2010 were included in the study, including orphan medicines as a separate category. The total number of patients studied before approval was extracted [main outcome]. In addition, the number of patients with long-term use [6 or 12 mo] was determined for chronic medication. 200 unique new medicines were identified: 161 standard and 39 orphan medicines. The median total number of patients studied before approval was 1,708 [interquartile range [IQR] 968-3,195] for standard medicines and 438 [IQR 132-915] for orphan medicines. On average, chronic medication was studied in a larger number of patients [median 2,338, IQR 1,462-4,135] than medication for intermediate [878, IQR 513-1,559] or short-term use [1,315, IQR 609-2,420]. Safety and efficacy of chronic use was studied in fewer than 1,000 patients for at least 6 and 12 mo in 46.4% and 58.3% of new medicines, respectively. Among the 84 medicines intended for chronic use, 68 [82.1%] met the guideline recommendations for 6-mo use [at least 300 participants studied for 6 mo and at least 1,000 participants studied for any length of time], whereas 67 [79.8%] of the medicines met the criteria for 12-mo patient exposure [at least 100 participants studied for 12 mo].
CONCLUSIONS: For medicines intended for chronic use, the number of patients studied before marketing is insufficient to evaluate safety and long-term efficacy. Both safety and efficacy require continued study after approval. New epidemiologic tools and legislative actions necessitate a review of the requirements for the number of patients studied prior to approval, particularly for chronic use, and adequate use of post-marketing studies.
But I’m making things up here, not really making proposals. That’s one of the strong points the book makes very explicit. The kind of people who do these kinds of Big Data analyses aren’t people like you and me. Experience is, in fact, a handicap. Old science types like us are too stuck in our traditions. The greater our science/data skills, the less creative we are and the more bound by the concepts of hypotheses and tidy data. It’s the young ones with baggy jeans and over·sized hoodies scooting around on skateboards tweeting and texting that rule in the world of Big Data. They don’t need to make theoretical sense out of correlations, just find them. They make better spell·checkers and language·translators by analyzing Internet chatter than an army of grammarians and linguists could. They recommend books we’ll like from sales record correlations far better than any book reviewers. In fact, Amazon has shut down its book review program altogether.