The Categorical Variables just get one post. It’s not because they aren’t important. It’s because I’ve already said what I’m up to in this series. And because they’re easy…

**METHOD 1: the spreadsheet**

I thought it would be easiest to start with the spreadsheet itself this time because the column headings make it easy to discuss the output. This spreadsheet has been added below the one discussed in john henry’s hammer: continuous variables I…, downloadable here. It’s shown above with the MADRS efficacy data from the two recent Brexpiprazole *Augmentation-in-Treatment-Resistant-Depression* studies [see a story: the beginning of the end…].

Categorical Variables show up in almost every Clinical Drug Trial. They are derived variables that hold binary [yes/no] data based on some specified criteria. They’re often named, though the criteria for the names vary from study to study. So RESPONSE might be *«final HAM-D score < 50% of the baseline»* or *«final HAM-D score ≤ 50% of the baseline OR final HAM-D score ≤ 8»*. In this case, the data is usually reported in the articles – the tally of the yeses and nos. Unlike the Continuous Variables, defining the dataset with Summary Data requires no mathematical manipulations [MEAN, Standard Deviation, Standard Error of the Mean]. All you have to do is count.

The classic statistical test for significance is *ChiSquare*, discussed in in the land of sometimes^{[2]}…. The spreadsheet supplies two measurements of EFFECT SIZE. The first, *Number Needed to Treat* [NNT], is the more intuitive. It’s exactly what its name says: how many patients you have to treat to get one more responder than you would’ve gotten with placebo. The second EFFECT SIZE index is the ODDS RATIO [OR]. It is a quantitative measure frequently reported in meta-analyses where multiple studies are compared. One thing to note: unlike Cohen’s d, the Odds Ratio is not centered between its 95% Confidence Limits, so it is often charted on a logarithmic scale [which "centers" the OR]. For the mathematically inclined, the formulas for these columns are:

**if a=control_{[yes]}, b=control_{[no]}, c=drug_{[yes]}, & d=drug_{[no]}, then:**

**Control Response% = a÷(a+b)**

**Drug Response% = c÷(c+d)**

**ChiSquare = ((a×d-b×c)²×(a+b+c+d))÷((a+b)×(c+d)×(a+c)×(b+d)) [1 *df*]**

**NNT = 1÷(c÷(c+d)-a÷(a+b))**

**OR = (c÷d)÷(a÷b)**

**OR_{[95%CI]} = Exp(Ln(OR)±1.96×√((1÷a)+(1÷b)+(1÷c)+(1÷d)))**

^{Ln is the natural log value and Exp means raise e to that power [take the antilog]}
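For anyone who would rather check the spreadsheet columns in code, here is a minimal Python sketch of the same formulas. The function name `two_by_two` and the example counts are my own inventions for illustration; the counts are hypothetical and do not come from the Brexpiprazole studies or the spreadsheet.

```python
import math


def two_by_two(a, b, c, d, z=1.96):
    """Summary statistics for a 2x2 table of yes/no counts.

    a = control yes, b = control no, c = drug yes, d = drug no.
    A zero cell or identical response rates will raise ZeroDivisionError.
    """
    n = a + b + c + d
    control_rate = a / (a + b)   # proportion of yeses on control
    drug_rate = c / (c + d)      # proportion of yeses on drug

    # Pearson's Chi Square for a 2x2 table [1 df], no Yates correction
    chi_square = ((a * d - b * c) ** 2 * n) / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )

    # Number Needed to Treat: reciprocal of the difference in response rates
    nnt = 1 / (drug_rate - control_rate)

    # Odds Ratio and its 95% confidence limits, computed on the log scale
    odds_ratio = (c / d) / (a / b)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)

    return {
        "control_rate": control_rate,
        "drug_rate": drug_rate,
        "chi_square": chi_square,
        "nnt": nnt,
        "odds_ratio": odds_ratio,
        "or_ci": (lower, upper),
    }


# Hypothetical counts: 20 of 100 respond on control, 40 of 100 on drug
result = two_by_two(20, 80, 40, 60)
for name, value in result.items():
    print(name, value)
```

Because the limits are built on Ln(OR), the square root of lower × upper lands back on the OR itself. That is the sense in which a logarithmic scale "centers" it.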

**METHOD 2: The Internet Calculators**

For *Chi Square* calculations, I use the one at VassarStats because it gives the classic *Pearson’s Chi Square*, but it also calculates the Chi Square with the *Yates correction*. In addition, it computes the *Fisher Exact Probability Test*. These refinements are used in articles occasionally and the VassarStats page gives those results and links to an explanation of each [see Chapter 8 in their Web Textbook] so I don’t have to try and explain.

And as for the *Odds Ratio* and its *95% Confidence Intervals*, why not stick with a winner – VassarStats? This version of their Internet Calculator does both the *Chi Square* and the *Odds Ratio* with its *95% Confidence Intervals*. One stop shopping!

As for the *NNT*, there are Internet Calculators around, but frankly it’s almost easier to do it in your head or using the calculator in your computer. Subtract the %yes of the control from the %yes of the drug and divide the result into 100. That’s all there is to it. So **23.5%-14.7%=8.8%** then **100÷8.8=11.36** [the spreadsheet works from the unrounded percentages, so its figure can differ a little in the decimals].
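The in-your-head NNT arithmetic can also be sketched in a few lines of Python. The function name `nnt_from_percent` is hypothetical, and the inputs are just the two rounded percentages quoted above.

```python
def nnt_from_percent(drug_pct, control_pct):
    """NNT from the two response percentages: 100 / (drug% - control%)."""
    diff = drug_pct - control_pct
    if diff == 0:
        # Identical response rates: no one benefits, so the NNT is undefined
        raise ValueError("response rates are identical; NNT is undefined")
    return 100 / diff


# Rounded percentages from the example in the text
nnt = nnt_from_percent(23.5, 14.7)
print(round(nnt, 2))
```

With these rounded inputs the result is about 11.36. Feeding in percentages carried to more decimal places will move it slightly. A negative result would mean the control group did better than the drug.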