
Logistic Regression is a simple, yet powerful classification method. The question we will address in this article is: given a patient's temperature in Fahrenheit, determine whether the patient has the flu or the common cold. Note that we are trying to predict a qualitative value (flu or cold) rather than a quantitative one (a number). Let us encode the two medical conditions as \(\{1,0\}\) for flu and cold respectively.

Logistic Function and Log Odds

In logistic regression, we model the probability of a patient having the flu given a temperature \(t\) with the logistic function, $$ Pr(Y=1\mid T=t) = \frac{e^{\beta_0+\beta_1t}}{1+e^{\beta_0+\beta_1t}}$$ For convenience, let us define \(p(t)=Pr(Y=1\mid T=t)\). With a bit of algebra, we can obtain the log odds or logit, $$\log{\frac{p(t)}{1-p(t)}} = \beta_0+\beta_1t$$ Using the log odds helps us interpret the relationship between \(\beta_1\) and \(T\). Specifically, increasing \(T\) by one unit changes the log odds (or logit) by \(\beta_1\). It is important to note that a one unit increase in \(T\) does not correspond to a \(\beta_1\) change in \(p(t)\) itself; because \(p(t)\) is a non-linear function, the change in probability depends on the current value of \(T\).
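This relationship is easy to verify numerically. The sketch below uses made-up coefficients (\(\beta_0=-29\), \(\beta_1=0.3\), the same illustrative values used later in this article) and shows that a one-degree increase in temperature shifts the log odds by exactly \(\beta_1\), even though the probability itself moves by a temperature-dependent amount:

```python
import math

def logistic(t, b0, b1):
    """Probability of flu given temperature t under the logistic model."""
    z = b0 + b1 * t
    return math.exp(z) / (1 + math.exp(z))

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1 - p))

# Illustrative (made-up) coefficients
b0, b1 = -29.0, 0.3

p1 = logistic(100, b0, b1)
p2 = logistic(101, b0, b1)

# The log odds change by exactly b1 = 0.3 per unit of temperature,
# while the probability change (p2 - p1) depends on where we started.
print(logit(p2) - logit(p1))
print(p2 - p1)
```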

Estimating Regression Coefficients

How do we estimate \(\beta_0\) and \(\beta_1\)? First, we need a dataset; then a method known as maximum likelihood is typically used. In our case, a dataset would be a collection of different patients' temperatures and their respective medical conditions (flu or cold). Assume we have observations \(\{(t_1, y_1), (t_2, y_2), \ldots , (t_n, y_n)\}\) where \(t_i\) is patient \(i\)'s temperature and \(y_i \in \{1,0\}\) for flu and cold respectively. Then the likelihood function we want to maximize is, $$l(\beta_0,\beta_1)=\prod_{i:y_i=1} p(t_i) \prod_{i':y_{i'}=0}(1-p(t_{i'}))$$ The Newton-Raphson algorithm is one option for solving this optimization problem.
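A minimal Newton-Raphson sketch is below, assuming synthetic (made-up) temperature data rather than a real clinical dataset. Each iteration uses the gradient and Hessian of the log likelihood, which for logistic regression have a simple closed form; the temperatures are centered so the Newton steps stay numerically well-behaved:

```python
import numpy as np

def fit_logistic(t, y, iters=15):
    """Estimate (beta0, beta1) by Newton-Raphson on the log likelihood.
    t: temperatures (centered for stability), y: labels (1 = flu, 0 = cold)."""
    X = np.column_stack([np.ones_like(t), t])   # design matrix with intercept
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))         # fitted probabilities p(t_i)
        W = p * (1 - p)                         # logistic variance weights
        grad = X.T @ (y - p)                    # gradient of the log likelihood
        H = X.T @ (X * W[:, None])              # negative Hessian (information)
        beta += np.linalg.solve(H, grad)        # Newton step
    return beta

# Synthetic data (hypothetical): flu grows more likely as temperature rises
rng = np.random.default_rng(0)
temps = rng.uniform(97, 104, 200)
tc = temps - 100                                # center around 100 degrees
true_p = 1 / (1 + np.exp(-0.8 * tc))
y = rng.binomial(1, true_p)

b0_hat, b1_hat = fit_logistic(tc, y)            # b1_hat should land near 0.8
```

In practice one would use a fitted library implementation (e.g. statsmodels or scikit-learn) rather than hand-rolled Newton steps, but the update above is the core of those solvers.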

Making Predictions

After estimating the coefficients, we can compute probabilities by plugging the estimated \(\beta\)s into the logistic function, $$\hat{p}(t)=\frac{e^{\hat{\beta}_0+\hat{\beta}_1t}}{1+e^{\hat{\beta}_0+\hat{\beta}_1t}}$$ For concreteness, let the estimated parameters of our logistic regression model be \(\hat{\beta}_0=-29\) and \(\hat{\beta}_1=0.3\), and suppose a patient arrives with a temperature of \(t=110\). The predicted probability of this patient having the flu given a temperature of 110 degrees Fahrenheit is, $$\hat{p}(110)=\frac{e^{-29+0.3\cdot 110}}{1+e^{-29+0.3\cdot 110}} \approx .98$$ Based on our logistic regression model, the probability of this patient having the flu is 98%!
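The worked example above is a two-line computation; the sketch below reproduces it with the article's estimated coefficients:

```python
import math

# Estimated coefficients from the article's worked example
b0_hat, b1_hat = -29.0, 0.3

def predict_proba(t):
    """Predicted probability of flu at temperature t (Fahrenheit)."""
    z = b0_hat + b1_hat * t
    return math.exp(z) / (1 + math.exp(z))

# -29 + 0.3 * 110 = 4, so p = e^4 / (1 + e^4)
p = predict_proba(110)
print(round(p, 2))  # 0.98
```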

Extensions to Multiple Predictors and Classes

We can generalize our formulation to allow for multiple predictors. Let \(X=(X_1,\ldots,X_p)\) denote the predictors and \(G\) the response. The updated log odds and logistic function are, $$\log{\frac{p(X)}{1-p(X)}} = \beta_0 + \beta_1X_1 + \ldots + \beta_pX_p$$ $$p(X)=\frac{e^{\beta_0+\beta_1X_1+\ldots + \beta_pX_p}}{1+e^{\beta_0+\beta_1X_1+\ldots +\beta_pX_p}}$$ We can also extend logistic regression to allow for more than two classes. In this setting, each class is specified by its own log odds or logit transformation relative to a reference class \(K\), \begin{align} \log{\frac{Pr(G=1\mid X=x)}{Pr(G=K\mid X=x)}} & = \beta_{10}+\beta_1^Tx\\ \log{\frac{Pr(G=2\mid X=x)}{Pr(G=K\mid X=x)}} & = \beta_{20}+\beta_2^Tx\\ & \vdots\\ \log{\frac{Pr(G=K-1\mid X=x)}{Pr(G=K\mid X=x)}} & = \beta_{(K-1)0}+\beta_{K-1}^Tx\\ \end{align}
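Given the \(K-1\) fitted log odds, the class probabilities can be recovered by exponentiating and normalizing, since the probabilities must sum to one. The sketch below does this for a hypothetical 3-class example with made-up coefficients:

```python
import numpy as np

def class_probabilities(x, betas):
    """Recover Pr(G=k | X=x) for all K classes from the K-1 log odds
    taken relative to reference class K.
    betas: list of K-1 (intercept, coefficient-vector) pairs."""
    logits = np.array([b0 + b @ x for b0, b in betas])  # log odds vs class K
    odds = np.exp(logits)
    denom = 1 + odds.sum()                              # normalizing constant
    # Classes 1..K-1 get odds_k / denom; reference class K gets 1 / denom
    return np.append(odds / denom, 1 / denom)

# Hypothetical 3-class example (K = 3) with a single predictor
x = np.array([101.0])
betas = [(-50.0, np.array([0.5])),    # class 1 vs class 3 (made-up values)
         (-40.0, np.array([0.39]))]   # class 2 vs class 3 (made-up values)

p = class_probabilities(x, betas)     # one probability per class, summing to 1
```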