
Logistic Regression is a simple, yet powerful classification method. The question we will address in this article is: given a patient's temperature in Fahrenheit, determine whether the patient has the flu or the common cold. Note that we are trying to predict a qualitative value (flu or cold) rather than a quantitative one (a number). Let us encode the two medical conditions as \(\{1,0\}\) for flu and cold respectively.

Logistic Function and Log Odds

In logistic regression, we model the probability of a patient having the flu given a temperature \(t\) with the logistic function, $$ Pr(Y=1\mid T=t) = \frac{e^{\beta_0+\beta_1t}}{1+e^{\beta_0+\beta_1t}}$$ For convenience, let us define \(p(t)=Pr(Y=1\mid T=t)\). With a bit of algebra, we can obtain the log odds or logit, $$\log{\frac{p(t)}{1-p(t)}} = \beta_0+\beta_1t$$ Using the log odds helps us interpret the relationship between \(\beta_1\) and \(T\). Specifically, increasing \(T\) by one unit changes the log odds (or logit) by \(\beta_1\). It is important to note that a one unit increase in \(T\) does not correspond to a \(\beta_1\) change in \(p(t)\) itself; because \(p(t)\) is a non-linear function, the change in probability depends on the current value of \(T\).
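This relationship is easy to verify numerically. The sketch below uses made-up coefficients (\(\beta_0=-29\), \(\beta_1=0.3\), the same illustrative values used later in this article) and shows that a one-degree increase in temperature shifts the log odds by exactly \(\beta_1\), even though the probability itself moves by a temperature-dependent amount:

```python
import math

def logistic(t, b0, b1):
    """Probability of flu given temperature t under the logistic model."""
    z = b0 + b1 * t
    return math.exp(z) / (1 + math.exp(z))

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1 - p))

# Illustrative (made-up) coefficients
b0, b1 = -29.0, 0.3

p1 = logistic(100, b0, b1)
p2 = logistic(101, b0, b1)

# The log odds change by exactly b1 = 0.3 per unit of temperature,
# while the probability change (p2 - p1) depends on where we started.
print(logit(p2) - logit(p1))
print(p2 - p1)
```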

Estimating Regression Coefficients

How do we estimate \(\beta_0\) and \(\beta_1\)? First, we need a dataset; then a method known as maximum likelihood is typically used. In our case, a dataset would be a collection of different patients' temperatures and their respective medical conditions (flu or cold). Assume we have observations \(\{(t_1, y_1), (t_2, y_2), \ldots , (t_n, y_n)\}\) where \(t_i\) is patient \(i\)'s temperature and \(y_i \in \{1,0\}\) for flu and cold respectively. Then the likelihood function we want to maximize is, $$l(\beta_0,\beta_1)=\prod_{i:y_i=1} p(t_i) \prod_{i':y_{i'}=0}(1-p(t_{i'}))$$ The Newton-Raphson algorithm is one option for solving this optimization problem.
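A minimal Newton-Raphson sketch is below, assuming synthetic (made-up) temperature data rather than a real clinical dataset. Each iteration uses the gradient and Hessian of the log likelihood, which for logistic regression have a simple closed form; the temperatures are centered so the Newton steps stay numerically well-behaved:

```python
import numpy as np

def fit_logistic(t, y, iters=15):
    """Estimate (beta0, beta1) by Newton-Raphson on the log likelihood.
    t: temperatures (centered for stability), y: labels (1 = flu, 0 = cold)."""
    X = np.column_stack([np.ones_like(t), t])   # design matrix with intercept
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))         # fitted probabilities p(t_i)
        W = p * (1 - p)                         # logistic variance weights
        grad = X.T @ (y - p)                    # gradient of the log likelihood
        H = X.T @ (X * W[:, None])              # negative Hessian (information)
        beta += np.linalg.solve(H, grad)        # Newton step
    return beta

# Synthetic data (hypothetical): flu grows more likely as temperature rises
rng = np.random.default_rng(0)
temps = rng.uniform(97, 104, 200)
tc = temps - 100                                # center around 100 degrees
true_p = 1 / (1 + np.exp(-0.8 * tc))
y = rng.binomial(1, true_p)

b0_hat, b1_hat = fit_logistic(tc, y)            # b1_hat should land near 0.8
```

In practice one would use a fitted library implementation (e.g. statsmodels or scikit-learn) rather than hand-rolled Newton steps, but the update above is the core of those solvers.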

Making Predictions

After estimating the coefficients, we can compute probabilities by plugging the estimated \(\beta\)s into the logistic function, $$\hat{p}(t)=\frac{e^{\hat{\beta}_0+\hat{\beta}_1t}}{1+e^{\hat{\beta}_0+\hat{\beta}_1t}}$$ For concreteness, let the estimated parameters of our logistic regression model be \(\hat{\beta}_0=-29\) and \(\hat{\beta}_1=0.3\), and suppose a patient arrives with a temperature of \(t=110\). The predicted probability of this patient having the flu given a temperature of 110 degrees Fahrenheit is, $$\hat{p}(110)=\frac{e^{-29+0.3\cdot 110}}{1+e^{-29+0.3\cdot 110}} \approx .98$$ Based on our logistic regression model, the probability of this patient having the flu is 98%!
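The worked example above is a two-line computation; the sketch below reproduces it with the article's estimated coefficients:

```python
import math

# Estimated coefficients from the article's worked example
b0_hat, b1_hat = -29.0, 0.3

def predict_proba(t):
    """Predicted probability of flu at temperature t (Fahrenheit)."""
    z = b0_hat + b1_hat * t
    return math.exp(z) / (1 + math.exp(z))

# -29 + 0.3 * 110 = 4, so p = e^4 / (1 + e^4)
p = predict_proba(110)
print(round(p, 2))  # 0.98
```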

Extensions to Multiple Predictors and Classes

We can generalize our formulation to allow for multiple predictors. Let \(X=(X_1,\ldots,X_p)\) denote the predictors and \(G\) the response. The updated log odds and logistic function are, $$\log{\frac{p(X)}{1-p(X)}} = \beta_0 + \beta_1X_1 + \ldots + \beta_pX_p$$ $$p(X)=\frac{e^{\beta_0+\beta_1X_1+\ldots + \beta_pX_p}}{1+e^{\beta_0+\beta_1X_1+\ldots +\beta_pX_p}}$$ We can also extend logistic regression to allow for more than two classes. In this setting, each class is specified by its own log odds or logit transformation relative to a reference class \(K\), \begin{align} \log{\frac{Pr(G=1\mid X=x)}{Pr(G=K\mid X=x)}} & = \beta_{10}+\beta_1^Tx\\ \log{\frac{Pr(G=2\mid X=x)}{Pr(G=K\mid X=x)}} & = \beta_{20}+\beta_2^Tx\\ & \vdots\\ \log{\frac{Pr(G=K-1\mid X=x)}{Pr(G=K\mid X=x)}} & = \beta_{(K-1)0}+\beta_{K-1}^Tx\\ \end{align}
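Given the \(K-1\) fitted log odds, the class probabilities can be recovered by exponentiating and normalizing, since the probabilities must sum to one. The sketch below does this for a hypothetical 3-class example with made-up coefficients:

```python
import numpy as np

def class_probabilities(x, betas):
    """Recover Pr(G=k | X=x) for all K classes from the K-1 log odds
    taken relative to reference class K.
    betas: list of K-1 (intercept, coefficient-vector) pairs."""
    logits = np.array([b0 + b @ x for b0, b in betas])  # log odds vs class K
    odds = np.exp(logits)
    denom = 1 + odds.sum()                              # normalizing constant
    # Classes 1..K-1 get odds_k / denom; reference class K gets 1 / denom
    return np.append(odds / denom, 1 / denom)

# Hypothetical 3-class example (K = 3) with a single predictor
x = np.array([101.0])
betas = [(-50.0, np.array([0.5])),    # class 1 vs class 3 (made-up values)
         (-40.0, np.array([0.39]))]   # class 2 vs class 3 (made-up values)

p = class_probabilities(x, betas)     # one probability per class, summing to 1
```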