Basic approach#

  • Supervised learning with a qualitative or categorical response.

  • Just as common as regression, if not more so. Some examples:


  1. Medical diagnosis: Given a patient’s symptoms, predict which of three possible conditions is responsible for them.

  2. Online banking: Determine whether a transaction is fraudulent or not, on the basis of the IP address, client’s history, etc.

  3. Web searching: Based on a user’s history, location, and search query, predict which link the user is likely to click.

  4. Online advertising: Predict whether a user will click on an ad or not.


Bayes classifier#

  • Suppose \(P(Y\mid X)\) is known. Then, given an input \(x_0\), we predict the response

\[\hat y_0 = \text{argmax}_{\;y}\; P(Y=y \mid X=x_0).\]
  • Among all classifiers, the Bayes classifier minimizes the expected 0-1 loss (the average misclassification rate over \(m\) test observations):

\[ E\left[ \frac{1}{m} \sum_{i=1}^m \mathbf{1}(\hat y_i \neq y_i) \right]\]
  • This minimum expected 0-1 loss (the best any classifier can achieve) is called the Bayes error rate; see the sketch below.
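
To make this concrete, here is a minimal Python sketch of the Bayes classifier and the Bayes error rate in a toy setting where \(X\) takes three discrete values and \(P(Y\mid X)\) is known exactly. All of the probabilities below are made-up values for illustration only.

```python
# Toy setting (made-up numbers): X takes three values, Y is binary,
# and the true conditional probabilities P(Y = 1 | X = x) are known.
p_y1_given_x = {0: 0.1, 1: 0.4, 2: 0.8}   # assumed values, for illustration only
p_x = {0: 0.5, 1: 0.3, 2: 0.2}            # marginal distribution of X

def bayes_classifier(x):
    """Predict the class with the larger conditional probability."""
    return 1 if p_y1_given_x[x] >= 0.5 else 0

# Bayes error rate: the expected 0-1 loss of the Bayes classifier,
# i.e. sum over x of P(X = x) * min(P(Y = 1 | X = x), P(Y = 0 | X = x)).
bayes_error = sum(p_x[x] * min(p, 1 - p) for x, p in p_y1_given_x.items())

for x in sorted(p_x):
    print(f"x = {x}: predict y = {bayes_classifier(x)}")
print(f"Bayes error rate: {bayes_error:.3f}")   # 0.5*0.1 + 0.3*0.4 + 0.2*0.2 = 0.21
```

In this toy setting, no classifier that only sees \(X\) can achieve an expected error rate below 0.21.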


Basic strategy: estimate \(P(Y\mid X)\)#

  • If we have a good estimate for the conditional probability \(\hat P(Y\mid X)\), we can use the classifier:

\[\hat y_0 = \text{argmax}_{\;y}\; \hat P(Y=y \mid X=x_0).\]
  • Suppose \(Y\) is a binary variable. Could we use a linear model?

\[P(Y=1 \mid X) = \beta_0 + \beta_1X_1 + \dots + \beta_pX_p \]
  • Problems:

    • This would allow fitted probabilities \(<0\) and \(>1\) (illustrated in the sketch after this list).

    • Difficult to extend to more than 2 categories.
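
To see the first problem concretely, the following sketch fits a least-squares line to simulated 0/1 data; the data-generating model and all numbers are assumptions chosen for illustration, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a binary response whose true probability of Y = 1 increases with x
# (this data-generating model is just an illustrative assumption).
n = 200
x = rng.uniform(-3, 3, size=n)
p_true = 1 / (1 + np.exp(-2 * x))   # true P(Y = 1 | X = x)
y = rng.binomial(1, p_true)         # observed 0/1 response

# Fit the linear model P(Y = 1 | X) = beta0 + beta1 * x by ordinary least squares.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Evaluate the fitted line on a grid: near the ends of the range the fitted
# "probabilities" typically escape the interval [0, 1].
grid = np.linspace(-3, 3, 7)
fitted = beta[0] + beta[1] * grid
for g, f in zip(grid, fitted):
    print(f"x = {g:+.1f}   fitted P(Y=1 | X=x) = {f:+.3f}")
```

The fitted values at the extremes of the range fall below 0 and above 1, so they cannot be interpreted as probabilities.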