Machine Learning: A simple logistic regression model in Python


What is logistic regression?

Logistic regression, also called logit regression, is the type of regression we use when the dependent variable (the thing we’re trying to predict) is dichotomous, in other words, binary. For example, a logistic regression model may estimate pass/fail, yes/no, or healthy/sick.
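To make the binary idea concrete, here is a minimal sketch of the logistic (sigmoid) function that such a model relies on. It squashes any score into a value between 0 and 1, which can be read as a probability and then cut at a threshold. The function names and the 0.5 threshold are illustrative assumptions, not part of any particular library.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps any real-valued score to a value between 0 and 1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_label(z: float, threshold: float = 0.5) -> int:
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if sigmoid(z) >= threshold else 0


# Hypothetical scores, purely for illustration.
for score in (-2.0, 0.0, 2.0):
    print(score, round(sigmoid(score), 3), predict_label(score))
```

A score of 0 sits exactly on the boundary (probability 0.5); negative scores lean towards class 0 and positive scores towards class 1.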

Two widely adopted use cases for logistic regression are predicting whether an email is spam and determining whether a tumor is malignant.

There are multiple types of logistic regression:

  1. Binary logistic regression has only two possible outcomes, for example, happy or sad.
  2. Multinomial logistic regression has three or more outcomes with no natural order, for example, whether a user prefers pizza, pasta or curry.
  3. Ordinal logistic regression also has three or more outcomes, but they’re ordered, for example, customer experience rated between 1 and 5, or likelihood to buy again: low, medium or high.
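As a rough illustration of the unordered, three-or-more-outcomes case, multinomial models commonly turn one score per class into probabilities with the softmax function. The sketch below assumes made-up scores for the pizza/pasta/curry example; the `softmax` helper and the numbers are hypothetical and only show the shape of the idea.

```python
import math


def softmax(scores):
    """Convert one score per class into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max so math.exp cannot overflow
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


classes = ["pizza", "pasta", "curry"]  # unordered outcomes
scores = [1.2, 0.3, 2.0]               # made-up scores, purely illustrative
probs = softmax(scores)

# The predicted class is the one with the highest probability.
prediction = classes[probs.index(max(probs))]
print(prediction)
```

With only two classes, this reduces to the binary case: one probability and its complement.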

An example:

The model below is a logistic regression that uses some dummy data to predict whether people are at risk of diabetes. Of course, this model couldn’t actually determine whether or not someone has diabetes; it’s just a demonstration of a very simple logistic regression implementation.