
ROC & AUC Demystified

Don’t panic over these scary-looking concepts if you’re approaching them for the first time; I’m pretty sure you’ll rate these concepts much higher once you get to know them.

Gaurav Kamble
8 min read · Jul 7, 2021


NOTE:

Before moving ahead, I highly recommend you read my article on the Confusion Matrix to understand these concepts in a better way.

ROC = Receiver Operating Characteristics

AUC = Area Under the Curve

ROC-AUC, plotted using the True Positive Rate (TPR) and the False Positive Rate (FPR), is a graphical way to evaluate the performance of an ML model that solves a classification problem. Although this metric is defined only for binary classes, I will also show you how ROC-AUC can be extended to Multi-Class Classification.

I will cover the following topics in this article:

  1. What is ROC & AUC?
  2. Graphs for different values of ROC-AUC
  3. ROC-AUC for Binary Classification with a real-life use case
  4. ROC-AUC for Multi-Class Classification

What is ROC-AUC?

Receiver Operating Characteristics (ROC): as the name suggests, it tells us about the characteristics (qualities) of the operating conditions of the receiver. In Machine Learning, these characteristics refer to the performance of a classifier. The ROC curve is plotted with the False Positive Rate (FPR) on the X-axis and the True Positive Rate (TPR) on the Y-axis, with one point for each classification threshold.

TPR = Recall = Sensitivity = TP / (TP + FN)

Fig 1: TPR

Specificity = TN / (TN + FP)

Fig 2: Specificity

FPR = 1 - Specificity = FP / (FP + TN)

Fig 3: FPR
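To make these formulas concrete, here is a minimal sketch in Python; the counts below are hypothetical confusion-matrix entries, not from any real model:

```python
# Hypothetical confusion-matrix counts, just for illustration
tp, fn = 40, 10   # actual positives: correctly vs wrongly predicted
tn, fp = 35, 15   # actual negatives: correctly vs wrongly predicted

tpr = tp / (tp + fn)           # TPR = Recall = Sensitivity
specificity = tn / (tn + fp)
fpr = fp / (fp + tn)           # equals 1 - specificity

print(f"TPR = {tpr:.2f}, Specificity = {specificity:.2f}, FPR = {fpr:.2f}")
# TPR = 0.80, Specificity = 0.70, FPR = 0.30
```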

AUC is the Area Under the ROC Curve. It shows how strongly our model is able to separate the output classes. The larger this area, the better our model is at classifying the outputs correctly.

NOTE:

ROC is a Curve and AUC is the Area Under that Curve

No curve can be drawn from just a single point. Accordingly, the ROC curve is traced through multiple classification threshold points, each corresponding to a distinct pair of TPR & FPR values. AUC is the area contained between this ROC curve and the horizontal axis (X-axis). In other words, ROC-AUC does not exist for a single classification threshold; it summarizes the model across many of them.
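As a quick illustration, scikit-learn’s roc_curve sweeps the threshold for us and returns one (FPR, TPR) pair per candidate threshold, and auc integrates the resulting curve; the labels and scores below are made up for this sketch:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Hypothetical true labels and predicted probabilities of the +ve class
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7])

# One (FPR, TPR) point per candidate threshold -> the ROC curve
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(thresholds)      # several thresholds, not just one
print(auc(fpr, tpr))   # the area under that curve
```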

Graphs for different values of ROC-AUC

To relate the theory above to some visuals, the following are graphs for different values of AUC:

1. AUC = 1

Fig 4: AUC = 1
  • Getting an AUC score of 1 means the model can achieve 100% Accuracy at a suitable threshold.
  • The performance of our model simply can’t be better than this.
  • In this situation, applying a classification threshold of 0.5 (as in the figure) gives us a perfect model, one that maps all the 0s to 0s and all the 1s to 1s.

2. AUC = 0.7

Fig 5: AUC = 0.7
  • A classifier with an AUC score of 0.7 has a 70% chance of ranking a randomly chosen +ve record above a randomly chosen -ve record, so some records will inevitably be misclassified.
  • Notice in the figure that some of the values below the classification threshold of 0.5, which correspond to the -ve class, are predicted as +ve. A similar phenomenon occurs for the +ve class as well.
  • An AUC score of 0.7 is still quite a fine achievement.

3. AUC = 0.5

Fig 6: AUC = 0.5
  • An AUC score of 0.5 is about the worst score we can get for our classifier.
  • It says that the model has absolutely no clue which record belongs to which output class; it is no better than random guessing.
  • The figure on the right has some visuals for this situation.

4. AUC = 0

Fig 7: AUC = 0
  • One way to think about AUC = 0 is that our model is performing at its worst, mapping all the 0s to 1s and all the 1s to 0s; a different perspective is that our model makes perfect inverse predictions.
  • A hack: instead of fine-tuning and retraining the model, we can simply invert the predictions, assigning 0 to every record predicted as +ve and 1 to every record predicted as -ve.
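Since the ROC curve depends only on how the scores rank the records, inverting the scores turns an AUC of 0 into an AUC of 1. A minimal sketch (y_true and y_score are made-up values, not from a trained model):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# A hypothetical 'perfectly inverted' model: +ve records get the LOWEST scores
y_true  = np.array([0, 0, 0, 1, 1, 1])
y_score = np.array([0.9, 0.8, 0.7, 0.3, 0.2, 0.1])

print(roc_auc_score(y_true, y_score))      # 0.0 -> worst possible ranking
print(roc_auc_score(y_true, 1 - y_score))  # 1.0 -> inverted scores, perfect model
```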

ROC-AUC for Binary Classification with a real-life use case

Consider a real-life use case where we need to predict whether a person is affected with Cancer or not. In this case, we have 2 output classes: 0 (-ve class) and 1 (+ve class).

After training the classifier on the dataset, we get an AUC score of 0.7. The ROC-AUC graph for this classifier can be illustrated in the figure below:

Fig 8: Threshold = 0.5

Generally, the threshold for classifying the output classes is 0.5 i.e.,

If y <= 0.5, then y corresponds to class 0, and

If y > 0.5, then y corresponds to class 1

where y is the probability score output by the binary classifier.

But in some cases, we might need to modify this threshold of classification.
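In code, applying or modifying the threshold is just a comparison against the predicted probability of the +ve class; y_prob below is a hypothetical array of such probabilities:

```python
import numpy as np

y_prob = np.array([0.15, 0.45, 0.55, 0.72, 0.91])  # hypothetical P(class = 1)

default_pred = (y_prob > 0.5).astype(int)  # default rule  -> [0, 0, 1, 1, 1]
strict_pred  = (y_prob > 0.7).astype(int)  # stricter rule -> [0, 0, 0, 1, 1]
```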

Following are 2 cases where modifying the classification threshold can be very helpful.

CASE 1:

Consider a situation where we need to tell a patient that he/she is affected with Cancer only when we are confident enough, because telling someone that he/she has Cancer might disturb that person’s mental state.

In order to do this, we need to increase our classification threshold from 0.5 to 0.7 i.e.,

If y <= 0.7, then the patient is not affected with Cancer, and

If y > 0.7, then the patient is affected with Cancer

The ROC-AUC graph associated with this situation is shown in the figure below:

Fig 9: Threshold = 0.7

Increasing the classification threshold from 0.5 to 0.7 reduces our TPR, because some patients who probably are affected with Cancer (those in the 0.5–0.7 range) are now classified as not affected. In other words, we are classifying some of the +ve records as -ve, which increases the number of FN values; and according to the formula for TPR, increasing the FN count reduces the TPR.

The False Positive Rate (FPR) is not affected: all the actually -ve records are still predicted exactly as they were when the classification threshold was 0.5, because those records lie below 0.5 and our current classification threshold is 0.7.

Also, the formula to calculate the FPR does not depend upon the FN values; hence the FPR won’t be affected by any change in the count of FN values.

In this case, we compromise on the TPR in exchange for a higher classification threshold.

CASE 2:

Now consider another situation where we need to tell a patient that he/she is affected with Cancer, even if he/she has slight symptoms of cancer. In this case, we do not want to be reckless because neglecting even slight symptoms of a dreadful disease like Cancer can be very dangerous.

In order to do this, we need to lower our classification threshold from 0.5 to 0.3 i.e.,

If y <= 0.3, then the patient is not affected with Cancer, and

If y > 0.3, then the patient is affected with Cancer

The ROC-AUC graph corresponding to this situation can be visualized in the figure below:

Fig 10: Threshold = 0.3

Reducing the classification threshold from 0.5 to 0.3 increases our False Positive Rate (FPR), because some low-risk patients who might not have Cancer (those in the 0.3–0.5 range) are now classified as patients who have Cancer. In other words, we are classifying some of the -ve records as +ve, which increases the number of FP values; and according to the formula for FPR, increasing the FP count increases the FPR.

The True Positive Rate (TPR) is not affected: all the actually +ve records are still predicted exactly as they were when the classification threshold was 0.5, because those records lie above 0.5 and our current classification threshold is 0.3.

Also, the formula to calculate the TPR does not depend upon the FP values; hence the TPR won’t be affected by any change in the count of FP values.

In this case, we compromise on the FPR in exchange for a lower classification threshold. Both cases can be checked numerically, as in the sketch below.
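The following sketch uses made-up labels and probabilities (not from a real classifier), chosen so that all -ve records sit below 0.5 and all +ve records above it, making the effect of each threshold move show up in isolation:

```python
import numpy as np

# Hypothetical ground truth and predicted probabilities of having Cancer
y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
y_prob = np.array([0.05, 0.20, 0.35, 0.40, 0.45,   # actually -ve records
                   0.55, 0.60, 0.65, 0.80, 0.95])  # actually +ve records

for t in (0.3, 0.5, 0.7):
    y_pred = (y_prob > t).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    print(f"t={t}: TPR={tp/(tp+fn):.2f}, FPR={fp/(fp+tn):.2f}")

# t=0.3: TPR=1.00, FPR=0.60   <- Case 2: FPR rises, TPR untouched
# t=0.5: TPR=1.00, FPR=0.00
# t=0.7: TPR=0.40, FPR=0.00   <- Case 1: TPR drops, FPR untouched
```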

In this way, we can plot a graph of TPR vs FPR for varying classification thresholds, which produces a curve called the Receiver Operating Characteristics (ROC) curve; this curve tells us which value of TPR & FPR each threshold yields. The Area Under the Curve (AUC) tells us how capable our model is of distinguishing between the output classes.

Now, depending upon our compromise factor, i.e. whether we would rather compromise on the TPR or on the FPR, we need to select the optimum threshold value for our problem statement.
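If we have no strong preference between the two rates, one common rule of thumb (my addition here, not something this article prescribes) is Youden’s J statistic, which picks the threshold maximizing TPR - FPR. roc_curve conveniently returns the candidate thresholds alongside the rates; the labels and scores below are hypothetical:

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical labels and predicted probabilities
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
best = np.argmax(tpr - fpr)                 # Youden's J = TPR - FPR
print("optimum threshold:", thresholds[best])
```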

ROC-AUC for Multi-Class Classification

Yes, it is true that ROC-AUC is defined only for binary classes, but it can also be calculated for a multi-class classifier. This can be done with the help of the ‘One vs All’ method.

This method is similar to the method used to solve multi-class classification problems using Logistic Regression.

Suppose we need to solve a multi-class classification problem with 4 output classes: A, B, C & D

ROC-AUC will first be calculated by considering class A as one class, while the remaining classes B, C & D are combined together and treated as a single second class.

In the next iteration, class B will be considered as one class, while the remaining classes A, C & D are combined together and treated as a single second class.

The same procedure will be repeated for class C & class D.
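scikit-learn implements exactly this scheme under the name ‘one-vs-rest’. A minimal sketch on a hypothetical 4-class dataset standing in for classes A, B, C & D:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical 4-class dataset standing in for classes A, B, C & D
X, y = make_classification(n_samples=500, n_classes=4,
                           n_informative=6, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X, y)
y_prob = clf.predict_proba(X)   # one probability column per class

# 'ovr' = one vs rest: AUC of A-vs-(B,C,D), B-vs-(A,C,D), ..., averaged
print(roc_auc_score(y, y_prob, multi_class="ovr"))
```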

I hope this explanation solves the mystery of ROC-AUC for you! If this article was useful to you, please let me know. And if you have any questions or doubts, I am eager to answer them as well.

Happy Learning!

You can contact me here:

E-mail: gauravkamble9@gmail.com

LinkedIn: https://www.linkedin.com/in/gaurav-kamble-data-science-101

GitHub: https://github.com/GauravK1997
