A confusion matrix is a table used to describe the performance of a classification model on a set of test data for which the true values are known.
Let’s walk through an example confusion matrix for a binary classifier:
What to know from this matrix?
- There are two possible predicted classes: “yes” or “no”. If we were predicting the presence of a coronavirus, for example, “yes” would mean the patient has the coronavirus, and “no” would mean they don’t.
- The classifier made a total of 290 predictions (e.g., 290 patients were tested for the coronavirus).
- Out of those 290 cases, the classifier predicted “yes” 220 times, and “no” 70 times.
- In reality, 210 patients in the sample have the coronavirus, and 80 patients do not. (A short code sketch showing how such a matrix is produced follows this list.)
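In practice, a confusion matrix is usually produced directly from lists of true and predicted labels. Below is a minimal sketch assuming scikit-learn is available; the `y_true` and `y_pred` arrays are made-up stand-ins for illustration, not the 290 patients from this example:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels for a handful of patients
# (1 = "yes", has the coronavirus; 0 = "no", does not).
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Rows are actual classes, columns are predicted classes.
# With labels=[0, 1] the layout is [[TN, FP], [FN, TP]].
print(confusion_matrix(y_true, y_pred, labels=[0, 1]))
```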
Let’s now define the most basic terms:
- true positives (TP): These are cases in which we predicted yes (they have the coronavirus), and they do have the coronavirus.
- true negatives (TN): We predicted no, and they don’t have the coronavirus.
- false positives (FP): We predicted yes, but they don’t actually have the coronavirus. (Also known as a “Type I error.”)
- false negatives (FN): We predicted no, but they actually do have the coronavirus. (Also known as a “Type II error.”)
Below, these terms have been added to the confusion matrix, along with the row and column totals:
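As a rough sketch, the same table with totals can be built in code. The following assumes pandas is available and uses the counts that the rate calculations below rely on (TP = 200, FP = 20, FN = 10, TN = 60):

```python
import pandas as pd

# Counts from the example: TP = 200, FP = 20, FN = 10, TN = 60.
matrix = pd.DataFrame(
    {"predicted: no": [60, 10], "predicted: yes": [20, 200]},
    index=["actual: no", "actual: yes"],
)

# Add the row and column totals, as in the table above.
matrix["total"] = matrix.sum(axis=1)
matrix.loc["total"] = matrix.sum(axis=0)
print(matrix)
```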
Below is a list of rates that are commonly computed from a confusion matrix for a binary classifier (a short computation sketch in Python follows the list):
- Accuracy: Overall, how often is the classifier correct?
- (TP+TN)/total = (200+60)/290 = 0.90
- Misclassification Rate: Overall, how often is it wrong?
- (FP+FN)/total = (20+10)/290 = 0.10
- also known as “Error Rate”
- True Positive Rate: When the actual value is yes, how often does it predict yes?
- TP/actual yes = 200/210 = 0.95
- also known as “Sensitivity” or “Recall”
- False Positive Rate: When it is no, how often does it predict yes?
- FP/actual no = 20/80 = 0.25
- True Negative Rate: When it is no, how often does it predict no?
- TN/actual no = 60/80 = 0.75
- equivalent to 1 minus False Positive Rate
- also known as “Specificity”
- Precision: When it predicts yes, how often is it actually correct?
- TP/predicted yes = 200/220 = 0.91
- Prevalence: How often does the yes condition occur in our given sample?
- actual yes/total = 210/290 = 0.72
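Here is the computation sketch mentioned above. All of these rates follow from the four counts in plain Python; the variable names are just illustrative:

```python
TP, FP, FN, TN = 200, 20, 10, 60
total = TP + FP + FN + TN            # 290

accuracy       = (TP + TN) / total   # ≈ 0.90
error_rate     = (FP + FN) / total   # misclassification rate ≈ 0.10
true_pos_rate  = TP / (TP + FN)      # sensitivity / recall ≈ 0.95
false_pos_rate = FP / (FP + TN)      # ≈ 0.25
true_neg_rate  = TN / (FP + TN)      # specificity ≈ 0.75
precision      = TP / (TP + FP)      # ≈ 0.91
prevalence     = (TP + FN) / total   # ≈ 0.72
```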
A couple of other terms are also worth mentioning:
- Null Error Rate: This is how often you would be wrong if you always predicted the majority class. (In our example, the null error rate would be 80/290 = 0.28, because if you always predicted yes, you would be wrong only for the 80 “no” cases.) This can be a useful baseline metric to compare your classifier against. However, the classifier that is best for a particular application will sometimes have a higher error rate than the null error rate.
- Cohen’s Kappa: This measures how well the classifier performed compared to how well it would have performed simply by chance. In other words, a model will have a high Kappa score when there is a big difference between its accuracy and the null error rate.
- F Score: This is the harmonic mean of precision and the true positive rate (recall).
- ROC Curve: This is a commonly used graph that summarizes the performance of a classifier over all possible classification thresholds. It is generated by plotting the True Positive Rate (y-axis) against the False Positive Rate (x-axis) as the threshold is varied. A short scikit-learn sketch for Cohen’s Kappa and the F score follows below.
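Here is the sketch mentioned above, assuming scikit-learn and NumPy are available. It rebuilds per-patient label arrays from the example’s counts and computes Cohen’s Kappa and the F1 score:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, f1_score

# Rebuild per-patient labels from the example's counts:
# 210 actual "yes" (200 predicted yes, 10 predicted no) and
# 80 actual "no" (20 predicted yes, 60 predicted no).
y_true = np.array([1] * 210 + [0] * 80)
y_pred = np.array([1] * 200 + [0] * 10 + [1] * 20 + [0] * 60)

print(cohen_kappa_score(y_true, y_pred))   # agreement beyond chance
print(f1_score(y_true, y_pred))            # harmonic mean of precision and recall
```

Note that plotting a ROC curve requires the classifier’s predicted probabilities or scores rather than hard yes/no labels, since the curve is traced by sweeping the decision threshold.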