# Classification Evaluation Metrics Part-1

Priyabrata Panda Oct 06 2020 · 3 min read

### Introduction

Most machine learning problems fall into two types: supervised and unsupervised learning. Within supervised learning we mostly encounter two kinds of problems: regression and classification. In regression the dependent variable is continuous, while in classification the dependent variable is categorical (a class). Classification itself comes in several flavours, such as binary classification, multiclass classification and multilabel classification. When we train a classifier on a particular dataset, we need an evaluation metric to judge how well our model performs on that dataset. Since there are many evaluation metrics, it can be difficult to choose the right one. In this series of blogs, I am going to cover each evaluation metric in detail.

#### What  are we going to learn?

• Confusion Matrix
• Accuracy Score
• Precision Score
• Recall Score

### Confusion Matrix

The confusion matrix is the most popular, simple yet effective evaluation tool for classification problems. To understand the confusion matrix, let's consider a simple binary classification problem.

```python
y_true = [1,0,1,0,0,0,1,1,0,0,1,1,0,1,0,1,1,0,1,1]  # our actual values
y_pred = [1,0,0,1,0,0,1,1,0,0,1,1,0,0,1,1,1,1,0,0]  # predicted values
```

To get the confusion matrix we will use the confusion_matrix function of the sklearn.metrics module.

```python
from sklearn.metrics import confusion_matrix

print(confusion_matrix(y_true, y_pred))
```

Let's understand the matrix bit by bit

The columns of the confusion matrix represent the predicted values and the rows represent the actual values. The top-left cell holds the True Negatives (TN): the instances of the negative class that our model correctly classified as negative. The bottom-right cell holds the True Positives (TP): the instances correctly classified as positive.

The top-right cell holds the False Positives (FP): instances that belong to the negative class but that our model predicted as positive. This is also known as a Type-I error.

The bottom-left cell holds the False Negatives (FN): instances that are positive but that our model predicted as negative. This is also known as a Type-II error.
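Putting the four cells together: for binary labels, sklearn's `ravel()` flattens the 2×2 matrix in the order TN, FP, FN, TP. A small sketch using the example arrays above:

```python
from sklearn.metrics import confusion_matrix

y_true = [1,0,1,0,0,0,1,1,0,0,1,1,0,1,0,1,1,0,1,1]  # actual values
y_pred = [1,0,0,1,0,0,1,1,0,0,1,1,0,0,1,1,1,1,0,0]  # predicted values

# ravel() flattens the 2x2 matrix row by row: TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 6 3 4 7
```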

Let's see the confusion matrix for multiclass classification

```python
y_true = ["cat","dog","cow","cat","cow","dog","cow","cat","dog","cat","cow","dog","cat","dog","cat","cow"]
y_pred = ["cat","cow","cow","dog","cow","cat","cow","dog","dog","cow","dog","dog","cow","dog","cow","cow"]
print(confusion_matrix(y_true, y_pred, labels=["cat","cow","dog"]))
```

In more detail: each row corresponds to an actual class, each column to a predicted class (in the order given by labels), and the diagonal cells count the correctly classified instances of each class.
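To make the rows and columns easier to read, here is a small sketch that prints each row of the multiclass matrix next to its actual-class label:

```python
from sklearn.metrics import confusion_matrix

y_true = ["cat","dog","cow","cat","cow","dog","cow","cat","dog","cat","cow","dog","cat","dog","cat","cow"]
y_pred = ["cat","cow","cow","dog","cow","cat","cow","dog","dog","cow","dog","dog","cow","dog","cow","cow"]

labels = ["cat","cow","dog"]
cm = confusion_matrix(y_true, y_pred, labels=labels)

# Each row is an actual class; each column a predicted class (cat, cow, dog)
for label, row in zip(labels, cm):
    print(label, row)
```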

### Accuracy Score

It is the ratio of the number of correct predictions to the total number of predictions: Accuracy = (TP + TN) / (TP + TN + FP + FN).

```python
from sklearn.metrics import accuracy_score

y_true = [1,0,1,0,0,0,1,1,0,0,1,1,0,1,0,1,1,0,1,1]  # our actual values
y_pred = [1,0,0,1,0,0,1,1,0,0,1,1,0,0,1,1,1,1,0,0]  # predicted values
print(accuracy_score(y_true, y_pred))  # 0.65
```

Alert!! Sometimes the accuracy score can be deceptive (in the case of an imbalanced dataset). To know more about it, check the blog How to handle Imbalanced Dataset.
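As a quick illustration of that warning, consider a hypothetical dataset with 95 negatives and 5 positives: a model that always predicts the majority class still scores 95% accuracy while catching zero positives.

```python
from sklearn.metrics import accuracy_score

y_true = [0] * 95 + [1] * 5   # hypothetical imbalanced dataset
y_pred = [0] * 100            # a "model" that always predicts the majority class

print(accuracy_score(y_true, y_pred))  # 0.95, yet every positive is missed
```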

### Precision Score

It is the accuracy of the positive predictions, i.e. the ability of the classifier not to classify a negative sample as positive: Precision = TP / (TP + FP).

```python
from sklearn.metrics import precision_score

y_true = [1,0,1,0,0,0,1,1,0,0,1,1,0,1,0,1,1,0,1,1]  # our actual values
y_pred = [1,0,0,1,0,0,1,1,0,0,1,1,0,0,1,1,1,1,0,0]  # predicted values
print(precision_score(y_true, y_pred))  # 0.7
```
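The 0.7 above can be verified directly from the confusion-matrix cells of the same example (a sketch; precision = TP / (TP + FP)):

```python
from sklearn.metrics import confusion_matrix, precision_score

y_true = [1,0,1,0,0,0,1,1,0,0,1,1,0,1,0,1,1,0,1,1]
y_pred = [1,0,0,1,0,0,1,1,0,0,1,1,0,0,1,1,1,1,0,0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp / (tp + fp))                   # 7 / (7 + 3) = 0.7
print(precision_score(y_true, y_pred))  # same value from sklearn
```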

Use Cases: Precision is important in movie classification, when deciding whether children can watch a movie or not, because you don't want to recommend an adult movie to a child. So you want to minimise false positives as much as possible.

So far we have discussed only the precision score for binary classification. For multiclass classification things become a little tricky. There are three types of precision score: 1. Macro, 2. Micro, 3. Weighted.

Macro precision score: first it calculates the precision of each class, then it takes the unweighted (arithmetic) mean of these precision scores.

```python
from sklearn.metrics import precision_score

y_true = ["cat","dog","cow","cat","cow","dog","cow","cat","dog","cat","cow","dog","cat","dog"]
y_pred = ["cat","cow","dog","dog","cow","cat","cow","dog","dog","cow","dog","dog","cow","dog"]
print(precision_score(y_true, y_pred, average="macro"))  # 0.4428571428571428
```
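The same number can be reproduced by asking sklearn for the per-class precisions (`average=None`) and taking their plain mean (a sketch on the same arrays):

```python
from sklearn.metrics import precision_score

y_true = ["cat","dog","cow","cat","cow","dog","cow","cat","dog","cat","cow","dog","cat","dog"]
y_pred = ["cat","cow","dog","dog","cow","cat","cow","dog","dog","cow","dog","dog","cow","dog"]

labels = ["cat","cow","dog"]
# Precision of each class separately: 1/2 for cat, 2/5 for cow, 3/7 for dog
per_class = precision_score(y_true, y_pred, average=None, labels=labels)
print(per_class)         # per-class precisions for cat, cow, dog
print(per_class.mean())  # unweighted mean = the macro score
```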

Weighted precision score: first it calculates the precision of each class, then it takes the weighted mean of these precision scores, where each class is weighted by its number of instances (sum(score × no. of instances) / total no. of instances).

```python
from sklearn.metrics import precision_score

y_true = ["cat","dog","cow","cat","cow","dog","cow","cat","dog","cat","cow","dog","cat","dog"]
y_pred = ["cat","cow","dog","dog","cow","cat","cow","dog","dog","cow","dog","dog","cow","dog"]
print(precision_score(y_true, y_pred, average="weighted"))  # 0.44591836734693874
```
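The weighted score can likewise be reproduced by hand: weight each per-class precision by its class support (its count in `y_true`) and divide by the total number of instances (a sketch):

```python
import numpy as np
from sklearn.metrics import precision_score

y_true = ["cat","dog","cow","cat","cow","dog","cow","cat","dog","cat","cow","dog","cat","dog"]
y_pred = ["cat","cow","dog","dog","cow","cat","cow","dog","dog","cow","dog","dog","cow","dog"]

labels = ["cat","cow","dog"]
per_class = precision_score(y_true, y_pred, average=None, labels=labels)
support = np.array([y_true.count(label) for label in labels])  # 5, 4, 5

weighted = (per_class * support).sum() / support.sum()
print(weighted)  # matches precision_score(..., average="weighted")
```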

### Recall Score

It is the ability of a classifier to classify positive instances as positive, mathematically represented as Recall = TP / (TP + FN).

Use Cases: Recall is used in disease detection, because you don't want to predict that a person does not have a disease when they actually do. Basically, you want to reduce the number of false negatives.
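On the binary example used throughout this post (TP = 7, FN = 4), recall works out to TP / (TP + FN) = 7 / 11 (a quick sketch):

```python
from sklearn.metrics import recall_score

y_true = [1,0,1,0,0,0,1,1,0,0,1,1,0,1,0,1,1,0,1,1]
y_pred = [1,0,0,1,0,0,1,1,0,0,1,1,0,0,1,1,1,1,0,0]

print(recall_score(y_true, y_pred))  # 7 / (7 + 4), about 0.636
```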

Just as with precision, there are three types of recall score for multiclass classification: macro, micro and weighted. The micro average counts the total TP, FN and FP over all classes before computing the score.

```python
from sklearn.metrics import recall_score

y_true = ["cat","dog","cow","cat","cow","dog","cow","cat","dog","cat","cow","dog","cat","dog"]
y_pred = ["cat","cow","dog","dog","cow","cat","cow","dog","dog","cow","dog","dog","cow","dog"]
print("The weighted recall score is ", recall_score(y_true, y_pred, average="weighted"))
print("The macro recall score is ", recall_score(y_true, y_pred, average="macro"))
print("The micro recall score is ", recall_score(y_true, y_pred, average="micro"))
```