Skip to content

Naive Bayes Classifier in machine learning- All you need to know

  • 6 min read
Naive Bayes Classifier in machine learning

Classification machine learning algorithms use probability to predict the correct class, such as Logistic regression, SVC, decision tree classifier, Lasso regression, etc. Naive Bayes is the supervised machine learning algorithm used for classification problems. Let’s see how it works.

Principle of Naive Bayes classifier

The Naive Bayes classifier is one of the fastest and simplest algorithms in machine learning. The Naive Bayes classifier works on the Bayes theorem. The Bayes theorem is as follows:

Bayes formula

The Bayes theorem works on conditional probability. We can derive the probability happening of event A if B has already occurred. The A is called the hypothesis and the B is called the evidence. One assumption is made here that all features or predictors are independent of each other. So one feature can not affect another. Hence it is called naive.

Suppose here X is over independent features and y whether to play golf or not.

So the X will be,

By applying the chain rule, we can get,

Here the below part P(x1)P(x2)…P(xn) will be the same for every classification problem. and the conditional probability will be directly proportional to the upper part of the equation.

If we simplify the last formula then,

By Naive Bayes, we are calculating the probability of each target and at last, we have to select the class which has the highest probability among other classes. Therefore we have to max value from the formula.

simplified Naive Bayes theorem formula

Example of Naive Bayes classifier

NoOutlookTemperaturePlay Golf
0RainyHotNo
1RainyHotNo
2OvercastHotYes
3SunnyMildYes
4SunnyCoolYes
5SunnyCoolNo
6OvercastCoolYes
7RainyMildNo
8RainyCoolYes
9SunnyMildYes
10RainyMildYes
11OvercastMildYes
12OvercastHotYes
13SunnyMildNo

Here the 2 features Outlook and Temperature are given and we have to predict on the basis of these features like if we can play golf or not.

We have to make a frequency table and likelihood table for each feature.

Frequency table for weather:

WeatherYesNoP(Yes)P(No)
Rainy232/93/5
Sunny323/92/5
Overcast404/90/5
Total95

The above table is very much important to derive the conditional probability. From the table, if you are asked Probability of Yes with respect to Sunny P(Sunny/Yes) will be 2/9.

Now we have to create the table just as above for the Temperature feature also.

TemperatureYesNoP(Yes)P(No)
Hot222/92/5
Mild424/92/5
Cold313/91/5
Total95

Now last we will calculate the probability for the target column which is Play Golf.

Play GolfP(Yes)P(No)
Yes/No9/145/14

And That’s it. Now you can have all conditional probabilities that are required in Bayes theorem.

Now let’s take one example. If you have given that the weather is Sunny and the temperature is hot then it is good to play golf or not?

You have to calculate a probability for Yes and a probability for No.

P(Yes | Sunny, Hot)=P(Yes) * P(Sunny/Yes)* P(Hot/Yes)

=9/14*3/9*2/9

= 1/21

= 0.04

P(No| Sunny, Hot)=P(No) * P(Sunny/No)* P(Hot/No)

=5/14*2/5*2/5

== 0.05

Now we have to find the percentage of each case.

Percentage of P(Yes | Sunny, Hot)= 0.04/0.04+0.05*100=44%

Percentage of P(No| Sunny,Hot)= 0.05/0.04+0.05*100=55%

Now you can see. Here the probability of No is higher than Yes. So if the weather is Sunny and Hot then the answer will be No. So that’s how the naive Bayes classifier works.

The naive Bayes classifier is affected if you have an imbalanced dataset. There are methods to resolve that like upsampling and downsampling. To know more about how to deal with an imbalanced dataset click here.

Implementation of Naive Bayes Classifier in Python

You can take any dataset to fit the Naive Bayes Classifier. Here we took the titanic dataset. Naive Bayes classifier is also in the sklearn library just like other algorithms.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

df = pd.read_csv("titanic.csv")
#Preprocessing
df.drop(['PassengerId','Name','SibSp','Parch','Ticket','Cabin','Embarked'],axis=1,inplace=True)
X = df.drop('Survived',axis=1)
y = df.Survived
dummies = pd.get_dummies(X.Sex)
X = pd.concat([X,dummies],axis=1)
X.drop(['Sex','male'],axis=1,inplace=True)
X["Age"] = X.Age.fillna(X["Age"].mean())

#splitting data for training and testing
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3)

#Fitting GaussianNB because it has continous values in some features
model = GaussianNB()
model.fit(X_train,y_train)
print(model.predict(X_test))

You can also use grid search CV to optimize the model and select the model which gives better precision, recall, or f1 score.

Types Of Naive Bayes Classifier

1. Multinomial Naive Bayes

It is mainly used for multi-class classification problems as we saw in the above example of whether to play golf or not. This is also used in News classification to decide in which category the article is belonging(i.e Sports, Politics, Technology, etc).

2. Bernoulli Naive Bayes

Bernouli naive bayes is similar to the multinomial Naive Bayes. The only difference in this is that in the predictors, there are only boolean values. for example, if there was wind or not.

Image credit: coding ninjas

Here p is the probability of success and q is the probability of failure (1-p). If you want to learn about this then go here.

3. Gaussian Naive Bayes

When there is continuous value in the predictors then Gaussian Naive Bayes is used. Gaussian naive Bayes assumes that each class follows the normal distribution. If you want to learn more about it then go here.

image credit: Quora

Advantages of Naive Bayes Classifier

  • Very useful in text classification problems.
  • Fast and easy algorithm to predict the class of datasets.
  • used for both binary and multi-class classification.

Disadvantages of Naive Bayes Classifier

  • It cannot learn the relationship among predictors. Because it assumes that every feature is independent and they have no effect on other features.

Applications of Naive Bayes Classifier

  • Face recognition
  • Weather prediction
  • Medical diagnosis
  • Spam detection
  • Sentimental analysis
  • Language identification
  • News Classification

Also Read: SMOTEENN: A tool to handle imbalanced datasets in machine learning

Leave a Reply

Your email address will not be published. Required fields are marked *