Naïve Bayes: It is a classification technique based on Bayes' theorem, with the assumption that features are independent of one another (the Gaussian variant further assumes each feature follows a normal distribution).
A Naïve Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. Even if the features depend on each other, each one is treated as contributing independently to the probability, e.g. when deciding whether a fruit is an apple, an orange or a banana.
That's why it's known as "naïve".
The Naïve Bayes model is easy to build and is particularly useful for very large datasets.
According to Bayes' theorem
It states that the relationship between the probability of the hypothesis before getting the evidence, P(H), and the probability after getting the evidence, P(H|E), is:

P(H|E) = P(E|H) × P(H) / P(E)

where H is the 'Hypothesis' and E is the 'Evidence'.
e.g. suppose you have a deck of 52 cards and a single card is drawn from it.
Probability that the card is a Queen = 4/52 = 1/13
Here the hypothesis is 'this card is a Queen', and its prior probability is 1/13.
The evidence provided is that the single card drawn is a face card.
So,
P(Queen|Face) = P(Face|Queen) × P(Queen) / P(Face)
Every Queen is a face card, so P(Face|Queen) = 1, and with 12 face cards in the deck P(Face) = 12/52 = 3/13, giving P(Queen|Face) = (1 × 1/13) / (3/13) = 1/3.
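The same calculation can be written as a few lines of Python (a minimal sketch; the variable names are our own):

```python
# Bayes' rule on the card example.
# Hypothesis H: the drawn card is a Queen; Evidence E: the card is a face card.

p_queen = 4 / 52          # P(H): 4 Queens in a 52-card deck
p_face = 12 / 52          # P(E): 12 face cards (Jack, Queen, King of each suit)
p_face_given_queen = 1.0  # P(E|H): every Queen is a face card

# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)
p_queen_given_face = p_face_given_queen * p_queen / p_face
print(p_queen_given_face)  # 0.333... = 1/3
```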
Let's assume a data scientist working at a major bank in NYC wants to classify a new client as eligible to retire or not.
The customer's features are his/her age and salary.
Prior Probability:
- Points can be classified as RED or BLUE, and our task is to classify a new point as RED or BLUE.
- Prior Probability: Since we have more BLUE points than RED, we can assume that our new point is twice as likely to be BLUE as RED.
Likelihood:
For the new point, if there are more BLUE points in its vicinity, it is more likely that the new point will be classified as BLUE.
So we draw a circle around the point and count how many points inside the circle belong to each class label.
Posterior Probability:
Let's combine the prior probability and the likelihood to create a posterior probability.
Prior: suggests that X may be classified as BLUE, because there are twice as many BLUE points.
Likelihood: suggests that X is RED, because there are more RED points in the vicinity of X.
Bayes’ rule combines both to form a posterior probability.
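To make the idea concrete, here is a small Python sketch of this circle-counting procedure; the points, radius and coordinates below are made-up illustration values, not data from the article:

```python
import math

# Made-up BLUE and RED points; BLUE is twice as common as RED (the prior).
blue = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.2), (3.0, 3.0), (0.5, 1.8), (2.5, 2.2)]
red = [(4.0, 4.2), (4.5, 3.8), (5.0, 4.5)]
new_point, radius = (4.4, 4.0), 1.0

def in_circle(p, center, r):
    return math.dist(p, center) <= r

n_blue, n_red = len(blue), len(red)
total = n_blue + n_red

# Prior: overall class frequencies.
prior = {"BLUE": n_blue / total, "RED": n_red / total}

# Likelihood: fraction of each class that falls inside the circle.
like = {
    "BLUE": sum(in_circle(p, new_point, radius) for p in blue) / n_blue,
    "RED": sum(in_circle(p, new_point, radius) for p in red) / n_red,
}

# Posterior is proportional to prior * likelihood, then normalised.
score = {c: prior[c] * like[c] for c in prior}
norm = sum(score.values())
posterior = {c: s / norm for c, s in score.items()}
print(posterior)  # RED wins despite the BLUE-heavy prior
```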
| Day | Outlook | Humidity | Wind | Play |
|-----|----------|----------|--------|------|
| 1 | Sunny | High | Weak | No |
| 2 | Sunny | High | Strong | No |
| 3 | Overcast | High | Weak | Yes |
| 4 | Rain | High | Weak | Yes |
| 5 | Rain | Normal | Weak | Yes |
| 6 | Rain | Normal | Strong | No |
| 7 | Overcast | Normal | Strong | Yes |
| 8 | Sunny | High | Weak | No |
| 9 | Sunny | Normal | Weak | Yes |
| 10 | Rain | Normal | Weak | Yes |
| 11 | Sunny | Normal | Strong | Yes |
| 12 | Overcast | High | Strong | Yes |
| 13 | Overcast | Normal | Weak | Yes |
| 14 | Rain | High | Strong | No |
Here we have weather data for a particular place, and we want to know whether there is any chance of playing the game on a given day.
Frequency tables

For Outlook:

| Outlook | Yes | No | Likelihood P(X) |
|----------|-----|----|-----------------|
| Sunny | 2 | 3 | 5/14 |
| Overcast | 4 | 0 | 4/14 |
| Rain | 3 | 2 | 5/14 |

For Humidity:

| Humidity | Yes | No | Likelihood P(X) |
|----------|-----|----|-----------------|
| Normal | 6 | 1 | 7/14 |
| High | 3 | 4 | 7/14 |

For Wind:

| Wind | Yes | No | Likelihood P(X) |
|--------|-----|----|-----------------|
| Strong | 3 | 3 | 6/14 |
| Weak | 6 | 2 | 8/14 |
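To verify these counts programmatically, here is a short pandas sketch (assuming pandas is available) that rebuilds the frequency tables from the dataset above:

```python
import pandas as pd

# The 14-day weather dataset from the table above.
data = pd.DataFrame({
    "Outlook":  ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
                 "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"],
    "Humidity": ["High", "High", "High", "High", "Normal", "Normal", "Normal",
                 "High", "Normal", "Normal", "Normal", "High", "Normal", "High"],
    "Wind":     ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
                 "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"],
    "Play":     ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                 "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"],
})

# One frequency table per feature: Yes/No counts for each value,
# matching the tables above (e.g. Overcast is 4 Yes / 0 No).
for col in ["Outlook", "Humidity", "Wind"]:
    print(pd.crosstab(data[col], data["Play"]), "\n")
```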
From the Outlook table, for Outlook = Sunny:
P(X|C) = P(Sunny|Yes) = 2/9 = 0.22
P(X) = P(Sunny) = 5/14 = 0.36
P(C) = P(Yes) = 9/14 = 0.64
Posterior probability of 'Yes' given it is Sunny:
P(Yes|Sunny) = P(Sunny|Yes) × P(Yes) / P(Sunny) = (0.22 × 0.64) / 0.36 = 0.40
Similarly, the posterior for 'No':
P(No|Sunny) = P(Sunny|No) × P(No) / P(Sunny) = (0.60 × 0.36) / 0.36 = 0.60
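As a quick Python check of these two posteriors, reading the numbers straight from the Outlook frequency table:

```python
# Single-feature posteriors from the Outlook table.
p_sunny_yes = 2 / 9        # P(Sunny|Yes)
p_sunny_no = 3 / 5         # P(Sunny|No)
p_yes, p_no = 9 / 14, 5 / 14
p_sunny = 5 / 14           # P(Sunny)

print(p_sunny_yes * p_yes / p_sunny)  # P(Yes|Sunny) = 0.40
print(p_sunny_no * p_no / p_sunny)    # P(No|Sunny)  = 0.60
```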
Now suppose we are given a day with:
Outlook = Sunny
Humidity = High
Wind = Weak
Then, Play = ???
So, let's predict the possibility of a game on that day by using Naïve Bayes.
Likelihood of "Yes" on that day
= P(Sunny|Yes) × P(High|Yes) × P(Weak|Yes) × P(Yes)
= 2/9 × 3/9 × 6/9 × 9/14 = 0.0317
Similarly, for "No"
= P(Sunny|No) × P(High|No) × P(Weak|No) × P(No)
= 3/5 × 4/5 × 2/5 × 5/14 = 0.0686
Probability of play on that day:
P(Yes) = 0.0317 / (0.0317 + 0.0686) ≈ 32%
P(No) = 0.0686 / (0.0317 + 0.0686) ≈ 68%
Here there is a 68% chance of no game on that day, so the classifier predicts Play = No.
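The same arithmetic, as a quick Python check:

```python
# Naive Bayes scores for the day (Sunny, High, Weak), from the tables above.
yes = (2/9) * (3/9) * (6/9) * (9/14)  # P(Sunny|Yes)P(High|Yes)P(Weak|Yes)P(Yes)
no = (3/5) * (4/5) * (2/5) * (5/14)   # P(Sunny|No)P(High|No)P(Weak|No)P(No)

print(round(yes, 4), round(no, 4))         # 0.0317 0.0686
print(f"P(Yes) = {yes / (yes + no):.0%}")  # 32%
print(f"P(No)  = {no / (yes + no):.0%}")   # 68%
```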
INDUSTRIAL USES OF THE MODEL:
News classification: using the Naïve Bayes model we can classify news articles by their content into types such as sports, politics, national, international, finance, stock market, cinema, media, education, etc.
Spam mail or message filtering
Object detection
Medical diagnosis: Naïve Bayes is very useful and effective in the medical domain and gives accurate observations.
Weather prediction (as we did in our example)
Types of Naïve Bayes:
Gaussian: used for classification; it assumes that features follow a normal distribution.
Multinomial: used for discrete counts, e.g. word counts in text classification.
Bernoulli: works with binary features; instead of counting how often a word occurs in a document, it only records whether the word occurs at all.
Based on your dataset, you can use any of these.
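As a minimal scikit-learn sketch of the Gaussian variant (the age/salary numbers below are made up to echo the bank example from earlier; both features are continuous, so GaussianNB fits):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Made-up (age, salary) points for the bank example:
# label 1 = eligible to retire, 0 = not eligible.
X = np.array([[25, 40000], [30, 52000], [45, 110000],
              [58, 150000], [62, 95000], [35, 60000]])
y = np.array([0, 0, 1, 1, 1, 0])

model = GaussianNB().fit(X, y)
print(model.predict([[50, 120000]]))        # -> [1]: predicted eligible
print(model.predict_proba([[50, 120000]]))  # class probabilities

# For word-count features use MultinomialNB; for binary word
# presence/absence features use BernoulliNB (same fit/predict API).
```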
Article by: Sachin Dubey
If you are interested in Machine Learning, you can check out the Machine Learning Internship Program.
Also check out other technical and non-technical internship programs.