Ensemble Learning: Bagging And Boosting In Machine Learning

ENSEMBLE LEARNING:

Ensemble learning is an approach in machine learning in which we train multiple models/predictors, usually with the same learning algorithm, and combine them for prediction/forecasting.

ENSEMBLE TECHNIQUES:

Ensemble techniques are among the most powerful approaches for classification and regression in the machine learning world. Most of the time we use an ensemble technique to get a better and more stable model.

Why should we prefer ENSEMBLE TECHNIQUES, and what are the different types of ENSEMBLE TECHNIQUES?

In ENSEMBLE TECHNIQUES we aggregate the predictions from a group of predictors, which may be classifiers or regressors, and most of the time this aggregated prediction is better than the one obtained using a single predictor/model. Ensemble techniques generally use decision tree algorithms to train their group of models/predictors.

In this process we are able to reduce variance.

Then a simple question comes to mind: how is this possible?

We can understand it with a simple calculation.

Let's assume the standard deviation of each individual model is σ.

Then the variance of a single model = σ²

Let the predictions of model 1, model 2, model 3, …, model n be z1, z2, z3, …, zn.

The prediction of the ensemble is the average of all the models/predictors:

μ = (z1 + z2 + z3 + … + zn) / n            here n = number of predictors

If the predictors are independent, the variance of the final ensemble model = σ² / n            here n = number of predictors

So the ensemble has a much lower variance than the individual models, and that is why we always try to go with the ensemble approach.
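To see this in action, here is a minimal sketch in Python (assuming NumPy is available; the values σ = 2 and n = 10 are purely illustrative). It simulates n independent predictors with the same standard deviation and checks that the variance of their average is close to σ²/n.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n_models, n_trials = 2.0, 10, 100_000   # illustrative values, not from the article

# each row is one trial, each column is one model's prediction
single_preds = rng.normal(loc=0.0, scale=sigma, size=(n_trials, n_models))
ensemble_pred = single_preds.mean(axis=1)          # aggregate by averaging

print("single-model variance:", single_preds[:, 0].var())   # ~ sigma^2 = 4.0
print("ensemble variance    :", ensemble_pred.var())        # ~ sigma^2 / n = 0.4
```

In practice the bagged predictors are correlated, so the reduction is smaller than σ²/n, but the direction of the effect is the same.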

Types of ENSEMBLE TECHNIQUES:

                1:BAGGING

                2:BOOSTING

                3:STACKING

Here we are going to discuss bagging and boosting.

BAGGING:

We can use the bagging approach in two ways: one is BAGGING AS A CLASSIFIER and the other is BAGGING AS A REGRESSOR. In bagging we build a number of models; if it is a regression problem, the ensemble takes the average value of all the predictors, and if it is a classification problem, it goes for majority voting.
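As a quick sketch of both flavours (assuming scikit-learn is installed; the synthetic datasets and parameter values below are illustrative, not taken from the article), BaggingClassifier aggregates by majority voting while BaggingRegressor aggregates by averaging:

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import BaggingClassifier, BaggingRegressor

Xc, yc = make_classification(n_samples=500, random_state=42)
Xr, yr = make_regression(n_samples=500, noise=10.0, random_state=42)

# bagging as a classifier: each tree votes and the majority class wins
# (the default base estimator is a decision tree)
clf = BaggingClassifier(n_estimators=50, bootstrap=True, random_state=42)
clf.fit(Xc, yc)

# bagging as a regressor: the predictions of all trees are averaged
reg = BaggingRegressor(n_estimators=50, bootstrap=True, random_state=42)
reg.fit(Xr, yr)

print(clf.predict(Xc[:5]))
print(reg.predict(Xr[:5]))
```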

Let's understand how the bagging ensemble technique works.

Bagging is a type of ensemble technique in which a single training algorithm is used on different subsets of the training data, where the subset sampling is done with replacement (bootstrap). Once the algorithm has been trained on all the subsets, bagging makes its prediction by aggregating all the predictions made by the algorithm on the different subsets.

We can see that two processes are involved here: one is bootstrapping and the second is aggregation.

Bootstrapping:

Bootstrapping is a technique of sampling different sets of data from a given training set/original dataset with replacement. Sampling with replacement means the same observation can be drawn more than once, so each model/predictor sees a slightly different version of the training data.

Aggregation:

After bootstrapping the training dataset, the algorithm is trained on each of the different sets and the results are aggregated. This technique is known as bootstrap aggregation, or bagging.
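The two steps can also be written out by hand. The following sketch (a simplified illustration, not a library API; the sample counts and the helper variable n_models are hypothetical) draws bootstrap samples with replacement, trains a decision tree on each, and aggregates the predictions by majority vote:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)   # binary labels 0/1
rng = np.random.default_rng(0)
n_models = 25
models = []

# Bootstrapping: each predictor is trained on a sample drawn WITH replacement
for _ in range(n_models):
    idx = rng.integers(0, len(X), size=len(X))
    models.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

# Aggregation: stack every model's predictions and take the majority vote
all_preds = np.array([m.predict(X) for m in models])
final_pred = (all_preds.mean(axis=0) > 0.5).astype(int)
print("ensemble training accuracy:", (final_pred == y).mean())
```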

ADVANTAGES OF BAGGING

           1: Bagging significantly decreases the variance without increasing the bias.

           2: In regression, the average of the results gives a prediction that does not change much from sample to sample.

           3: The more predictors we aggregate, the more the variance decreases.

BOOSTING:

Boosting is an ensemble approach (meaning it involves several trees) that starts from a weak decision maker and keeps on building models/predictors such that the final prediction is the weighted sum of the weaker decision makers. The weights are assigned based on the performance of each individual tree.

LET'S UNDERSTAND HOW IT WORKS:

In boosting, each classifier is trained on a sample set and learns to predict. The ensemble parameters are calculated in a stage-wise way, which means that while calculating the weight of a subsequent tree the learning from the previous trees is taken into account as well.

The misclassification error then feeds into the next classifier in the chain, which corrects the mistakes, until the final model predicts accurate results.
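A minimal sketch of this chain, using scikit-learn's AdaBoostClassifier (the dataset and hyperparameters below are illustrative): each weak learner is a shallow tree, each new tree pays more attention to the samples the previous trees misclassified, and the final prediction is a weighted combination of all of them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# weak learners (decision stumps by default) are added one after another;
# each stage re-weights the training samples based on the previous errors
booster = AdaBoostClassifier(n_estimators=100, learning_rate=0.5, random_state=1)
booster.fit(X_train, y_train)

print("test accuracy:", booster.score(X_test, y_test))
print("per-learner weights (first 5):", booster.estimator_weights_[:5])
```

The printed weights are the coefficients of the weighted sum mentioned above: better-performing weak learners get larger weights.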

TYPES OF BOOSTING

    1: AdaBoost

    2: Gradient Boosting

    3: XGBoost
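As a quick comparison of these boosters (a sketch assuming scikit-learn is installed; XGBoost lives in the separate xgboost package and is not shown here), AdaBoost and Gradient Boosting can be tried side by side on the same data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=7)

for name, model in [
    ("AdaBoost", AdaBoostClassifier(n_estimators=100, random_state=7)),
    ("Gradient Boosting", GradientBoostingClassifier(n_estimators=100, random_state=7)),
]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name:>17s}: mean CV accuracy = {scores.mean():.3f}")
```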

SIMILARITIES AND DIFFERENCES BETWEEN BAGGING AND BOOSTING

1: Both are ensemble techniques that build a final model from several predictors.

2: Both train their predictors on data sets drawn by random sampling.

3: Both aim to make the final model/predictor more stable and accurate than a single predictor.

The basic difference between them is that bagging trains its predictors independently, in parallel, on bootstrapped samples and then averages or votes, whereas boosting trains its predictors sequentially, with each new predictor concentrating on the examples the previous ones got wrong.

APPLICATIONS OF BAGGING AND BOOSTING:

We should know where we can apply these two powerful ensemble techniques of ML in real life. You can use both techniques in all kinds of regression and classification problems.

EXAMPLES:

          1: Majority voting for agricultural land classification

          2: Person recognition

CONCLUSION:

An ensemble technique in ML is one of the best approaches for making a better decision on any kind of problem. Rather than using a single predictor, it is better to combine multiple predictors to reach the final solution to your problem. We can use R or Python to build our ensemble (bagging/boosting) model.

Article by: Ayosharya

