ENSEMBLE LEARNING: Ensemble learning (bagging and boosting) is an approach in machine learning that trains multiple models/predictors, usually with the same learning algorithm, and combines them for prediction/forecasting.
ENSEMBLE TECHNIQUES:
The ensemble technique is one of the most powerful approaches for classification and regression in the machine learning world. Most of the time we use an ensemble technique to get a better and more stable model.
Why should we prefer ENSEMBLE TECHNIQUES, and what are the different types of ENSEMBLE TECHNIQUES?
In ensemble techniques we aggregate the predictions from a group of predictors, which may be classifiers or regressors, and most of the time the aggregated prediction is better than the one obtained using any single predictor/model. Generally, ensemble techniques use decision tree algorithms to train their group of models/predictors.
In this process we are able to reduce variance.
A simple question then comes to mind: how is this possible? A short back-of-the-envelope argument makes it easy to see.
Let's assume the standard deviation of each model is σ.
Then the variance of a single model = σ²
Let the predictions of model 1, model 2, model 3, ..., model n be z₁, z₂, z₃, ..., zₙ.
The expected value of the ensemble is the average of all the models/predictors:
μ = (z₁ + z₂ + z₃ + ... + zₙ) / n, where n = number of predictors
Variance of the final ensemble model = σ² / n, where n = number of predictors (this assumes the predictors' errors are independent)
So we get a very low variance compared to any individual model, and that is why we always try to go with the ensemble approach.
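To make this concrete, here is a minimal simulation sketch in Python (using NumPy; the article names no particular library, so this choice is ours). It draws n = 10 independent predictors whose outputs have standard deviation σ = 1 and shows the variance of their average falling to roughly σ²/n:

import numpy as np

rng = np.random.default_rng(0)
n_models, n_trials = 10, 100_000

# each column simulates one predictor whose output has std dev sigma = 1
single = rng.normal(loc=0.0, scale=1.0, size=(n_trials, n_models))
ensemble = single.mean(axis=1)  # average the n predictors, as an ensemble does

print(np.var(single[:, 0]))  # close to 1.0 -> sigma^2 for a single model
print(np.var(ensemble))      # close to 0.1 -> roughly sigma^2 / n

If the predictors were correlated rather than independent, the reduction would be smaller than the full factor of n.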
Types of ENSEMBLE TECHNIQUES:
1: BAGGING
2: BOOSTING
3: STACKING
Here we are going to discuss bagging and boosting.
BAGGING:-
We can use the bagging approach in two ways: BAGGING AS A CLASSIFIER and BAGGING AS A REGRESSOR. In bagging we build a number of models; if it is a regression task, the ensemble takes the average of all the predictors' values, and if it is a classification task, it goes with a majority vote.
Let's understand how the bagging ensemble technique works.
Bagging is the type of ensemble technique in which a single training algorithm is used on different subsets of the training data, where the subset sampling is done with replacement (bootstrap). Once the algorithm has been trained on all the subsets, bagging makes its prediction by aggregating all the predictions made by the algorithm on the different subsets.
Two processes are involved here: first bootstrapping, and second aggregation.
Bootstrapping:
Bootstrapping is a technique for sampling different sets of data from a given training set/original dataset with replacement. Replacement means the same training example can be drawn more than once into a subset, so different models/predictors end up seeing slightly different data. A toy sketch follows.
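As a rough illustration of a single bootstrap draw, here is a toy Python/NumPy sketch (the example data is our own, not from the article):

import numpy as np

rng = np.random.default_rng(42)
X = np.arange(10)  # a toy training set of 10 examples

# one bootstrap sample: the same size as X, drawn WITH replacement,
# so some examples appear more than once and roughly a third are left out
idx = rng.choice(len(X), size=len(X), replace=True)
print(X[idx])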
Aggregation:
After bootstrapping the training dataset, the algorithm is retrained on each of the different sets and the results are aggregated. This technique is known as bootstrap aggregation, or bagging.
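Putting bootstrapping and aggregation together, here is a minimal bagging sketch using scikit-learn (the library, dataset, and settings are our illustrative choices, not prescribed by the article). One hundred decision trees are each fit on their own bootstrap sample and combined by majority vote:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# bootstrap=True resamples the training set with replacement for every tree;
# predict()/score() then aggregate the 100 trees by majority vote
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                        bootstrap=True, random_state=0)
bag.fit(X_train, y_train)
print(bag.score(X_test, y_test))

For a regression problem, BaggingRegressor works the same way but averages the trees' outputs instead of voting.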
ADVANTAGES OF BAGGING:-
1: Bagging significantly decreases the variance without increasing the bias.
2: In the regression case, averaging the results does not change the expected prediction much.
3: The variance decreases substantially as more predictors are averaged.
BOOSTING:
Boosting is an ensemble approach (meaning it involves several trees) that starts from a weak learner and keeps on building models/predictors such that the final prediction is the weighted sum of the weaker decision-makers. The weight assigned to each depends on the performance of the individual tree.
LET'S UNDERSTAND HOW IT WORKS:
In boosting, each classifier is trained on a sample set and learns to predict. Ensemble parameters are calculated in a stage-wise way, which means that while calculating a subsequent weight, the learning from the previous trees is considered as well.
The misclassification error then feeds into the next classifier in the chain, which tries to correct the mistakes, and so on until the final model predicts accurate results.
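AdaBoost (the first entry in the list below) follows exactly this chain: each stage up-weights the examples the earlier stages got wrong. A minimal scikit-learn sketch, with the dataset and settings chosen by us purely for illustration:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# 200 shallow trees trained one after another; each stage re-weights the
# samples the previous stages misclassified, and the final prediction is a
# performance-weighted vote over all stages
ada = AdaBoostClassifier(n_estimators=200, random_state=0)
print(cross_val_score(ada, X, y, cv=5).mean())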
TYPES OF BOOSTING
1: ADABOOST
2: GRADIENT BOOSTING
3: XGBOOST
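All three are available as off-the-shelf libraries. As one illustration, here is a gradient boosting sketch with scikit-learn (XGBoost exposes a similar fit/predict interface); the dataset and hyperparameters are our own illustrative picks:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# each new tree is fit to the errors of the ensemble built so far;
# learning_rate shrinks every tree's contribution to the weighted sum
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                random_state=0)
gb.fit(X_train, y_train)
print(gb.score(X_test, y_test))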
SIMILARITIES AND DIFFERENCES | BAGGING AND BOOSTING:-
1: Both are ensemble techniques.
2: Both train their models on data sets generated by random sampling.
3: Ultimately, both are able to reduce variance and make the final model/predictor more stable.
APPLICATION OF BAGGING AND BOOSTING:
We should know where we can apply these two powerful ensemble techniques in real life. Both techniques can be used on all kinds of regression and classification problems.
EXAMPLE:
1: Majority voting for agricultural land
2: Person recognition
CONCLUSION:
The ensemble technique in ML is a better approach for making decisions on any kind of problem: rather than using a single predictor, it is better to use multiple predictors to reach the final solution. We can use Python to build our ensemble (bagging and boosting) models.
Article By: Ayosharya Das