Bagging vs. Boosting

Bagging and boosting are two classic ensemble strategies. They differ in how they build their models and in which component of prediction error, bias or variance, they mainly target.

Bagging

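Trains many models independently, so they can be fit in parallel, each on a different bootstrap sample (drawn from the training data with replacement).
Combines them by averaging (regression) or voting (classification).
Primarily reduces variance, so it tames high-variance learners like deep trees.
Random forests are the best-known example: they also randomize the features considered at each split, which makes the trees more diverse.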
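
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the base learner is a toy one-feature regression tree, and the names fit_tree, bagging_fit, and bagging_predict, along with the defaults n_models=25 and depth=4, are hypothetical choices made for this example.

```python
import random
from statistics import mean


def fit_tree(xs, ys, depth):
    """Fit a toy 1-D regression tree; returns a function x -> prediction."""
    # Splitting at the largest x would leave the right side empty, so skip it.
    thresholds = sorted(set(xs))[:-1]
    if depth == 0 or not thresholds:
        leaf = mean(ys)
        return lambda x: leaf
    best_err, best_t = None, None
    for t in thresholds:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = mean(left), mean(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best_err is None or err < best_err:
            best_err, best_t = err, t
    left_pairs = [(x, y) for x, y in zip(xs, ys) if x <= best_t]
    right_pairs = [(x, y) for x, y in zip(xs, ys) if x > best_t]
    left_tree = fit_tree([x for x, _ in left_pairs], [y for _, y in left_pairs], depth - 1)
    right_tree = fit_tree([x for x, _ in right_pairs], [y for _, y in right_pairs], depth - 1)
    return lambda x: left_tree(x) if x <= best_t else right_tree(x)


def bagging_fit(xs, ys, n_models=25, depth=4, seed=0):
    """Train n_models trees, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_tree([xs[i] for i in idx], [ys[i] for i in idx], depth))
    return models


def bagging_predict(models, x):
    """Combine the trees by averaging their predictions."""
    return mean(m(x) for m in models)


# Usage on synthetic noisy data (assumed for illustration only).
data_rng = random.Random(1)
xs = [i / 10 for i in range(40)]
ys = [x * x + data_rng.gauss(0, 1) for x in xs]
models = bagging_fit(xs, ys)
print(bagging_predict(models, 2.0))
```

Each loop iteration in bagging_fit is independent of the others, which is what makes bagging easy to parallelize. A random forest would additionally restrict each split to a random subset of features, which this one-feature sketch cannot show.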

Boosting

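Trains models sequentially, each one focusing on what the ensemble so far still gets wrong: AdaBoost upweights misclassified examples, while gradient boosting fits each new model to the current residual errors.
Combines them as a weighted sum.
Primarily reduces bias, turning many weak learners (such as shallow trees) into a strong one.
Gradient boosting libraries such as XGBoost and LightGBM dominate many tabular-data competitions.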
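
The sequential idea can be sketched as gradient boosting for squared error, where each round fits a weak learner to the residuals of the ensemble so far. This is a minimal sketch under stated assumptions, not how any particular library implements it: the weak learner is a one-split stump on a single feature, and the names fit_stump, boost_fit, boost_predict, and training_mse, plus the defaults n_rounds=50 and learning_rate=0.1, are hypothetical.

```python
from statistics import mean


def fit_stump(xs, targets):
    """Fit a one-split regression stump by least squares; returns x -> prediction."""
    # Splitting at the largest x would leave the right side empty, so skip it.
    thresholds = sorted(set(xs))[:-1]
    if not thresholds:
        const = mean(targets)
        return lambda x: const
    best = None
    for t in thresholds:
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        lm, rm = mean(left), mean(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost_fit(xs, ys, n_rounds=50, learning_rate=0.1):
    """Sequentially fit stumps to the residuals of the current ensemble."""
    base = mean(ys)  # start from a constant prediction
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the residuals are the negative gradient of the loss.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Take a small step toward the residuals, scaled by the learning rate.
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def boost_predict(model, x):
    """Prediction is the base value plus a weighted sum of all stumps."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(s(x) for s in stumps)


def training_mse(model, xs, ys):
    return mean((y - boost_predict(model, x)) ** 2 for x, y in zip(xs, ys))


# Usage on a small synthetic curve (assumed for illustration only).
xs = [float(i) for i in range(20)]
ys = [x * x for x in xs]
for rounds in (0, 5, 50):
    model = boost_fit(xs, ys, n_rounds=rounds)
    print(rounds, training_mse(model, xs, ys))
```

Unlike bagging, round k cannot start until round k-1 has finished, because its targets are the residuals left by the earlier rounds. Training error keeps falling as rounds are added, which is also why the number of rounds and the learning rate are usually chosen against held-out data rather than training error.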

Choosing between them

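Bagging is robust and parallelizable: it is less prone to overfitting, adding more models rarely hurts, and training scales easily across cores or machines.
Boosting is often more accurate but more sensitive to noisy labels and outliers, and it needs careful tuning of the learning rate and the number of rounds.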

Key idea

Bagging trains models in parallel and averages them to cut variance, while boosting trains them sequentially to fix errors and cut bias, trading robustness for accuracy.
