Machine Learning

Bagging Vs Boosting

Contrast parallel variance reduction with sequential bias reduction.

Bagging and boosting are the two classic ensembling strategies. They differ in how they build models and in which error component they target.

Bagging

  • Trains many models in parallel on different bootstrap samples of the data.
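  • Combines them by averaging (regression) or voting (classification).
  • Primarily reduces variance while leaving bias roughly unchanged, so it tames high-variance learners such as deep decision trees.
  • Random forests are the best-known example: they also consider only a random subset of features at each split, which makes the trees more diverse.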
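
The variance claim above can be checked on toy data. The sketch below is a minimal illustration, not a reference implementation: it uses a 1-nearest-neighbour regressor as a stand-in for a high-variance learner (instead of deep trees), and the data, noise level, and all function names are assumptions made up for this example. It uses only the Python standard library.

```python
import math
import random

def fit_1nn(xs, ys):
    """1-nearest-neighbour regressor: a deliberately high-variance learner."""
    points = list(zip(xs, ys))
    def predict(x):
        return min(points, key=lambda p: abs(p[0] - x))[1]
    return predict

def fit_bagged(xs, ys, n_models=25, rng=None):
    """Train n_models learners on bootstrap samples and average their predictions."""
    if rng is None:
        rng = random.Random(0)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_1nn([xs[i] for i in idx], [ys[i] for i in idx]))
    def predict(x):
        return sum(m(x) for m in models) / len(models)
    return predict

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def compare_variance(n_trials=200, x0=2.52, noise=0.5, seed=0):
    """Redraw noisy training sets and measure how much each prediction at x0 moves."""
    rng = random.Random(seed)
    xs = [i / 10 for i in range(50)]
    single, bagged = [], []
    for _ in range(n_trials):
        ys = [math.sin(x) + rng.gauss(0, noise) for x in xs]
        single.append(fit_1nn(xs, ys)(x0))
        bagged.append(fit_bagged(xs, ys, rng=rng)(x0))
    return variance(single), variance(bagged)

if __name__ == "__main__":
    var_single, var_bagged = compare_variance()
    print(f"variance of single model: {var_single:.3f}")
    print(f"variance of bagged model: {var_bagged:.3f}")
```

Each bootstrap model in the loop is independent of the others, which is why this step can run in parallel in practice. The sketch runs it serially to stay short.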

Boosting

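  • Trains models sequentially, each focusing on what the previous models got wrong: AdaBoost reweights misclassified examples, while gradient boosting fits the remaining residual errors.
  • Combines them as a weighted sum.
  • Primarily reduces bias, turning weak learners such as shallow trees into a strong one.
  • Gradient boosting libraries such as XGBoost and LightGBM dominate many tabular competitions.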
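
The sequential reweighting described above can be sketched with AdaBoost, one specific boosting algorithm, using decision stumps as the weak learner. This is a minimal standard-library illustration under stated assumptions: the ten-point dataset and the function names are made up for the example, and no single stump can classify this data perfectly.

```python
import math

def stump_predict(x, threshold, polarity):
    """Decision stump: predict `polarity` left of the threshold, `-polarity` right."""
    return polarity if x < threshold else -polarity

def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    values = sorted(set(xs))
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in thresholds:
        for s in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, t, s) != y)
            if best is None or err < best[0]:
                best = (err, t, s)
    return best

def fit_adaboost(xs, ys, n_rounds):
    """Fit stumps one after another, upweighting examples the last stump missed."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, t, s = best_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # this stump's vote in the sum
        ensemble.append((alpha, t, s))
        # Misclassified points grow in weight, correct ones shrink.
        weights = [w * math.exp(-alpha * y * stump_predict(x, t, s))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted sum of stump votes, thresholded at zero."""
    score = sum(alpha * stump_predict(x, t, s) for alpha, t, s in ensemble)
    return 1 if score >= 0 else -1

def count_errors(ensemble, xs, ys):
    return sum(predict(ensemble, x) != y for x, y in zip(xs, ys))

# Toy data (hypothetical): labels flip three times along the x axis.
XS = list(range(10))
YS = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

if __name__ == "__main__":
    for rounds in (1, 2, 3):
        model = fit_adaboost(XS, YS, rounds)
        print(rounds, "round(s):", count_errors(model, XS, YS), "training errors")
```

Unlike bagging, round k cannot start until round k-1 has finished, because the example weights depend on the previous stump's mistakes.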

Choosing between them

  • Bagging is robust and parallelizable: it is less prone to overfitting and easy to scale across cores or machines.
  • Boosting is often more accurate but more sensitive to noisy data, and it needs careful tuning of the learning rate and the number of boosting rounds.
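
To see why the learning rate and the number of rounds have to be tuned together, here is a minimal gradient-boosting sketch for squared loss with regression stumps. It is an illustration only: the noise-free sine data, the parameter values, and the function names are assumptions for this example, and it tracks training error only, so it says nothing about which setting generalizes better.

```python
import math

def fit_stump(xs, residuals):
    """Least-squares regression stump: one split, mean residual on each side."""
    values = sorted(set(xs))
    best = None
    for a, b in zip(values, values[1:]):
        t = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x < t]
        right = [r for x, r in zip(xs, residuals) if x >= t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]

def boost(xs, ys, learning_rate, n_rounds):
    """Gradient boosting for squared loss; returns training MSE after each round."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    history = []
    for _ in range(n_rounds):
        # For squared loss the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, left_mean, right_mean = fit_stump(xs, residuals)
        # Shrink each stump's contribution by the learning rate.
        preds = [p + learning_rate * (left_mean if x < t else right_mean)
                 for x, p in zip(xs, preds)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return history

# Toy data (hypothetical): one period of a sine wave on a regular grid.
XS = [2 * math.pi * i / 40 for i in range(40)]
YS = [math.sin(x) for x in XS]

if __name__ == "__main__":
    for lr, rounds in ((1.0, 5), (0.1, 5), (0.1, 200)):
        mse = boost(XS, YS, lr, rounds)[-1]
        print(f"learning_rate={lr}, rounds={rounds}: training MSE {mse:.4f}")
```

A small learning rate takes smaller steps, so it needs more rounds to reach the same training error. In practice the pair is usually chosen on held-out data, since on noisy data many rounds at a high learning rate can start fitting the noise.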

Key idea

Bagging trains models in parallel and averages them to cut variance, while boosting trains them sequentially to fix errors and cut bias, trading robustness for accuracy.

Check yourself

1. How are models trained in bagging versus boosting?

2. Which error component does boosting mainly reduce?

3. Which is generally more sensitive to noisy data?