Ensembles in general
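An ensemble combines many models into a single predictor that is usually stronger than any one of them. The two classic families are bagging and boosting, and they fix different problems: bagging mainly reduces variance, while boosting mainly reduces bias.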
Bagging
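Bagging (bootstrap aggregating) trains many models independently, so they can run in parallel, on random resamples of the data. It then averages their outputs, or takes a majority vote for classification.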
- Each model sees a bootstrap sample, drawn with replacement.
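- Averaging cancels out the part of their errors that is uncorrelated, lowering variance while leaving bias roughly unchanged.
- It works best with high-variance base learners like deep trees. Random forests, which also pick a random subset of features at each split, are the best-known example.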
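The bagging steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the toy task (y = x^2 plus Gaussian noise), the 1-nearest-neighbour base learner, and every function name here are assumptions chosen for the demo.

```python
import random
import statistics


def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly, with replacement."""
    return [rng.choice(data) for _ in data]


def fit_1nn(train):
    """High-variance base learner: return the y of the nearest training x."""
    def predict(x):
        return min(train, key=lambda pair: abs(pair[0] - x))[1]
    return predict


def fit_bagged(train, n_models, rng):
    """Fit each base learner on its own bootstrap sample, then average."""
    models = [fit_1nn(bootstrap_sample(train, rng)) for _ in range(n_models)]

    def predict(x):
        return statistics.fmean(model(x) for model in models)
    return predict


def true_fn(x):
    """Noise-free target used by the toy task (an assumption for this demo)."""
    return x * x


def make_data(n, rng):
    """Toy data: n points with x in [-1, 1] and y = x^2 + Gaussian noise."""
    return [(x, true_fn(x) + rng.gauss(0.0, 0.3))
            for x in (rng.uniform(-1.0, 1.0) for _ in range(n))]


def mse_vs_truth(predict, grid):
    """Mean squared distance between predictions and the noise-free target."""
    return statistics.fmean((predict(x) - true_fn(x)) ** 2 for x in grid)


rng = random.Random(0)
train = make_data(200, rng)
grid = [i / 100 - 1.0 for i in range(201)]

single = fit_1nn(train)
bagged = fit_bagged(train, 25, rng)
print("single 1-NN error:", mse_vs_truth(single, grid))
print("bagged 1-NN error:", mse_vs_truth(bagged, grid))
```

On this toy task the bagged predictor should land closer to the true curve, because a single 1-nearest-neighbour model copies the noise of one training point while the average smooths over several.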
Boosting
Boosting trains models in sequence, each one focusing on the mistakes of the models before it.
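- Later models either train on reweighted data that emphasizes the examples earlier models got wrong (as in AdaBoost) or fit the residual errors of the ensemble so far (as in gradient boosting).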
- This steadily reduces bias, producing a strong learner from weak ones.
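- It is more prone to overfitting than bagging and is sensitive to noisy labels and outliers, because later models keep chasing the points that are hardest to fit.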
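The residual-fitting variant of boosting can be sketched as follows. This is a minimal illustration of gradient boosting for squared error with decision stumps, not AdaBoost-style reweighting. The noise-free sine data, the learning rate of 0.5, and all function names are assumptions made for the demo.

```python
import math
import statistics


def fit_stump(xs, targets):
    """Weak learner: the single threshold split with the lowest squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [t for x, t in zip(xs, targets) if x < threshold]
        right = [t for x, t in zip(xs, targets) if x >= threshold]
        left_mean, right_mean = statistics.fmean(left), statistics.fmean(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean


def fit_boosted(xs, ys, n_rounds, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to what is left over."""
    base = statistics.fmean(ys)
    residuals = [y - base for y in ys]
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Apply only a fraction of each correction (shrinkage).
        residuals = [r - learning_rate * stump(x) for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def train_mse(predict, xs, ys):
    """Mean squared error on the training points."""
    return statistics.fmean((predict(x) - y) ** 2 for x, y in zip(xs, ys))


xs = [i / 20 for i in range(41)]
ys = [math.sin(3 * x) for x in xs]
for rounds in (1, 5, 20):
    model = fit_boosted(xs, ys, rounds)
    print(rounds, "rounds, training MSE:", train_mse(model, xs, ys))
```

Each stump alone is a poor fit to the curve, but the training error shrinks as rounds accumulate. With noisy labels the same loop would eventually start fitting the noise, which is why the number of rounds and the learning rate are usually tuned on held-out data.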
Choosing between them
- Reach for bagging when a single model overfits and you want stability.
- Reach for boosting when models underfit and you want more accuracy.
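One rough way to apply this rule is to compare training and validation error. The helper below is only a sketch: the function name, the labels it returns, and both thresholds are hypothetical and would need tuning for a real task.

```python
def suggest_ensemble(train_error, val_error, target_error, gap_tolerance=0.05):
    """Rule-of-thumb diagnosis; thresholds are illustrative assumptions."""
    if train_error > target_error:
        # Even the training data is fit poorly: the model underfits (high bias).
        return "boosting"
    if val_error - train_error > gap_tolerance:
        # Good on training data, much worse on held-out data: overfitting (high variance).
        return "bagging"
    return "no clear signal"


print(suggest_ensemble(0.30, 0.32, target_error=0.10))
print(suggest_ensemble(0.02, 0.20, target_error=0.10))
```

The first call describes a model that misses even on its training data, and the second a model with a large gap between training and validation error.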
Key idea
Bagging trains models in parallel on resampled data to cut variance, while boosting trains them sequentially on earlier errors to cut bias.