A refined gradient booster
XGBoost is gradient boosting engineered for speed and accuracy. Its defining choices are a second-order approximation of the loss and a regularized objective that builds tree complexity directly into the split decision.
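Second-order optimization
Standard gradient boosting fits each tree using only the gradient of the loss. XGBoost uses both the gradient and the Hessian (the second derivative of the loss with respect to the prediction) to approximate the loss with a second-order Taylor expansion. The approximation is quadratic in each leaf weight, which gives a closed-form optimal leaf weight and a split gain formula written in terms of the summed gradients and Hessians.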
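
The closed-form leaf weight can be sketched in a few lines. This is a minimal illustration, not XGBoost's implementation: the function names are hypothetical, logistic loss is assumed as the example loss, and `reg_lambda` is the L2 penalty on leaf weights (set to 0 here to isolate the second-order step).

```python
import math

def logistic_grad_hess(y_true, raw_score):
    """Gradient and Hessian of log loss with respect to the raw score."""
    p = 1.0 / (1.0 + math.exp(-raw_score))
    grad = p - y_true          # first derivative
    hess = p * (1.0 - p)       # second derivative
    return grad, hess

def optimal_leaf_weight(grads, hessians, reg_lambda=0.0):
    """Minimizer over w of sum(g*w + 0.5*h*w^2) + 0.5*reg_lambda*w^2."""
    G = sum(grads)
    H = sum(hessians)
    return -G / (H + reg_lambda)

# Three examples land in one leaf; every raw score starts at 0, so p = 0.5.
labels = [1, 1, 0]
stats = [logistic_grad_hess(y, 0.0) for y in labels]
grads = [g for g, _ in stats]
hessians = [h for _, h in stats]

# G = -0.5 and H = 0.75, so the weight is 0.5 / 0.75.
leaf_weight = optimal_leaf_weight(grads, hessians)
```

The leaf weight is positive because two of the three labels are 1, and the Hessian sum scales the step, which a gradient-only update would leave to the learning rate.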
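Built-in regularization
- An L2 penalty on leaf weights, called lambda, shrinks the weights toward zero, so each tree makes more conservative predictions.
- A fixed penalty per leaf, called gamma, acts as a minimum gain: a split is kept only if its loss reduction exceeds gamma.
- Both terms appear in the objective itself, so the pruning criterion follows from the objective being optimized rather than from a separate heuristic.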
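
How lambda and gamma enter the split decision can be sketched as follows. This is a minimal sketch of the standard regularized gain formula, with hypothetical function names and made-up gradient and Hessian sums; `reg_lambda` and `gamma` stand for the two penalties above.

```python
def leaf_score(G, H, reg_lambda):
    """Loss reduction achievable by one leaf with gradient sum G, Hessian sum H."""
    return G * G / (H + reg_lambda)

def split_gain(G_left, H_left, G_right, H_right, reg_lambda=1.0, gamma=0.0):
    """Gain of splitting a node into two leaves, net of the per-leaf penalty."""
    parent = leaf_score(G_left + G_right, H_left + H_right, reg_lambda)
    children = (leaf_score(G_left, H_left, reg_lambda)
                + leaf_score(G_right, H_right, reg_lambda))
    return 0.5 * (children - parent) - gamma

# Same candidate split, evaluated with and without the per-leaf penalty.
gain_free = split_gain(-2.0, 2.0, 2.0, 2.0, reg_lambda=1.0, gamma=0.0)
gain_penalized = split_gain(-2.0, 2.0, 2.0, 2.0, reg_lambda=1.0, gamma=2.0)

keep_free = gain_free > 0            # split is worth making
keep_penalized = gain_penalized > 0  # gamma outweighs the loss reduction
```

With gamma at 0 the split is kept; raising gamma to 2 makes the net gain negative, so the same split would be rejected.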
Engineering tricks
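- A sparsity-aware split finder learns, for each split, a default direction for missing values.
- The histogram and approximate split algorithms bucket feature values into bins, so split search scans bin boundaries instead of every distinct value and parallelizes well.
- Subsampling columns and rows adds randomness, which reduces overfitting, and speeds up training.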
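
The first two tricks can be combined in a small sketch for a single feature. It is an illustration under simplifying assumptions, not XGBoost's code: the function name is hypothetical, missing values are represented as `None`, and bins are equal-width, whereas XGBoost derives bin boundaries from quantiles.

```python
def best_split_with_missing(values, grads, hessians, n_bins=4, reg_lambda=1.0):
    """Return (gain, threshold, default_left) for the best binned split."""
    present = [(v, g, h) for v, g, h in zip(values, grads, hessians)
               if v is not None]
    G_miss = sum(g for v, g in zip(values, grads) if v is None)
    H_miss = sum(h for v, h in zip(values, hessians) if v is None)

    # Build a gradient/Hessian histogram over equal-width bins (assumption).
    lo = min(v for v, _, _ in present)
    hi = max(v for v, _, _ in present)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0
    G_bin = [0.0] * n_bins
    H_bin = [0.0] * n_bins
    for v, g, h in present:
        b = min(int((v - lo) / width), n_bins - 1)
        G_bin[b] += g
        H_bin[b] += h

    G_total = sum(G_bin) + G_miss
    H_total = sum(H_bin) + H_miss

    def score(G, H):
        return G * G / (H + reg_lambda)

    best = (0.0, None, True)
    G_left = H_left = 0.0
    # Scan bin boundaries only, not every distinct feature value.
    for b in range(n_bins - 1):
        G_left += G_bin[b]
        H_left += H_bin[b]
        threshold = lo + (b + 1) * width
        # Try sending all missing rows left, then right; keep the better one.
        for default_left in (True, False):
            GL = G_left + (G_miss if default_left else 0.0)
            HL = H_left + (H_miss if default_left else 0.0)
            GR, HR = G_total - GL, H_total - HL
            gain = 0.5 * (score(GL, HL) + score(GR, HR)
                          - score(G_total, H_total))
            if gain > best[0]:
                best = (gain, threshold, default_left)
    return best

# Missing rows have positive gradients, like the rows with large values.
values = [1.0, 2.0, 3.0, 4.0, None, None]
grads = [-1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
hessians = [1.0] * 6
gain, threshold, default_left = best_split_with_missing(values, grads, hessians)
```

In this toy data the missing rows resemble the high-value rows, so the learned default direction is right (`default_left` is `False`).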
Key idea
XGBoost upgrades gradient boosting in three ways: a second-order Taylor approximation that uses both the gradient and the Hessian, a regularized objective with lambda and gamma built into the split gain, and sparsity-aware and histogram-based split finding for speed.