Which inputs matter
Feature importance estimates how much each input contributes to predictions. It guides pruning, debugging, and trust, but every method has caveats.
- Permutation importance shuffles one feature and measures the score drop.
- Tree gain sums how much splits on a feature reduce impurity.
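- SHAP attributes each prediction to features using Shapley values, so the per-feature contributions add up to the difference between that prediction and a baseline output.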
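The permutation approach from the list above can be sketched in a few lines. This is a minimal illustration using only the standard library, not a reference implementation: the toy data, the `toy_predict` stand-in model, and the choice of R^2 as the score are all assumptions made for the example.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in R^2 when each column of X is shuffled on its own."""
    rng = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = r2_score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: y depends strongly on x0, weakly on x1, not on x2.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]
y = [3.0 * row[0] + 1.0 * row[1] + data_rng.gauss(0, 0.1) for row in X]

def toy_predict(row):
    # Stand-in for a fitted model; the coefficients are assumed, not learned.
    return 3.0 * row[0] + 1.0 * row[1]

importances = permutation_importance(toy_predict, X, y)
for j, value in enumerate(importances):
    print(f"feature {j}: mean R^2 drop = {value:.3f}")
```

Averaging over several shuffles reduces the noise of any single permutation. A feature the model never reads, like x2 here, gets a drop of zero because shuffling it leaves every prediction unchanged.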
Read importance carefully
High importance does not prove causation, and correlated features split credit unpredictably. Two redundant inputs may each look weak in isolation even though together they dominate.
- Correlated features dilute each other's apparent importance.
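- Importance computed on training data can reflect overfitting rather than signal; held-out data gives a more honest estimate.
- A leaking feature, one that encodes the target or information unavailable at prediction time, often tops the ranking for the wrong reason.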
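The dilution effect for correlated features can be shown with a small experiment. The sketch below assumes an extreme case: two identical columns and a hypothetical model, `redundant_predict`, that splits its weight evenly between them. Shuffling either column alone costs much less than shuffling both with the same row permutation.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def score_drop(predict, X, y, columns, rng, n_repeats=20):
    """Mean R^2 drop when the given columns are shuffled together."""
    baseline = r2_score(y, [predict(row) for row in X])
    drops = []
    for _ in range(n_repeats):
        order = list(range(len(X)))
        rng.shuffle(order)
        X_perm = []
        for i, row in enumerate(X):
            new_row = list(row)
            for j in columns:
                # Every listed column uses the same row permutation.
                new_row[j] = X[order[i]][j]
            X_perm.append(new_row)
        permuted = r2_score(y, [predict(row) for row in X_perm])
        drops.append(baseline - permuted)
    return sum(drops) / n_repeats

# Hypothetical data: column 1 is an exact copy of column 0.
rng = random.Random(0)
signal = [rng.gauss(0, 1) for _ in range(500)]
X = [[s, s] for s in signal]
y = [3.0 * s for s in signal]

def redundant_predict(row):
    # Assumed model that spreads its weight evenly over the two copies.
    return 1.5 * row[0] + 1.5 * row[1]

drop_first = score_drop(redundant_predict, X, y, [0], rng)
drop_second = score_drop(redundant_predict, X, y, [1], rng)
drop_both = score_drop(redundant_predict, X, y, [0, 1], rng)

print(f"shuffle column 0 only: {drop_first:.3f}")
print(f"shuffle column 1 only: {drop_second:.3f}")
print(f"shuffle both together: {drop_both:.3f}")
```

When one column is shuffled, the intact copy still carries the signal, so the score falls only partway. The two single-column drops therefore add up to less than the joint drop, which is why grouping correlated features before permuting is one common way to read their combined importance.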
A quick view
Use importance to generate hypotheses, then verify them.
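One possible way to verify a hypothesis is a drop-column ablation: rebuild the model without the feature and compare scores on held-out data. The sketch below is only an illustration under stated assumptions. It uses a tiny k-nearest-neighbors regressor written for the example (`knn_predict`), hypothetical toy data, and a single train/test split.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def knn_predict(train_X, train_y, row, columns, k=5):
    """Average the targets of the k nearest training rows, using only `columns`."""
    distances = sorted(
        (sum((train_row[j] - row[j]) ** 2 for j in columns), target)
        for train_row, target in zip(train_X, train_y)
    )
    return sum(target for _, target in distances[:k]) / k

def held_out_score(train_X, train_y, test_X, test_y, columns):
    """R^2 on the test rows for a k-NN model restricted to `columns`."""
    preds = [knn_predict(train_X, train_y, row, columns) for row in test_X]
    return r2_score(test_y, preds)

# Hypothetical data: column 0 drives the target, column 1 is pure noise.
rng = random.Random(7)
rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(300)]
targets = [2.0 * row[0] + rng.gauss(0, 0.1) for row in rows]
train_X, test_X = rows[:200], rows[200:]
train_y, test_y = targets[:200], targets[200:]

all_columns = [0, 1]
full = held_out_score(train_X, train_y, test_X, test_y, all_columns)

# Ablation: held-out score lost when each column is left out of the model.
ablation = {}
for j in all_columns:
    kept = [c for c in all_columns if c != j]
    ablation[j] = full - held_out_score(train_X, train_y, test_X, test_y, kept)

for j, loss in ablation.items():
    print(f"without column {j}: held-out R^2 changes by {-loss:+.3f}")
```

Because the comparison uses rows the model never saw, a feature that only helped the model memorize training data will show little or no loss here. A negative loss means the model scored better without the feature.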
Key idea
Feature importance ranks how much inputs drive predictions, but correlation, overfitting, and leakage can distort it, so treat the ranking as a hypothesis to verify rather than ground truth.