The idea
SHAP stands for SHapley Additive exPlanations. It explains a single prediction by assigning each feature a contribution value. These values are Shapley values from cooperative game theory, which divide a payout among players in proportion to what each player contributed.
The game theory analogy
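Treat the prediction as a payout and the features as players cooperating to produce it. The Shapley value of a feature is its average marginal contribution: the change in the prediction when that feature's value is revealed, averaged over every possible ordering in which the features can be revealed. Averaging over all orderings means no feature is favored by the position in which it happens to arrive, and this is what gives the method its fairness properties.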
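The averaging over orderings can be sketched with a brute-force computation. This is a minimal illustration, not how any SHAP library works internally: `toy_model`, the instance, and the all-zero baseline are hypothetical, and a "missing" feature is simulated by substituting its baseline value, which is one common convention among several.

```python
from itertools import permutations


def toy_model(x):
    # Hypothetical model with an interaction between features 0 and 2.
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]


def exact_shapley(model, instance, baseline):
    """Average each feature's marginal contribution over all orderings."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        # Start from the baseline and reveal features one at a time.
        current = list(baseline)
        previous_output = model(current)
        for feature in order:
            current[feature] = instance[feature]
            output = model(current)
            totals[feature] += output - previous_output
            previous_output = output
    return [total / len(orderings) for total in totals]


instance = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, instance, baseline)
print("contributions:", phi)
print("sum of contributions:", sum(phi))
print("prediction minus baseline:", toy_model(instance) - toy_model(baseline))
```

In this toy case the interaction term is split evenly between features 0 and 2, because each arrives second in half of the orderings. The loop visits all n! orderings, so it is only practical for a handful of features.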
Useful properties
- Local accuracy means the feature contributions sum to the actual prediction minus a baseline, typically the model's average output
- Consistency means that if a model changes so that a feature's marginal contribution grows or stays the same, its SHAP value does not shrink
- It gives both local explanations for one prediction and global importance by aggregating many local explanations
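The local accuracy and aggregation properties above can be checked on a case with a closed form. For a linear model explained against a single baseline point, each feature's Shapley value reduces to its weight times its deviation from the baseline. The weights, bias, and data rows below are made up for illustration, and using the mean absolute contribution as global importance is one common choice, assumed here rather than prescribed by the text.

```python
# Hypothetical linear model: prediction = bias + sum(weight * value).
weights = [3.0, -1.0, 0.5]
bias = 10.0


def linear_model(x):
    return bias + sum(w * v for w, v in zip(weights, x))


# Made-up data rows; the baseline is the per-feature mean.
rows = [
    [1.0, 4.0, 2.0],
    [3.0, 0.0, 6.0],
    [2.0, 2.0, 4.0],
    [2.0, 2.0, 0.0],
]
n_features = len(weights)
baseline = [sum(row[j] for row in rows) / len(rows) for j in range(n_features)]
base_value = linear_model(baseline)


def linear_shap(x):
    # Closed form for a linear model with a single baseline point.
    return [w * (v - b) for w, v, b in zip(weights, x, baseline)]


# Local explanations: one list of contributions per row.
local_explanations = [linear_shap(row) for row in rows]

# Local accuracy: baseline output plus contributions recovers each prediction.
for row, phi in zip(rows, local_explanations):
    assert abs(base_value + sum(phi) - linear_model(row)) < 1e-9

# Global importance: mean absolute contribution per feature.
global_importance = [
    sum(abs(phi[j]) for phi in local_explanations) / len(local_explanations)
    for j in range(n_features)
]
print("global importance:", global_importance)
```

Taking absolute values before averaging matters: positive and negative contributions would otherwise cancel, and a feature that moves predictions strongly in both directions would look unimportant.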
The cost and approximations
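Exact Shapley values require considering all feature subsets, and the number of subsets grows exponentially with the number of features. So practical SHAP relies on faster algorithms. The two best known are TreeSHAP, which exploits the structure of tree models to compute exact values in polynomial time, and KernelSHAP, which approximates the values for any model by sampling feature subsets.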
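To give a feel for why approximation helps, here is a minimal sketch of a sampling estimator. It is neither TreeSHAP nor KernelSHAP: it is plain Monte Carlo sampling of random feature orderings, a simpler technique swapped in for illustration. The `toy_model`, the instance, the baseline, and the sample count are all hypothetical.

```python
import random


def toy_model(x):
    # Hypothetical model with an interaction between features 0 and 2.
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]


def sampled_shapley(model, instance, baseline, n_samples=2000, seed=0):
    """Estimate Shapley values from randomly sampled feature orderings."""
    rng = random.Random(seed)
    n = len(instance)
    totals = [0.0] * n
    order = list(range(n))
    for _ in range(n_samples):
        rng.shuffle(order)
        # Start from the baseline and reveal features in the sampled order.
        current = list(baseline)
        previous_output = model(current)
        for feature in order:
            current[feature] = instance[feature]
            output = model(current)
            totals[feature] += output - previous_output
            previous_output = output
    return [total / n_samples for total in totals]


instance = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
estimate = sampled_shapley(toy_model, instance, baseline)
print("estimated contributions:", estimate)
print("sum of estimates:", sum(estimate))
```

The cost is now n_samples times n model calls rather than a count that grows exponentially, at the price of sampling noise in the individual values. The estimates still sum to the prediction minus the baseline, because the marginal contributions within each sampled ordering telescope.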
Key idea
SHAP uses Shapley values to fairly split a prediction into per-feature contributions that, added to a baseline, sum back to the output.