Explainability with SHAP

The idea

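SHAP stands for SHapley Additive exPlanations. It explains a single prediction by assigning each feature a contribution value that measures how much that feature moved the prediction away from a baseline, typically the model's average output. These values are Shapley values from cooperative game theory, which divide a payout among players in a way that satisfies a set of fairness axioms.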

The game theory analogy

Treat the prediction as a payout and the features as players cooperating to produce it. The Shapley value of a feature is its average marginal contribution, taken over every possible ordering in which the features can be added: for each ordering, record how much the payout changes when the feature joins the ones added before it, then average those changes. This averaging is what gives the method its fairness properties.
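The averaging over orderings can be sketched directly in code. The example below is a minimal illustration, not how the SHAP library computes anything: the helper `shapley_values` and the three-player game `toy_value` are hypothetical names and numbers made up for this sketch.

```python
from itertools import permutations


def shapley_values(value, n_players):
    """Exact Shapley values by averaging marginal contributions over all orderings.

    `value` maps a frozenset of player indices to that coalition's payout.
    """
    totals = [0.0] * n_players
    orderings = list(permutations(range(n_players)))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            # Marginal contribution of `player` given who joined before it.
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return [total / len(orderings) for total in totals]


def toy_value(coalition):
    """Hypothetical game: players 0 and 1 earn alone and share a bonus together."""
    payout = 0.0
    if 0 in coalition:
        payout += 10.0
    if 1 in coalition:
        payout += 20.0
    if 0 in coalition and 1 in coalition:
        payout += 6.0  # synergy that only appears when both are present
    return payout  # player 2 never changes the payout


phi = shapley_values(toy_value, 3)
print(phi)  # [13.0, 23.0, 0.0]
```

The synergy bonus of 6 is split evenly between players 0 and 1, the player that never changes the payout receives 0, and the three values sum to the full payout of 36.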

Useful properties

Local accuracy means the feature contributions sum to the actual prediction minus the baseline.
Consistency means that if a model changes so that a feature's marginal contribution grows or stays the same, its SHAP value does not shrink.
SHAP gives both local explanations for one prediction and global importance by aggregating over many predictions.
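Local accuracy and global aggregation can be checked by hand on a linear model. When the features are treated as independent, the SHAP value of feature i has a known closed form: the weight times the feature's deviation from its mean. The sketch below assumes that setting; the weights, bias, and data are made up for illustration, and mean absolute SHAP value is used as one common choice of global importance.

```python
# Hypothetical linear model f(x) = bias + sum(w_i * x_i); all numbers are made up.
weights = [2.0, -1.0, 0.5]
bias = 1.0


def predict(x):
    return bias + sum(w * xi for w, xi in zip(weights, x))


# Small made-up background dataset that defines the baseline.
data = [
    [1.0, 2.0, 4.0],
    [3.0, 0.0, 2.0],
    [2.0, 4.0, 0.0],
]
n = len(data)
means = [sum(row[j] for row in data) / n for j in range(len(weights))]
baseline = sum(predict(row) for row in data) / n  # average model output


def linear_shap(x):
    # Closed form for a linear model with features treated as independent:
    # phi_i = w_i * (x_i - mean_i).
    return [w * (xi - m) for w, xi, m in zip(weights, x, means)]


# Local accuracy: contributions sum to prediction minus baseline for every row.
all_phi = [linear_shap(row) for row in data]
for row, phi in zip(data, all_phi):
    assert abs(sum(phi) - (predict(row) - baseline)) < 1e-9

# Global importance: mean absolute SHAP value per feature across the dataset.
global_importance = [
    sum(abs(phi[j]) for phi in all_phi) / n for j in range(len(weights))
]
print(global_importance)
```

Each row yields a local explanation, and averaging the absolute values across rows turns those local explanations into a single ranking of features.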

The cost and approximations

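Exact Shapley values require evaluating every subset of features, and the number of subsets grows exponentially with the feature count. Practical SHAP therefore relies on faster algorithms: TreeSHAP, which exploits the structure of tree models to compute exact values in polynomial time, and KernelSHAP, which approximates the values for arbitrary models by sampling feature subsets.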
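To see why approximation helps, the sketch below estimates Shapley values by sampling random orderings instead of enumerating all of them. This is plain Monte Carlo sampling, shown only to illustrate the trade-off; it is neither KernelSHAP nor TreeSHAP. The function `sampled_shapley`, the 12-player game `toy_value`, and the sample count are assumptions made for this example.

```python
import random


def sampled_shapley(value, n_players, n_samples=2000, seed=0):
    """Estimate Shapley values by averaging over randomly sampled orderings."""
    rng = random.Random(seed)
    totals = [0.0] * n_players
    players = list(range(n_players))
    for _ in range(n_samples):
        rng.shuffle(players)
        coalition = frozenset()
        previous = value(coalition)
        for player in players:
            coalition = coalition | {player}
            current = value(coalition)
            totals[player] += current - previous  # marginal contribution
            previous = current
    return [total / n_samples for total in totals]


def toy_value(coalition):
    """Hypothetical game: player i adds i + 1, and players 0 and 1 share a bonus of 5."""
    bonus = 5.0 if {0, 1} <= coalition else 0.0
    return sum(i + 1 for i in coalition) + bonus


# With 12 players there are 2**12 = 4096 subsets and 12! (about 479 million)
# orderings; here only 2000 sampled orderings are evaluated.
estimates = sampled_shapley(toy_value, 12)
print(estimates)
```

For this game the exact values are 3.5 and 4.5 for players 0 and 1 and i + 1 for every other player i, so the estimates for the first two players should land close to those numbers. Because the marginal contributions within each ordering telescope, the estimates always sum to the full payout, which preserves local accuracy even under sampling.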

Key idea

SHAP uses Shapley values to fairly split a prediction into per-feature contributions that, added to the baseline, sum back to the model's output.
