Spring 2024
Partial Dependence Plot (PDP)
Local Interpretable Model-agnostic Explanations (LIME)
SHAP (SHapley Additive exPlanations)
Examples and materials from Molnar’s book: https://christophm.github.io/interpretable-ml-book/
This dataset contains daily counts of rented bicycles from the bicycle rental company Capital-Bikeshare in Washington D.C., along with weather and seasonal information. The goal is to predict how many bikes will be rented depending on the weather and the day. The data can be downloaded from the UCI Machine Learning Repository.
Here is the list of features used in Molnar’s book:
Shows the marginal effect one or two features have on the predicted outcome of a machine learning model (J. H. Friedman 2001).
High-level idea: marginalize the machine learning model's output over the distribution of all the other features to show the relationship between the feature we are interested in and the predicted outcome.
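For a feature of interest \(x_S\), the partial dependence function is estimated by averaging the prediction over the observed values of the remaining features \(x_C\): \(\hat{f}_S(x_S) = \frac{1}{n}\sum_{i=1}^{n} \hat{f}(x_S, x_C^{(i)})\). Below is a minimal sketch of this with scikit-learn's partial dependence utilities; the file name and feature columns are placeholders, not the exact setup from Molnar's book.

```python
# Minimal PDP sketch, assuming a bike-rental-style regression dataset with a
# "temp" (temperature) column. File name and feature subset are illustrative.
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import PartialDependenceDisplay

bikes = pd.read_csv("bike_sharing_daily.csv")      # hypothetical file name
X = bikes[["temp", "hum", "windspeed"]]            # assumed feature subset
y = bikes["cnt"]                                   # daily rental count

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# For each grid value of "temp", the prediction is averaged over the observed
# values of all other features -- exactly the marginalization described above.
PartialDependenceDisplay.from_estimator(model, X, features=["temp"])
plt.show()
```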
Pros
Cons
Training local surrogate models to explain individual predictions
https://arxiv.org/pdf/1602.04938.pdf
The idea is quite intuitive.
First, forget about the training data and imagine you only have the black box model where you can input data points and get the predictions of the model. You can probe the box as often as you want. Your goal is to understand why the machine learning model made a certain prediction. LIME tests what happens to the predictions when you feed variations of your data into the machine learning model.
LIME generates a new dataset consisting of perturbed samples and the corresponding predictions of the black box model.
On this new dataset LIME then trains an interpretable model, which is weighted by the proximity of the sampled instances to the instance of interest. The interpretable model can be anything from the interpretable models chapter, for example Lasso or a decision tree. The learned model should be a good approximation of the machine learning model predictions locally, but it does not have to be a good global approximation. This kind of accuracy is also called local fidelity.
https://christophm.github.io/interpretable-ml-book/
https://arxiv.org/pdf/1602.04938.pdf
Random forest predictions given features x1 and x2.
Predicted classes: 1 (dark) or 0 (light).
Instance of interest (big yellow dot) and data sampled from a normal distribution (small dots).
Assign higher weight to points near the instance of interest, i.e., \(\mathrm{weight}(p) = \sqrt{e^{-d^{2}/w^{2}}}\), where \(d\) is the distance between \(p\) and the instance of interest, and \(w\) is the kernel width (chosen by the user).
Use both the samples and sample weights to train a linear classifier.
Signs of the grid show the classifications of the locally learned model from the weighted samples. The red line marks the decision boundary (P(class=1) = 0.5).
The official implementation uses Ridge regression as the linear surrogate model for the explanation (for classification it is fit to the predicted class probabilities).
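Putting the steps above together, here is a minimal sketch of the procedure for the toy example, assuming any fitted classifier with a predict_proba method stands in for the black box; the sampling scale and kernel width are illustrative choices, not the defaults of the official implementation.

```python
# Minimal sketch of LIME's sampling-and-weighting procedure for the 2D toy
# example (features x1, x2). "black_box" is any fitted classifier with
# predict_proba; "instance" is the instance of interest as a 1D NumPy array.
import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate(black_box, instance, n_samples=1000, kernel_width=0.75, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Perturb: sample new points from a normal distribution around the instance
    samples = rng.normal(loc=instance, scale=1.0,
                         size=(n_samples, instance.shape[0]))
    # 2. Query the black box for its predicted probability of class 1
    targets = black_box.predict_proba(samples)[:, 1]
    # 3. Weight each sample by proximity: weight = sqrt(exp(-d^2 / w^2))
    d = np.linalg.norm(samples - instance, axis=1)
    weights = np.sqrt(np.exp(-(d ** 2) / kernel_width ** 2))
    # 4. Fit a weighted linear model (Ridge, as in the official implementation)
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(samples, targets, sample_weight=weights)
    return surrogate   # surrogate.coef_ gives the local explanation
```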
Let us look at a concrete example. We go back to the bike rental data and turn the prediction problem into a classification: After taking into account the trend that the bicycle rental has become more popular over time, we want to know on a certain day whether the number of bicycles rented will be above or below the trend line. You can also interpret “above” as being above the average number of bicycles, but adjusted for the trend.
First we train a random forest with 100 trees on the classification task. On what day will the number of rental bikes be above the trend-free average, based on weather and calendar information?
The explanations are created with 2 features. The results of the sparse local linear models trained for two instances with different predicted classes:
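A sketch of this example with the official lime package; the data loading, feature columns, and the construction of the above/below-trend label are stand-ins for the preprocessing in Molnar's book.

```python
# Bike classification example with the lime package (illustrative setup).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

bikes = pd.read_csv("bike_sharing_daily.csv")            # hypothetical file name
feature_names = ["temp", "hum", "windspeed", "season"]   # assumed feature subset
X = bikes[feature_names].to_numpy()
# Placeholder target: "above average" instead of the trend-adjusted label
y = (bikes["cnt"] > bikes["cnt"].mean()).astype(int)

# Random forest with 100 trees on the classification task
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                 class_names=["below", "above"],
                                 mode="classification")
# Sparse local linear model with 2 features for one instance of interest
exp = explainer.explain_instance(X[42], rf.predict_proba, num_features=2)
print(exp.as_list())
```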
Pros
Cons
Examples and materials from Molnar’s new book: https://christophmolnar.com/books/shap/
SHAP (Lundberg and Lee 2017a) is a game-theory-inspired method for explaining predictions made by machine learning models. SHAP generates one value per input feature (the SHAP values), each indicating how much that feature contributes to the prediction for the specified data point.
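As a quick illustration of that output, here is a minimal sketch using the shap package on a small synthetic regression problem; the model and data are placeholders, not the bike example.

```python
# Minimal shap usage sketch on synthetic data.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.Explainer(model, X)   # picks a suitable algorithm for the model
shap_values = explainer(X[:5])         # one SHAP value per feature per instance
print(shap_values.values.shape)        # (5, 3): 5 instances, 3 features
```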
Who’s going to pay for that taxi?
Alice, Bob, and Charlie have dinner together and share a taxi ride home. The total cost is $51. The question is, how should they divide the costs fairly?
The marginal contribution of a player to a coalition is the value of the coalition with the player minus the value of the coalition without the player. In the taxi example, the value of a coalition is equal to the cost of the ride as detailed in the above table. Therefore, the marginal contribution of, for instance, Charlie to a taxi already containing Bob is the cost of the taxi with Bob and Charlie, minus the cost of the taxi with Bob alone.
How to average these marginal contributions per passenger?
One way to answer this question is by considering all possible permutations of Alice, Bob, and Charlie. There are 3! = 3 * 2 * 1 = 6 possible permutations of passengers:
We can use these permutations to form coalitions, for example, for Alice.
In two of these permutations, Alice was added to an empty taxi, in one case she was added to a taxi with only Bob, in one case to a taxi with only Charlie, and in two cases to a taxi already containing Bob and Charlie. By weighting the marginal contributions accordingly, we calculate the following weighted average marginal contribution for Alice, abbreviating Alice, Bob, and Charlie to A, B, and C:
for Bob:
for Charlie:
The Shapley value is the weighted average of a player's marginal contributions to all possible coalitions.
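This averaging can be written down directly. Below is a minimal sketch that enumerates all permutations and averages the marginal contributions; the coalition costs are illustrative placeholders (only the $51 total is taken from the text), so substitute the values from the table above.

```python
# Exact Shapley values by enumerating all permutations of the players.
from itertools import permutations

def shapley_values(players, value):
    """value: maps a frozenset of players to that coalition's cost/payout."""
    contrib = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            after = value(frozenset(coalition))
            contrib[p] += after - before        # marginal contribution of p
    return {p: c / len(perms) for p, c in contrib.items()}

# Placeholder coalition costs -- replace with the numbers from the table above.
costs = {frozenset(): 0, frozenset({"A"}): 15, frozenset({"B"}): 25,
         frozenset({"C"}): 38, frozenset({"A", "B"}): 25,
         frozenset({"A", "C"}): 41, frozenset({"B", "C"}): 51,
         frozenset({"A", "B", "C"}): 51}
print(shapley_values(["A", "B", "C"], costs.get))
# By the Efficiency property, the three values always sum to the total cost of $51.
```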
Efficiency: The sum of the contributions must precisely add up to the payout.
Symmetry: If two players are identical, they should receive equal contributions.
Dummy or Null Player: The value of a player who doesn’t contribute to any coalition is zero.
Additivity: If a game is the sum of two games (value functions), its Shapley values are the sum of the Shapley values of the two games.
These four axioms ensure the uniqueness of the Shapley values.
Consider the following scenario: You have trained a machine learning model \(f\) to predict apartment prices.
We want to evaluate the contribution of the cat-banned feature value.
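For a machine learning model, enumerating all coalitions is usually infeasible, so the Shapley value of a single feature value such as cat-banned is estimated by sampling, as described in Molnar's book (following Štrumbelj and Kononenko). A minimal sketch, with the model, background data, and feature index as placeholders:

```python
# Monte Carlo estimate of one feature's Shapley value for a single prediction.
import numpy as np

def estimate_shapley(predict, X, x, j, n_iter=1000, seed=0):
    """predict: prediction function of the trained model, e.g. model.predict
    X:       background data as a NumPy array, shape (n_samples, n_features)
    x:       instance of interest, shape (n_features,)
    j:       index of the feature to evaluate (e.g. the cat-banned column)
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    total = 0.0
    for _ in range(n_iter):
        z = X[rng.integers(n)]           # random instance supplying "absent" features
        order = rng.permutation(p)       # random feature order
        pos = int(np.where(order == j)[0][0])
        keep_plus = np.isin(np.arange(p), order[:pos + 1])   # x's values incl. feature j
        keep_minus = np.isin(np.arange(p), order[:pos])      # x's values excl. feature j
        x_plus = np.where(keep_plus, x, z)
        x_minus = np.where(keep_minus, x, z)
        total += predict(x_plus.reshape(1, -1))[0] - predict(x_minus.reshape(1, -1))[0]
    return total / n_iter
```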
The Shapley value can be misinterpreted. The Shapley value of a feature value is not the difference of the predicted value after removing the feature from the model training. The interpretation of the Shapley value is: Given the current set of feature values, the contribution of a feature value to the difference between the actual prediction and the mean prediction is the estimated Shapley value.
The Shapley value is the wrong explanation method if you seek sparse explanations (explanations that contain few features). Explanations created with the Shapley value method always use all the features. Humans prefer selective explanations, such as those produced by LIME. LIME might be the better choice for explanations lay-persons have to deal with.
(From Molnar’s book)
Pros
Fair distribution: the difference between the prediction and the average prediction is fairly distributed among the feature values (Efficiency)
Contrastive explanations (can compare an instance to a subset or even to a single data point)
Solid theory
Cons
http://proceedings.mlr.press/v119/kumar20e/kumar20e.pdf