Welcome back to the Visualization for Machine Learning Lab!

Week 6: Interpreting Black Box Models with LIME and SHAP (and PCA if time)


  • Homework 2 due TONIGHT at 11:59pm
  • Homework 1 grades released - ask Rithwick (rg4361@nyu.edu) any questions about grades

Interpreting Black Box Models

  • Not as intuitive as white box models; it can be hard to describe a model's decision boundary in a human-understandable manner
  • However, there are ways to analyze what factors affect model outputs!
    • LIME
    • SHAP

Local Interpretable Model-Agnostic Explanations (LIME)

  • What is LIME?
    • LIME is a Python library that explains the prediction of any classifier by learning an interpretable model locally around the prediction


  • Why is LIME a good model explainer?
    • Interpretable by non-experts
    • Local fidelity (replicates the model’s behavior in the vicinity of the instance being predicted)
    • Model agnostic (does not make any assumptions about the model)
    • Global perspective (when used on a representative set, LIME can provide a global intuition of the model)


  • How does LIME work?
    • For an in-depth explanation of the math, see Sharma (2020)
  • Fidelity-Interpretability Tradeoff
    • We want an explainer that is faithful (replicates our model's behavior locally) and interpretable. To achieve this, LIME minimizes

        explanation(x) = argmin over g in G of [ L(f, g, Pi_x) + Omega(g) ]


f: the original predictor
x: the original features
g: the explanation model, which could be a linear model, a decision tree, or falling rule lists
Pi_x: a proximity measure between an instance z and x, used to define locality around x. It weighs the perturbed instances z' by their distance from x.
First term: a measure of the unfaithfulness of g in approximating f in the locality defined by Pi_x. This is termed "locality-aware loss" in the original paper.
Last term: a measure of the complexity of the explanation g (e.g. if your explanation model is a decision tree, it can be the depth of the tree)
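The objective above can be sketched in a few lines of code: perturb the instance, weigh each perturbation by its proximity to x, and fit a simple weighted linear surrogate. This is a minimal illustration of the idea, not the lime library itself; the kernel width sigma and sample count are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

data = load_iris()
X, y = data.data, data.target
f = RandomForestClassifier(random_state=42).fit(X, y)  # black box predictor f

x = X[0]                                  # the instance to explain
rng = np.random.default_rng(42)
n_samples, sigma = 1000, 1.0              # illustrative hyperparameters

# 1. Perturb: sample z' in a neighborhood of x
Z = x + rng.normal(scale=X.std(axis=0), size=(n_samples, X.shape[1]))

# 2. Proximity measure Pi_x: exponential kernel over scaled distance to x
dist = np.linalg.norm((Z - x) / X.std(axis=0), axis=1)
weights = np.exp(-(dist ** 2) / sigma ** 2)

# 3. Fit the interpretable surrogate g: weighted ridge regression on f's
#    probabilities (the ridge penalty stands in for the complexity term Omega(g))
target = f.predict_proba(Z)[:, 0]         # probability of class 0
g = Ridge(alpha=1.0).fit(Z, target, sample_weight=weights)

# The surrogate's coefficients are the local feature attributions
print(dict(zip(data.feature_names, g.coef_.round(3))))
```

The coefficients of g tell us which features push the prediction up or down in the neighborhood of x, which is exactly what the LIME plots later in this lab visualize.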

LIME Example in Python

We first import the relevant libraries:

import pandas as pd, numpy as np
from sklearn import datasets
from sklearn.decomposition import PCA
from matplotlib import pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
import lime, lime.lime_tabular, shap

LIME Example in Python

Recall the Iris dataset, which contains flowers that can be sorted into 3 species classes based on 4 features:

data = datasets.load_iris()

X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2

LIME Example in Python

We ignore Class 0 for now (we will see why shortly), and split the data into a training set (80%) and test set (20%):

#Ignoring one of the three classes

X = X[y != 0]
y = y[y != 0]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

LIME Example in Python

We train a random forest classifier on our training set, and generate predictions for our test set:

classifier = RandomForestClassifier(random_state=42)

classifier.fit(X_train, y_train)

predicted = classifier.predict(X_test)

pre = precision_score(y_test, predicted)
rec = recall_score(y_test, predicted)
acc = accuracy_score(y_test, predicted)

print("Precision: ", pre)
print("Recall: ", rec)
print("Accuracy: ", acc)
Precision:  1.0
Recall:  0.9166666666666666
Accuracy:  0.95

LIME Example in Python

We use lime to create an explainer based on our training set, and generate an explanation for the sample at index 10 in our test set:

explainer1 = lime.lime_tabular.LimeTabularExplainer(X_train.values,
                                                    feature_names=X_train.columns.tolist(),
                                                    class_names=['class 1', 'class 2'])

lime_values = explainer1.explain_instance(X_test.values[10], classifier.predict_proba, num_features=4)

# Some lime_values properties: intercept, local_pred, score
Intercept 0.6704191905472919
Prediction_local [0.233346]
Right: 0.22

LIME Example in Python

  • We see that this explanation has three parts:
    • On the left, we see that the classifier estimated there was a 78% chance the sample was from Class 1, and a 22% chance the sample was from Class 2
    • In the center, we see that petal length and width increased the probability that the sample was from Class 1, while sepal length and width increased the probability that the sample was from Class 2. Petal length and width also had a greater influence on their respective class than sepal length and width
    • On the right, we see the actual LIME values for each feature. The color of each row corresponds to the class that feature is "voting for"