In this tutorial, we learn how to use the shapiq package to uncover and visualize feature interactions in machine learning models using Shapley Interaction Indices (SII), building on the foundation of standard Shapley values.
Shapley values are great for explaining individual feature contributions in AI models but fail to capture feature interactions. Shapley interactions go a step further by separating individual effects from interactions, offering deeper insights, such as how longitude and latitude jointly influence house prices. In this tutorial, we get started with the shapiq package to compute and explore these Shapley interactions for any model. Check out the full codes here.
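To make the distinction concrete, here is a minimal, self-contained sketch (pure Python, not using shapiq; the coalition values are made up for illustration). For a two-feature game, the pairwise Shapley Interaction Index reduces to a second difference that isolates the synergy the two features create together, beyond what either contributes alone:

```python
# Toy cooperative game: the model's output for each coalition of known features.
# Suppose the output is 10*lon + 5*lat plus a synergy of 20 when both are known.
v = {
    (): 0.0,               # no features known (baseline)
    ("lon",): 10.0,        # longitude alone
    ("lat",): 5.0,         # latitude alone
    ("lon", "lat"): 35.0,  # both together: 10 + 5 + 20 synergy
}

# Shapley value for two players: average marginal contribution over both orderings.
phi_lon = 0.5 * ((v[("lon",)] - v[()]) + (v[("lon", "lat")] - v[("lat",)]))
phi_lat = 0.5 * ((v[("lat",)] - v[()]) + (v[("lon", "lat")] - v[("lon",)]))

# Pairwise Shapley Interaction Index for two players: the joint effect minus
# the two individual effects -- exactly the 20-unit synergy.
sii_pair = v[("lon", "lat")] - v[("lon",)] - v[("lat",)] + v[()]

print(phi_lon, phi_lat, sii_pair)  # 20.0 15.0 20.0
```

Note that the two Shapley values still sum to the full prediction (20 + 15 = 35), but each silently absorbs half of the synergy; the interaction index makes that hidden 20 explicit.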
Installing the dependencies
!pip install shapiq overrides scikit-learn pandas numpy
Data Loading and Pre-processing
In this tutorial, we use the Bike Sharing dataset from OpenML. After loading the data, we split it into training and testing sets to prepare it for model training and evaluation.
import shapiq
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
import numpy as np
# Load data
X, y = shapiq.load_bike_sharing(to_numpy=True)
# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Model Training and Performance Evaluation
# Train the model
model = RandomForestRegressor()
model.fit(X_train, y_train)
# Predict
y_pred = model.predict(X_test)
# Evaluate
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)
print(f"R² Score: {r2:.4f}")
print(f"Mean Absolute Error: {mae:.4f}")
print(f"Root Mean Squared Error: {rmse:.4f}")
Setting Up an Explainer
We set up a TabularExplainer using the shapiq package to compute Shapley interaction values based on the k-SII (k-order Shapley Interaction Index) method. By specifying max_order=4, we allow the explainer to consider interactions of up to four features at once, enabling deeper insight into how groups of features jointly affect model predictions.
# set up an explainer with k-SII interaction values up to order 4
explainer = shapiq.TabularExplainer(
    model=model,
    data=X,
    index="k-SII",
    max_order=4,
)
Explaining a Local Instance
We select a specific test instance (index 100) to generate local explanations. The code prints the true and predicted values for this instance, followed by a breakdown of its feature values. This helps us understand the exact inputs passed to the model and sets the context for interpreting the Shapley interaction explanations that follow.
from tqdm import tqdm

# create explanations for different orders
# get the feature names from the pandas version of the dataset
X_df, _ = shapiq.load_bike_sharing()
feature_names = list(X_df.columns)
n_features = len(feature_names)

# select a local instance to be explained
instance_id = 100
x_explain = X_test[instance_id]
y_true = y_test[instance_id]
y_pred = model.predict(x_explain.reshape(1, -1))[0]
print(f"Instance {instance_id}, True Value: {y_true}, Predicted Value: {y_pred}")
for i, feature in enumerate(feature_names):
    print(f"{feature}: {x_explain[i]}")
Analyzing Interaction Values
We use the explainer.explain() method to compute Shapley interaction values for a specific data instance (X[100]) with a budget of 256 model evaluations. This returns an InteractionValues object, which captures how individual features and their combinations influence the model's output. Setting max_order=4 means we consider interactions involving up to four features.
interaction_values = explainer.explain(X[100], budget=256)
# analyze interaction values
print(interaction_values)
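The printed InteractionValues object lists one score per feature subset. To see which combinations matter most, the scores can be ranked by absolute magnitude. The dictionary below stands in for the real object's contents (the index tuples and numbers are made up for illustration):

```python
# Hypothetical (feature-index tuple) -> interaction score pairs, standing in
# for the values stored in a shapiq InteractionValues object.
scores = {
    (0,): 12.3,        # first-order: feature 0 alone
    (3,): -8.1,        # first-order: feature 3 alone
    (0, 3): 5.6,       # second-order: features 0 and 3 jointly
    (1, 2): -4.2,      # second-order: features 1 and 2 jointly
    (0, 1, 3): 1.9,    # third-order interaction
}

# Rank all subsets by absolute contribution, strongest first.
ranked = sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
for subset, score in ranked:
    print(subset, score)
```

The same ranking idea applies to the real output: large positive or negative higher-order terms flag feature groups whose joint effect the individual Shapley values would hide.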
First-Order Interaction Values
To keep things simple, we compute first-order interaction values, i.e., standard Shapley values that capture only individual feature contributions (no interactions).
By setting max_order=1 in the TreeExplainer, we are saying:
"Tell me how much each feature individually contributes to the prediction, without considering any interaction effects."
These values are known as standard Shapley values. For each feature, the explainer estimates its average marginal contribution to the prediction across all possible orderings of feature inclusion.
explainer = shapiq.TreeExplainer(model=model, max_order=1, index="SV")
si_order = explainer.explain(x=x_explain)
si_order
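The "average marginal contribution over all orderings" definition can be spelled out directly. Here is a brute-force sketch for a three-feature toy value function (the feature names and numbers are illustrative, not taken from the bike-sharing model):

```python
from itertools import permutations

features = ["temp", "humidity", "year"]

def v(coalition):
    """Toy value function: the model's output shift when only `coalition` is known."""
    effect = {"temp": -30.0, "humidity": 12.0, "year": -45.0}
    return sum(effect[f] for f in coalition)

# Shapley value: average each feature's marginal contribution v(S + f) - v(S)
# over every possible ordering of feature inclusion.
shapley = {f: 0.0 for f in features}
perms = list(permutations(features))
for order in perms:
    seen = []
    for f in order:
        shapley[f] += (v(seen + [f]) - v(seen)) / len(perms)
        seen.append(f)

print(shapley)  # additive game, so each value equals the feature's own effect
```

Because this toy game is purely additive, each Shapley value recovers the feature's standalone effect exactly, and the values sum to v(all features); real models need the sampling-based estimates shapiq provides, since the number of orderings grows factorially.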
Plotting a Waterfall chart
A waterfall chart visually breaks down a model's prediction into individual feature contributions. It starts from the baseline prediction and adds or subtracts each feature's Shapley value to arrive at the final predicted output.
In our case, we use the output of TreeExplainer with max_order=1 (i.e., individual contributions only) to visualize the contribution of each feature.
si_order.plot_waterfall(feature_names=feature_names, show=True)
In our case, the baseline value (i.e., the model's expected output without any feature information) is 190.717.
As we add the contributions from individual features (order-1 Shapley values), we can observe how each pushes the prediction up or pulls it down:
- Features like Weather and Humidity make a positive contribution, raising the prediction above the baseline.
- Features like Temperature and Year have a strong negative impact, pulling the prediction down by −35.4 and −45, respectively.
Overall, the waterfall chart helps us understand which features drive the prediction, and in which direction, providing useful insight into the model's decision-making.
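The chart's mechanics are easy to reproduce by hand: start at the baseline and accumulate each Shapley value. In the sketch below, the baseline (190.717) and the two negative contributions (−35.4, −45.0) come from the discussion above, while the positive values for Weather and Humidity are hypothetical placeholders:

```python
# Waterfall mechanics: baseline plus each feature's Shapley value
# yields the final prediction.
baseline = 190.717
contributions = [
    ("Weather", 30.0),       # hypothetical positive effect
    ("Humidity", 15.0),      # hypothetical positive effect
    ("Temperature", -35.4),  # from the text above
    ("Year", -45.0),         # from the text above
]

running = baseline
for name, value in contributions:
    running += value
    print(f"{name:>12}: {value:+7.1f} -> running total {running:.3f}")

print(f"Final prediction: {running:.3f}")
```

With the real order-1 values from si_order, the final running total matches the model's prediction for the instance, which is exactly the efficiency property that makes the waterfall decomposition read cleanly.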

I'm a Civil Engineering graduate (2022) from Jamia Millia Islamia, New Delhi, with a keen interest in Data Science, especially neural networks and their applications across diverse areas.


