Implementation:Fastai Fastbook Feature Importances

Knowledge Sources	fastbook, sklearn feature importance, sklearn partial dependence, treeinterpreter
Domains	Model Interpretation, Feature Selection
Last Updated	2026-02-09 17:00 GMT

Overview

Concrete tools for analyzing feature importance, partial dependence, and per-row prediction decomposition, provided by scikit-learn, treeinterpreter, and pandas.

Description

This implementation covers three complementary model interpretation techniques used in the fastbook Tabular Modeling chapter:

model.feature_importances_: A scikit-learn attribute on fitted tree ensemble models that provides a normalized array of importance scores (summing to 1.0) based on mean decrease in impurity across all trees.
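plot_partial_dependence: A scikit-learn function that computes and draws partial dependence plots, showing how the model's average prediction changes as a single feature is set to each value on a grid while all other features keep their observed values. It was deprecated in scikit-learn 1.0 and removed in 1.2; PartialDependenceDisplay.from_estimator is the replacement in current releases.
treeinterpreter.predict: A function from the third-party treeinterpreter package (imported with `from treeinterpreter import treeinterpreter`) that decomposes each prediction into a bias term (the mean target at the root of the trees) plus per-feature contributions, enabling row-level explanation of model decisions.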
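
The averaging behind a partial dependence curve can be sketched in plain Python. This is a minimal illustration of the idea, not scikit-learn's implementation: `toy_model`, `rows`, and `partial_dependence` are hypothetical names, and the model is a hand-written function standing in for a fitted estimator.

```python
from statistics import mean

def toy_model(row):
    """Hypothetical stand-in for a fitted model (not a real estimator)."""
    return 2.0 * (row["year"] - 2000) - 3.0 * row["size"]

def partial_dependence(model, rows, feature, grid):
    """For each grid value, overwrite `feature` in every row and average the predictions."""
    curve = []
    for value in grid:
        # Every other feature keeps the value observed in that row.
        preds = [model({**row, feature: value}) for row in rows]
        curve.append(mean(preds))
    return curve

# Three made-up observations; only "year" is swept over the grid.
rows = [
    {"year": 2001, "size": 1},
    {"year": 2005, "size": 2},
    {"year": 2009, "size": 3},
]
grid = [2000, 2005, 2010]
pd_year = partial_dependence(toy_model, rows, "year", grid)
print(pd_year)
```

Each point on the curve is an average over all rows, so the plot describes the model's behavior on the data it is given rather than the behavior for any single row.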

Usage

Use these tools after training a RandomForestRegressor to understand model behavior. Start with global feature importance to identify the most influential features and remove low-importance ones. Then use partial dependence plots to understand the shape of each important feature's relationship with the target. Finally, use treeinterpreter for case-by-case explanation of individual predictions in production.

Code Reference

Source Location

Repository: fastbook
File: translations/cn/09_tabular.md (Lines 753-1070)
Note: These are external tools (scikit-learn, treeinterpreter) demonstrated in the fastbook chapter.

Signature

# 1. Global feature importance (attribute on fitted model)
model.feature_importances_  # numpy.ndarray of shape (n_features,)

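# 2. Partial dependence plots
# scikit-learn < 1.2 (the function used in fastbook; removed in 1.2):
from sklearn.inspection import plot_partial_dependence
plot_partial_dependence(estimator, X, features, grid_resolution=100, ax=None)
# scikit-learn >= 1.0 replacement:
from sklearn.inspection import PartialDependenceDisplay
PartialDependenceDisplay.from_estimator(estimator, X, features, grid_resolution=100, ax=None)

# 3. Tree interpretation (per-row decomposition)
from treeinterpreter import treeinterpreter
prediction, bias, contributions = treeinterpreter.predict(model, X)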

Import

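from sklearn.inspection import PartialDependenceDisplay  # scikit-learn >= 1.0
# from sklearn.inspection import plot_partial_dependence  # scikit-learn < 1.2 only
from treeinterpreter import treeinterpreter
import pandas as pd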

I/O Contract

Inputs

Name	Type	Required	Description
model	RandomForestRegressor (fitted)	Yes	A trained random forest model from which to extract importances and predictions.
X (for partial dependence)	pandas.DataFrame or numpy.ndarray	Yes	Feature matrix (typically the validation set features) over which to compute partial dependence.
features (for partial dependence)	list of str or list of int	Yes	Column names or indices of features to plot.
grid_resolution	int	No	Number of grid points per feature for partial dependence computation. The scikit-learn default is 100; the examples on this page pass 20.
X (for treeinterpreter)	numpy.ndarray	Yes	Feature matrix for the rows to decompose. Use `df.values` to convert from DataFrame.

Outputs

Name	Type	Description
feature_importances_	numpy.ndarray (n_features,)	Normalized importance scores for each feature, summing to 1.0. Higher values indicate features used for more impactful splits.
Partial dependence plot	PartialDependenceDisplay (drawn on a matplotlib figure)	Line plot showing how the average prediction varies as the specified feature is swept over a grid, averaged over the observed values of all other features.
prediction (treeinterpreter)	numpy.ndarray (n_rows, 1)	The model's prediction for each input row (the same values as `model.predict(X)`, returned as a column).
bias (treeinterpreter)	numpy.ndarray (n_rows,)	The mean target value at the root of the trees, averaged over the forest (close to the global mean of the training target). It is the prediction before any feature-based adjustments and is the same for every row.
contributions (treeinterpreter)	numpy.ndarray (n_rows, n_features)	Per-feature contribution for each row. `bias + contributions.sum(axis=1)` equals `prediction[:, 0]` up to floating-point error.
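
The identity between bias, contributions, and prediction can be illustrated with a single hand-built tree. This is a sketch of the idea only, not treeinterpreter's code: the `tree` dictionary, its node values, and the `decompose` helper are all hypothetical.

```python
# A tiny hand-built regression tree (made-up values). Each node stores the mean
# target of the training rows that reached it; internal nodes also store a split.
tree = {
    "value": 10.0, "feature": "year", "threshold": 2005,
    "left": {"value": 8.0},                        # year <= 2005
    "right": {
        "value": 12.0, "feature": "size", "threshold": 2,
        "left": {"value": 13.0},                   # size <= 2
        "right": {"value": 11.0},                  # size > 2
    },
}

def decompose(node, row):
    """Return (prediction, bias, contributions) for one row."""
    bias = node["value"]  # root mean: the starting point for every row
    contributions = {}
    while "feature" in node:
        feat = node["feature"]
        child = node["left"] if row[feat] <= node["threshold"] else node["right"]
        # Credit the change in node mean to the feature used for this split.
        contributions[feat] = contributions.get(feat, 0.0) + child["value"] - node["value"]
        node = child
    return node["value"], bias, contributions

prediction, bias, contributions = decompose(tree, {"year": 2010, "size": 3})
print(prediction, bias, contributions)
print(bias + sum(contributions.values()))
```

For a forest, the same bookkeeping would be done per tree and the three quantities averaged across trees, which is why the identity still holds for the ensemble.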

Usage Examples

Basic Usage

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.base import clone
from sklearn.inspection import PartialDependenceDisplay  # scikit-learn >= 1.0
from treeinterpreter import treeinterpreter

# Assume 'm' is a fitted RandomForestRegressor, 'xs' / 'valid_xs' are the
# training / validation feature DataFrames, and 'y' is the training target.

# --- 1. Global Feature Importance ---
def rf_feat_importance(m, df):
    """Pair each column with its importance, most important first."""
    return pd.DataFrame({'cols': df.columns, 'imp': m.feature_importances_}
                        ).sort_values('imp', ascending=False)

fi = rf_feat_importance(m, xs)
print(fi[:10])

# Plot top 30 features
fi[:30].plot(x='cols', y='imp', kind='barh', figsize=(12, 7), legend=False)
plt.title('Feature Importances')
plt.show()

# --- 2. Remove Low-Importance Features ---
to_keep = fi[fi.imp > 0.005].cols
xs_imp = xs[to_keep]
valid_xs_imp = valid_xs[to_keep]
# Refit on the reduced columns (same hyperparameters) so the model matches the
# data used below, then verify that validation RMSE is maintained.
m = clone(m).fit(xs_imp, y)

# --- 3. Partial Dependence Plots ---
fig, ax = plt.subplots(figsize=(12, 4))
# On scikit-learn < 1.0, call plot_partial_dependence with the same arguments.
PartialDependenceDisplay.from_estimator(m, valid_xs_imp, ['YearMade', 'ProductSize'],
                                        grid_resolution=20, ax=ax)
plt.show()

# --- 4. Tree Interpretation (Per-Row) ---
row = valid_xs_imp.iloc[:5]
prediction, bias, contributions = treeinterpreter.predict(m, row.values)

# For the first row, bias + contributions should reproduce the prediction:
print(f"Prediction: {prediction[0]}")
print(f"Bias:       {bias[0]}")
print(f"Sum:        {bias[0] + contributions[0].sum()}")
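
The importance-then-prune step can be tried without the original dataset. The sketch below uses synthetic data as a stand-in: the column names (`signal_a`, `noise_a`, and so on), the hyperparameters, and the 0.05 threshold are assumptions chosen for this toy example, not values from the fastbook chapter.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in for the training data: two informative columns, two pure-noise columns.
xs = pd.DataFrame({
    "signal_a": rng.normal(size=n),
    "signal_b": rng.normal(size=n),
    "noise_a": rng.normal(size=n),
    "noise_b": rng.normal(size=n),
})
y = 3.0 * xs["signal_a"] + 2.0 * xs["signal_b"] + rng.normal(scale=0.1, size=n)

m = RandomForestRegressor(n_estimators=40, min_samples_leaf=5, random_state=0)
m.fit(xs, y)

# Same layout as rf_feat_importance: one row per column, most important first.
fi = pd.DataFrame({"cols": xs.columns, "imp": m.feature_importances_})
fi = fi.sort_values("imp", ascending=False)
print(fi)
print("sum of importances:", fi["imp"].sum())

# Keep only columns above the (toy) threshold and refit on the reduced frame.
to_keep = fi[fi.imp > 0.05].cols
xs_imp = xs[to_keep]
m_imp = RandomForestRegressor(n_estimators=40, min_samples_leaf=5, random_state=0)
m_imp.fit(xs_imp, y)
print("kept columns:", list(to_keep))
```

On real data the threshold is a judgment call, so the refit model should be checked against the validation set before the pruned columns are dropped for good.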

Redundancy Analysis

from sklearn.ensemble import RandomForestRegressor

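# Quick OOB score function for comparing feature subsets.
# Relies on the training target 'y' from the enclosing scope; max_samples=50000
# requires the training set to have at least that many rows.
def get_oob(df):
    m = RandomForestRegressor(n_estimators=40, min_samples_leaf=15,
        max_samples=50000, max_features=0.5, n_jobs=-1, oob_score=True)
    m.fit(df, y)
    return m.oob_score_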

# Baseline
print(f"Baseline OOB: {get_oob(xs_imp)}")

# Test removing potentially redundant columns one at a time
# (includes every column that appears in to_drop below)
for c in ('saleYear', 'saleElapsed', 'ProductGroupDesc', 'ProductGroup',
          'fiModelDesc', 'fiBaseModel', 'Grouser_Tracks'):
    print(f"Drop {c}: OOB = {get_oob(xs_imp.drop(c, axis=1))}")

# Drop multiple redundant columns
to_drop = ['saleYear', 'ProductGroupDesc', 'fiBaseModel', 'Grouser_Tracks']
xs_final = xs_imp.drop(to_drop, axis=1)
print(f"Final OOB: {get_oob(xs_final)}")
