Implementation:Scikit learn Scikit learn SetOutput

From Leeroopedia


Knowledge Sources
Domains: Machine Learning, Output Configuration
Last Updated: 2026-02-08 15:00 GMT

Overview

Concrete utility module, provided by scikit-learn, for configuring the container type that transformers return.

Description

The _set_output module implements the set_output API, which lets scikit-learn transformers return pandas or Polars DataFrames instead of NumPy arrays. It provides _SetOutputMixin, the ContainerAdapterProtocol with its PandasAdapter and PolarsAdapter implementations, and helper functions that wrap transformer output together with its metadata (column names and, for pandas, the index).

Usage

Call set_output on any scikit-learn transformer that supports it to choose its output container. Pass transform="pandas" or transform="polars" to get DataFrame output from transform and fit_transform; pass transform="default" to restore the transformer's default output (typically a NumPy array).

Code Reference

Source Location

Signature

class ContainerAdapterProtocol(Protocol):
    container_lib: str
    def create_container(self, X_output, X_original, columns, inplace=False):
        ...
    def is_supported_container(self, X):
        ...

class PandasAdapter:
    ...

class PolarsAdapter:
    ...

class _SetOutputMixin:
    def set_output(self, *, transform=None):
        ...

def _safe_set_output(estimator, *, transform=None):
    ...

Import

from sklearn.utils._set_output import _SetOutputMixin, _safe_set_output

I/O Contract

Inputs

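| Name | Type | Required | Description |
|---|---|---|---|
| transform | str or None | No | Output container type: "pandas", "polars", or "default" for the transformer's default output (typically ndarray). None leaves the current configuration unchanged. |
| estimator | estimator instance | Yes | Estimator to configure output for (first argument of _safe_set_output) |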

Outputs

| Name | Type | Description |
|---|---|---|
| self | estimator | The estimator with output configured |
| X_output | DataFrame or ndarray | Transformed data in the configured output format |

Usage Examples

Basic Usage

from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Load iris as a pandas DataFrame so the input carries feature names
X, y = load_iris(return_X_y=True, as_frame=True)

# Configure the scaler to return a DataFrame instead of an ndarray
scaler = StandardScaler().set_output(transform="pandas")
X_scaled = scaler.fit_transform(X)

print(type(X_scaled))  # <class 'pandas.core.frame.DataFrame'>
print(X_scaled.columns.tolist())  # Original feature names preserved
