Implementation:Scikit learn Scikit learn Encode

From Leeroopedia


Knowledge Sources
Domains: Machine Learning, Data Encoding
Last Updated: 2026-02-08 15:00 GMT

Overview

A concrete utility module, provided by scikit-learn, for encoding categorical values and finding unique elements.

Description

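The _encode module (sklearn.utils._encode) provides helper functions for finding unique values in arrays, including object-dtype arrays of Python values and arrays that contain NaN. It includes _unique, _unique_np, and _unique_python. _unique dispatches on dtype: object-dtype arrays go to _unique_python and all other arrays go to _unique_np. Missing values are handled explicitly, so repeated NaN entries collapse to a single entry placed after the sorted regular values. The functions work with both numpy arrays and Array API-compatible backends.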
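The NaN handling described above can be illustrated with a minimal NumPy-only sketch. This is not scikit-learn's implementation: the helper name unique_collapse_nan is hypothetical, and the sketch assumes numeric input, where np.unique sorts NaN to the end of its result.

```python
import numpy as np


def unique_collapse_nan(values):
    """Hypothetical helper: sorted unique values with at most one trailing NaN."""
    uniques = np.unique(values)
    # NaN sorts last. Older NumPy releases may keep several NaN entries,
    # so cut the result right after the first one.
    if uniques.size and np.isnan(uniques[-1]):
        first_nan = np.searchsorted(uniques, np.nan)
        uniques = uniques[: first_nan + 1]
    return uniques


result = unique_collapse_nan(np.array([np.nan, 2.0, 1.0, np.nan, 2.0]))
print(result)
```

The truncation step is a no-op on NumPy versions that already collapse NaN inside np.unique, so the sketch should behave the same either way.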

Usage

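Use these utilities when you need to find unique values in label or feature arrays, particularly when NaN values or object dtypes are present, such as during label encoding or ordinal encoding operations. The leading underscore marks the module as private API, so its functions and signatures may change between scikit-learn releases.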
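For object-dtype data, the general idea of separating missing values from sortable values can be sketched in plain Python. This is an illustrative stand-in rather than the library's code: the function name unique_python_sketch is hypothetical, and placing None before NaN at the end is an assumption of this sketch.

```python
import math


def _is_nan(value):
    # Only floats can be NaN; strings and None are never NaN.
    return isinstance(value, float) and math.isnan(value)


def unique_python_sketch(values):
    """Hypothetical helper: sorted uniques, then None, then NaN (if present)."""
    seen = set(values)
    has_none = None in seen
    has_nan = any(_is_nan(v) for v in seen)
    # Sort only the regular values; None and NaN cannot be ordered reliably.
    uniques = sorted(v for v in seen if v is not None and not _is_nan(v))
    if has_none:
        uniques.append(None)
    if has_nan:
        uniques.append(float("nan"))
    return uniques


labels = ["b", None, "a", float("nan"), "b", float("nan")]
result = unique_python_sketch(labels)
print(result)
```

Sorting raises TypeError if the regular values mix incomparable types such as str and int, so the sketch assumes labels of a single type.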

Code Reference

Source Location

sklearn/utils/_encode.py

Signature

def _unique(values, *, return_inverse=False, return_counts=False):
    ...

def _unique_np(values, return_inverse=False, return_counts=False):
    ...

def _unique_python(values, return_inverse=False, return_counts=False):
    ...

Import

from sklearn.utils._encode import _unique

I/O Contract

Inputs

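| Name | Type | Required | Description |
|------|------|----------|-------------|
| values | ndarray | Yes | Values to check for unique elements |
| return_inverse | bool | No (default False) | If True, also return, for each element of values, its index in the unique array |
| return_counts | bool | No (default False) | If True, also return the number of times each unique item appears in values |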

Outputs

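| Name | Type | Description |
|------|------|-------------|
| unique | ndarray | The sorted unique values, with missing values placed last |
| unique_inverse | ndarray | Indices to reconstruct the original array from unique (only if return_inverse=True) |
| unique_counts | ndarray | Number of times each unique value appears (only if return_counts=True) |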
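The relationship between the three outputs can be checked with a short sketch. It uses the public np.unique as a stand-in, on the assumption that for plain integer input without missing values it follows the same output contract as the outputs listed above.

```python
import numpy as np

values = np.array([3, 1, 2, 3, 1, 2])

# np.unique is used here as a public stand-in for the private helper.
uniques, inverse, counts = np.unique(
    values, return_inverse=True, return_counts=True
)

# Indexing the uniques with the inverse rebuilds the original array.
reconstructed = uniques[inverse]

# The counts line up with the uniques and sum to the input length.
total = int(counts.sum())
```

The inverse array is what an encoder would emit as integer codes, since each entry is the position of the original value in the sorted uniques.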

Usage Examples

Basic Usage

import numpy as np
from sklearn.utils._encode import _unique

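values = np.array([3, 1, 2, 3, 1, 2])

# With no optional flags, only the sorted unique values are returned.
uniques = _unique(values)
print(uniques)  # [1 2 3]

# inverse holds, for each element of values, its index in uniques.
uniques, inverse = _unique(values, return_inverse=True)
print(inverse)  # [2 0 1 2 0 1]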
