Environment: Scikit-learn OpenMP Thread Configuration

Knowledge Sources	scikit-learn Parallelism docs
Domains	Infrastructure, Parallelism
Last Updated	2026-02-08 15:00 GMT

Overview

This environment covers OpenMP and BLAS thread configuration for scikit-learn: how to control its parallel computation through environment variables and threadpoolctl.

Description

Scikit-learn uses multiple levels of parallelism: OpenMP for Cython-level parallelism (pairwise distances, tree building), BLAS libraries (OpenBLAS, MKL, BLIS) for linear algebra operations, and joblib for Python-level parallelism across processes or threads. These layers can interfere with each other: when every joblib worker also starts its own full-size OpenMP or BLAS thread pool, the total number of threads can exceed the available cores (thread oversubscription). This environment documents the thread control variables and workarounds used by scikit-learn to manage parallel execution safely.
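
To see which of these native layers are active in a given process, the following is a minimal sketch built on `threadpoolctl.threadpool_info()`. The helper name `describe_thread_pools` is hypothetical. The function only reports libraries that are already loaded, so NumPy is imported first on the assumption that it is installed; if either package is missing, the sketch simply returns an empty list.

```python
def describe_thread_pools():
    """Return (user_api, internal_api, num_threads) for each loaded native thread pool."""
    try:
        import numpy  # noqa: F401  (importing NumPy loads its BLAS library)
        from threadpoolctl import threadpool_info
    except ImportError:
        # Assumption for this sketch: report nothing when the packages are missing.
        return []
    return [
        (info["user_api"], info["internal_api"], info["num_threads"])
        for info in threadpool_info()
    ]


for user_api, internal_api, num_threads in describe_thread_pools():
    print(f"{user_api:8s} {internal_api:10s} threads={num_threads}")
```

Each printed row corresponds to one loaded library, so seeing both an `openmp` and a `blas` entry is a hint that the two pools may need separate limits.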

Usage

Use this environment configuration when running scikit-learn on multi-core systems, in Docker/containerized environments (where cgroups may limit visible CPUs), or when encountering performance issues from thread oversubscription. It is particularly relevant for the BaseForest_Fit (parallel tree building), Cross_Validate (parallel fold evaluation), and GridSearchCV (parallel parameter search) implementations.

System Requirements

Category	Requirement	Notes
OpenMP	Runtime library (libgomp/libomp)	Built into compiled Cython extensions
BLAS	OpenBLAS, MKL, or BLIS	One of these must be available via NumPy/SciPy
threadpoolctl	>= 3.2.0	Used internally by sklearn to manage thread pools

Dependencies

System Packages

OpenMP runtime library (`libgomp` on Linux, `libomp` on macOS)
One BLAS library: OpenBLAS, Intel MKL, or BLIS

Python Packages

`threadpoolctl` >= 3.2.0
`joblib` >= 1.3.0

Credentials

The following environment variables control thread behavior (not secrets):

`OMP_NUM_THREADS`: Number of OpenMP threads (overrides automatic CPU detection)
`MKL_NUM_THREADS`: Number of threads for Intel MKL BLAS operations
`OPENBLAS_NUM_THREADS`: Number of threads for OpenBLAS operations
`BLIS_NUM_THREADS`: Number of threads for BLIS operations
`KMP_DUPLICATE_LIB_OK`: Allow multiple OpenMP libraries (macOS workaround; sklearn defaults it to "True" on import if unset)
`KMP_INIT_AT_FORK`: Intel OpenMP fork workaround (sklearn defaults it to "FALSE" on import if unset)
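
The thread-count variables above can also be set from Python, as in the sketch below. The helper `set_thread_env` and the value 4 are hypothetical. Native libraries typically read these variables when they are first loaded, so the assumption here is that this code runs before NumPy, SciPy, or scikit-learn is imported.

```python
import os

# The four thread-count variables listed above.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "BLIS_NUM_THREADS",
)


def set_thread_env(n_threads, overwrite=False):
    """Set the thread-count variables and return their resulting values.

    Existing values are kept unless overwrite=True, so a limit exported
    in the shell still wins by default.
    """
    for name in THREAD_ENV_VARS:
        if overwrite or name not in os.environ:
            os.environ[name] = str(n_threads)
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


# Hypothetical budget of 4 threads per pool; adjust to the machine.
print(set_thread_env(4))
```

Keeping existing values by default mirrors the `os.environ.setdefault` pattern that scikit-learn itself uses for its workaround variables.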

Quick Install

# threadpoolctl is installed automatically with scikit-learn
pip install scikit-learn

# To cap native thread pools for processes started from this shell:
export OMP_NUM_THREADS=4
export OPENBLAS_NUM_THREADS=4

# Or use threadpoolctl in Python:
# from threadpoolctl import threadpool_limits
# with threadpool_limits(limits=4):
#     model.fit(X, y)
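
A runnable variant of the commented `threadpool_limits` snippet might look like the sketch below. It assumes NumPy and threadpoolctl are installed, and a plain matrix product stands in for `model.fit(X, y)`; the array size and the limit of one thread are arbitrary choices for illustration.

```python
import numpy as np
from threadpoolctl import threadpool_limits

rng = np.random.default_rng(0)
a = rng.standard_normal((200, 200))

# Limit only the BLAS pools to one thread inside the block.
# The previous limits are restored when the block exits.
with threadpool_limits(limits=1, user_api="blas"):
    product = a @ a.T

print(product.shape)
```

Unlike the exported variables, this limit applies only inside the `with` block, which makes it convenient for capping a single call.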

Code Evidence

OpenMP environment workarounds from `sklearn/__init__.py:48-60`:

# On OSX, we can get a runtime error due to multiple OpenMP libraries loaded
# simultaneously. This can happen for instance when calling BLAS inside a
# prange. Setting the following environment variable allows multiple OpenMP
# libraries to be loaded.
os.environ.setdefault("KMP_DUPLICATE_LIB_OK", "True")

# Workaround issue discovered in intel-openmp 2019.5:
# https://github.com/ContinuumIO/anaconda-issues/issues/11294
os.environ.setdefault("KMP_INIT_AT_FORK", "FALSE")

Unstable OpenBLAS detection from `sklearn/utils/fixes.py:343-373`:

def _in_unstable_openblas_configuration():
    """Return True if in an unstable configuration for OpenBLAS"""
    modules_info = _get_threadpool_controller().info()
    open_blas_used = any(info["internal_api"] == "openblas" for info in modules_info)
    if not open_blas_used:
        return False
    # OpenBLAS 0.3.16 fixed instability for arm64
    openblas_arm64_stable_version = parse_version("0.3.16")
    for info in modules_info:
        if info["internal_api"] != "openblas":
            continue
        openblas_version = info.get("version")
        openblas_architecture = info.get("architecture")
        if openblas_version is None or openblas_architecture is None:
            return True
        if (
            openblas_architecture == "neoversen1"
            and parse_version(openblas_version) < openblas_arm64_stable_version
        ):
            return True
    return False
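
To make the behaviour of this check easier to follow, here is a simplified sketch that applies the same logic to a plain list of dictionaries. The names `parse_simple_version` and `is_unstable_openblas` are hypothetical, the version parser assumes purely numeric dotted versions (unlike the `parse_version` used by scikit-learn), and the sample records are invented for illustration.

```python
def parse_simple_version(version):
    """Parse a dotted numeric version such as "0.3.15" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_unstable_openblas(modules_info, stable_version="0.3.16"):
    """Apply the check above to a list of threadpoolctl-style dicts."""
    threshold = parse_simple_version(stable_version)
    for info in modules_info:
        if info.get("internal_api") != "openblas":
            continue
        version = info.get("version")
        architecture = info.get("architecture")
        if version is None or architecture is None:
            # Unknown build: assume unstable, as the original function does.
            return True
        if architecture == "neoversen1" and parse_simple_version(version) < threshold:
            return True
    return False


# Hypothetical sample records, not captured from a real machine.
old_arm = [{"internal_api": "openblas", "version": "0.3.15", "architecture": "neoversen1"}]
new_arm = [{"internal_api": "openblas", "version": "0.3.21", "architecture": "neoversen1"}]
mkl_only = [{"internal_api": "mkl", "version": "2023.1"}]

print(is_unstable_openblas(old_arm))   # True
print(is_unstable_openblas(new_arm))   # False
print(is_unstable_openblas(mkl_only))  # False
```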

Config propagation warning from `sklearn/utils/parallel.py:29-37`:

warnings.warn(
    (
        "`sklearn.utils.parallel.Parallel` needs to be used in "
        "conjunction with `sklearn.utils.parallel.delayed` instead of "
        "`joblib.delayed` to correctly propagate the scikit-learn "
        "configuration to the joblib workers."
    ),
    UserWarning,
)

Common Errors

Error Message	Cause	Solution
`OMP: Error #15: Initializing libiomp5 ... but found libiomp5md already initialized`	Multiple OpenMP runtimes loaded on macOS	Set `KMP_DUPLICATE_LIB_OK=True` (done automatically by sklearn)
Performance degradation with `n_jobs > 1`	Thread oversubscription: each joblib worker also spawns its own OpenMP/BLAS threads	Set `OMP_NUM_THREADS=1` (or another small per-worker budget) when using `n_jobs > 1`
Hang or deadlock in parallel code	Fork-safety issue with OpenMP	Set `KMP_INIT_AT_FORK=FALSE` (done automatically by sklearn)
Numerical instability on ARM64	OpenBLAS < 0.3.16 on Neoverse N1	Upgrade OpenBLAS to >= 0.3.16

Compatibility Notes

macOS: May need `KMP_DUPLICATE_LIB_OK=True` when multiple OpenMP libraries are loaded simultaneously, for example through conflicts between system and Anaconda OpenMP libraries. Scikit-learn sets this as a default on import.
ARM64 (aarch64): OpenBLAS versions before 0.3.16 have known instabilities on Neoverse N1 architecture. Scikit-learn detects this and marks affected tests accordingly.
Docker/cgroups: When `OMP_NUM_THREADS` is not set, scikit-learn uses the minimum of `omp_get_max_threads()` and the CPU count (accounting for cgroup quotas).
pytest-xdist: When running tests in parallel with xdist, thread limits are automatically adjusted to `cpu_count // worker_count` to prevent oversubscription.
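
The two thread-count rules in the Docker/cgroups and pytest-xdist notes can be illustrated with the standard-library sketch below. All three helper names are hypothetical and this is not scikit-learn's implementation. In particular, `available_cpu_count` only honours CPU affinity and does not read cgroup quotas, and the OpenMP maximum is passed in as a plain number rather than queried from the runtime.

```python
import os


def available_cpu_count():
    """CPUs usable by this process, honouring CPU affinity where the OS exposes it."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def effective_threads(omp_max_threads, cpu_count):
    """Cap the OpenMP maximum by the CPU count, never going below one thread."""
    return max(1, min(omp_max_threads, cpu_count))


def threads_per_worker(cpu_count, worker_count):
    """Split CPUs evenly across workers, never going below one thread each."""
    return max(1, cpu_count // max(1, worker_count))


cpus = available_cpu_count()
print(effective_threads(16, cpus))   # at most 16, at most the usable CPUs
print(threads_per_worker(cpus, 4))   # per-worker budget for 4 workers
```

For example, with 8 usable CPUs and 3 workers the per-worker budget is 2 threads, so the total of 6 stays within the core count.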
