Heuristic: Pola-rs Polars Avoid Lambda In Aggregation

Knowledge Sources	Polars Aggregation Guide, UDF Guide
Domains	Optimization, Python_Performance
Last Updated	2026-02-09 10:00 GMT

Overview

Avoid Python lambdas and custom functions in Polars aggregation contexts: they run under the Python GIL, which serializes work that Polars would otherwise parallelize across groups.

Description

Polars parallelizes aggregation computations across groups by executing them in separate threads. However, Python's Global Interpreter Lock (GIL) prevents multiple threads from executing Python bytecode simultaneously. When a Python lambda or custom function is used in an aggregation (e.g., via `map_elements`), Polars must acquire the GIL for each group evaluation, serializing what would otherwise be parallel computation. The Polars expression API provides native Rust implementations for most operations, which execute outside the GIL and can be fully parallelized.

Usage

Apply this heuristic whenever you are tempted to use a Python `lambda`, `map_elements`, or `map_batches` with a custom Python function inside a `group_by().agg()` context. Instead, express the computation using the Polars expression API. This is Python-specific and does not apply to Rust, where closures can be executed concurrently.

The Insight (Rule of Thumb)

Action: Replace Python lambdas and custom functions with equivalent Polars expression API calls. Use `pl.when().then().otherwise()` instead of conditional lambdas, `pl.col().filter()` instead of filter lambdas, and built-in expression methods (`sum`, `mean`, `first`, `last`, `count`, `sort_by`) instead of custom reducers or hand-written ordering logic.
Value: Full multi-threaded parallelism across groups. Performance improvement scales with the number of CPU cores and the number of groups.
Trade-off: The Polars expression API may not support every possible computation. When a custom function is unavoidable, accept the GIL cost or consider implementing the logic as a Polars plugin in Rust.

Reasoning

The Polars documentation explicitly warns: "Python is generally slower than Rust. Besides the overhead of running 'slow' bytecode, Python has to remain within the constraints of the Global Interpreter Lock (GIL). This means that if you were to use a lambda or a custom Python function to apply during a parallelized phase, Polars' speed is capped running Python code, preventing any multiple threads from executing the function." Polars tries to parallelize aggregating functions over groups, so staying within the expression API is critical for performance.

Helper Python functions that build and return Polars expressions, rather than executing Python logic on the data, are fine: they run once at plan-build time, not at execution time.
