Implementation:Huggingface Datasets Tqdm Utils

From Leeroopedia

Overview

Tqdm Utils provides a custom tqdm progress bar wrapper with global enable/disable control for the datasets library. The module subclasses tqdm.auto.tqdm so that every bar created through it respects a global disabled state, and provides helper functions to enable or disable progress bars programmatically. When the environment variable HF_DATASETS_DISABLE_PROGRESS_BARS is set, it takes priority over conflicting programmatic calls.

This module is part of the huggingface/datasets repository.

  • Source file: src/datasets/utils/tqdm.py (135 lines)
  • Domain: UI, Utilities
  • Import: from datasets.utils.tqdm import tqdm, enable_progress_bars, disable_progress_bars

Class: tqdm

A subclass of tqdm.auto.tqdm that overrides the disable argument when progress bars are globally disabled.

class tqdm(old_tqdm):
    """
    Class to override `disable` argument in case progress bars are globally disabled.

    Taken from https://github.com/tqdm/tqdm/issues/619#issuecomment-619639324.
    """

    def __init__(self, *args, **kwargs):
        if are_progress_bars_disabled():
            kwargs["disable"] = True
        elif kwargs.get("disable") is None and os.getenv("TQDM_POSITION") == "-1":
            # Force-enable progress bars in cloud environments when disable=None
            kwargs["disable"] = False
        super().__init__(*args, **kwargs)

    def __delattr__(self, attr: str) -> None:
        """Fix for https://github.com/huggingface/datasets/issues/6066"""
        try:
            super().__delattr__(attr)
        except AttributeError:
            if attr != "_lock":
                raise

Behavior:

  • If progress bars are globally disabled, the disable keyword argument is forced to True, regardless of what the caller passes.
  • If disable is None and the environment variable TQDM_POSITION is set to "-1", progress bars are force-enabled to support cloud environments.
  • The __delattr__ override silently ignores AttributeError for the _lock attribute, fixing a known issue (#6066).
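The branch order in __init__ can be isolated as a pure function. The sketch below is illustrative only: resolve_disable and its parameters are hypothetical names that do not exist in the module, and a missing disable keyword is modelled as None. It shows only which value ends up being passed on to the base class.

```python
from typing import Optional


def resolve_disable(
    requested: Optional[bool],
    globally_disabled: bool,
    tqdm_position: Optional[str],
) -> Optional[bool]:
    """Hypothetical helper mirroring the branch order in tqdm.__init__."""
    if globally_disabled:
        # The global state wins over whatever the caller passed.
        return True
    if requested is None and tqdm_position == "-1":
        # disable=None together with TQDM_POSITION=-1: force the bar on.
        return False
    # Otherwise the caller's value is passed through untouched.
    return requested


print(resolve_disable(False, True, None))   # True
print(resolve_disable(None, False, "-1"))   # False
print(resolve_disable(True, False, "-1"))   # True
```

Note that the global check comes first, so an explicit disable=False from the caller cannot re-enable a bar while progress bars are globally disabled.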

Functions

disable_progress_bars

Disables progress bars globally for the datasets library. If the environment variable HF_DATASETS_DISABLE_PROGRESS_BARS is explicitly set to 0 (False), a warning is emitted and the function has no effect.

def disable_progress_bars() -> None:
    if HF_DATASETS_DISABLE_PROGRESS_BARS is False:
        warnings.warn(
            "Cannot disable progress bars: environment variable "
            "`HF_DATASETS_DISABLE_PROGRESS_BAR=0` is set and has priority."
        )
        return
    global _hf_datasets_progress_bars_disabled
    _hf_datasets_progress_bars_disabled = True

enable_progress_bars

Enables progress bars globally for the datasets library. If the environment variable HF_DATASETS_DISABLE_PROGRESS_BARS is explicitly set to 1 (True), a warning is emitted and the function has no effect.

def enable_progress_bars() -> None:
    if HF_DATASETS_DISABLE_PROGRESS_BARS is True:
        warnings.warn(
            "Cannot enable progress bars: environment variable "
            "`HF_DATASETS_DISABLE_PROGRESS_BAR=1` is set and has priority."
        )
        return
    global _hf_datasets_progress_bars_disabled
    _hf_datasets_progress_bars_disabled = False

are_progress_bars_disabled

Returns whether progress bars are globally disabled. Returns True if disabled, False otherwise.

def are_progress_bars_disabled() -> bool:
    global _hf_datasets_progress_bars_disabled
    return _hf_datasets_progress_bars_disabled

is_progress_bar_enabled

Returns whether progress bars are globally enabled. This is the logical inverse of are_progress_bars_disabled().

def is_progress_bar_enabled():
    return not are_progress_bars_disabled()

Backward Compatibility Aliases

The module provides backward-compatible singular aliases:

enable_progress_bar = enable_progress_bars
disable_progress_bar = disable_progress_bars

Global State

The module uses a module-level boolean _hf_datasets_progress_bars_disabled to track the global state. It is initialized from the HF_DATASETS_DISABLE_PROGRESS_BARS config value, defaulting to False (progress bars enabled).
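The config value is effectively tri-state (None, True, or False), which is what lets the module tell "not set" apart from "explicitly set to 0". A minimal sketch of how such a value could be derived and used to seed the flag follows. It rests on assumptions: parse_tristate, initial_disabled, and TRUE_VALUES are hypothetical names, and the exact strings that datasets.config treats as true are not stated on this page.

```python
from typing import Mapping, Optional

# Assumed set of truthy spellings; the real config may accept others.
TRUE_VALUES = {"1", "ON", "YES", "TRUE"}


def parse_tristate(env: Mapping[str, str], name: str) -> Optional[bool]:
    """Return None when the variable is unset, else True/False."""
    raw = env.get(name)
    if raw is None:
        return None
    return raw.upper() in TRUE_VALUES


def initial_disabled(config_value: Optional[bool]) -> bool:
    """Seed the global flag: unset (None) falls back to False, i.e. bars enabled."""
    return config_value or False


name = "HF_DATASETS_DISABLE_PROGRESS_BARS"
print(parse_tristate({}, name))                # None
print(parse_tristate({name: "1"}, name))       # True
print(parse_tristate({name: "0"}, name))       # False
print(initial_disabled(None))                  # False
```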

Priority rules:

  • The environment variable HF_DATASETS_DISABLE_PROGRESS_BARS has priority over programmatic control.
  • If the environment variable is set, calls to enable_progress_bars() or disable_progress_bars() that conflict with the environment variable will emit a warning and have no effect.
  • If the environment variable is not set (None), programmatic control works as expected.
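The priority rules above can be modelled with a small self-contained class. This is a sketch under stated assumptions, not the module's code: ProgressBarState is a hypothetical stand-in for the module-level flag and its two setters, and env_value plays the role of the tri-state config value.

```python
import warnings
from typing import Optional


class ProgressBarState:
    """Hypothetical stand-in for the module-level flag and its setters."""

    def __init__(self, env_value: Optional[bool]) -> None:
        self.env_value = env_value          # None means "not set"
        self.disabled = env_value or False  # default: bars enabled

    def disable(self) -> None:
        if self.env_value is False:
            # Environment explicitly asked for bars: refuse and warn.
            warnings.warn("Cannot disable progress bars: environment variable has priority.")
            return
        self.disabled = True

    def enable(self) -> None:
        if self.env_value is True:
            # Environment explicitly disabled bars: refuse and warn.
            warnings.warn("Cannot enable progress bars: environment variable has priority.")
            return
        self.disabled = False


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    pinned = ProgressBarState(env_value=True)
    pinned.enable()                          # conflicts with the environment
    print(pinned.disabled, len(caught))      # True 1

free = ProgressBarState(env_value=None)
free.disable()
print(free.disabled)                         # True
```

A call that agrees with the environment variable (for example disable() when the variable is 1) passes the check and simply sets the flag, so no warning is emitted in that case.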

Dependencies

Dependency | Type | Purpose
os | Standard library | Reading the TQDM_POSITION environment variable
warnings | Standard library | Emitting warnings when the environment variable conflicts with programmatic control
tqdm.auto | External | Base progress bar class
datasets.config | Internal | Reading the HF_DATASETS_DISABLE_PROGRESS_BARS configuration

Usage Example

from datasets.utils.tqdm import tqdm, disable_progress_bars, enable_progress_bars, are_progress_bars_disabled

# Disable progress bars globally
disable_progress_bars()
print(are_progress_bars_disabled())  # True

# Progress bar will not be shown even with disable=False
for _ in tqdm(range(5), disable=False):
    pass

# Re-enable progress bars
enable_progress_bars()
print(are_progress_bars_disabled())  # False

# Progress bar will now be shown
for _ in tqdm(range(5)):
    pass

Design Notes

  • The tqdm subclass approach ensures that all progress bars created through this module respect the global disable state, without requiring every call site to check the state manually.
  • The environment variable override provides a way for deployment environments to control progress bar behavior without code changes, which is useful in CI/CD pipelines and non-interactive contexts.
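  • The TQDM_POSITION="-1" check is a heuristic for cloud environments. With disable=None, tqdm turns the bar off when the output is not a TTY; when TQDM_POSITION is "-1", the subclass sets disable=False instead so the bar is still shown.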
  • The __delattr__ override tolerates a missing _lock attribute: deleting _lock when it is already gone would otherwise raise AttributeError (issue #6066). An AttributeError for any other attribute is still re-raised.
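The __delattr__ pattern can be tried in isolation. In the sketch below, TolerantDelete is a hypothetical class with no connection to tqdm; only the body of __delattr__ follows the module's override.

```python
class TolerantDelete:
    """Hypothetical class; only the __delattr__ pattern comes from the module."""

    def __delattr__(self, attr: str) -> None:
        try:
            super().__delattr__(attr)
        except AttributeError:
            # Swallow the error only for the one attribute known to be
            # deleted twice; everything else still fails loudly.
            if attr != "_lock":
                raise


obj = TolerantDelete()
del obj._lock        # attribute is missing, error is swallowed

obj.value = 1
del obj.value        # ordinary deletion still works

try:
    del obj.missing  # any other missing attribute still raises
except AttributeError:
    print("AttributeError for 'missing'")
```

Keeping the exception narrow (a single attribute name) avoids hiding unrelated bugs that would also surface as AttributeError.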
