Implementation:Huggingface Datasets Logging Utils
| Knowledge Sources | |
|---|---|
| Domains | Logging, Configuration |
| Last Updated | 2026-02-14 18:00 GMT |
Overview
Library-wide logging configuration with verbosity controls and environment variable support.
Description
This module provides a centralized logging system for the HuggingFace Datasets library. It configures a library root logger (named "datasets") with a StreamHandler at module load time. The default verbosity level is WARNING, but it can be overridden via the DATASETS_VERBOSITY environment variable (valid values: "debug", "info", "warning", "error", "critical").
The module exposes several public functions:
- get_logger: Returns a named logger within the datasets hierarchy. This is the primary function used by dataset builders and internal modules to obtain a logger.
- get_verbosity / set_verbosity: Get or set the root logger's effective level using standard logging level constants.
- set_verbosity_info, set_verbosity_warning, set_verbosity_debug, set_verbosity_error: Convenience shortcuts for common verbosity levels.
- disable_propagation / enable_propagation: Control whether log messages propagate to parent loggers. Propagation is disabled by default.
The module also re-exports standard logging level constants (DEBUG, INFO, WARNING, ERROR, CRITICAL, etc.) and progress bar utilities (tqdm, enable_progress_bar, disable_progress_bar, is_progress_bar_enabled) for backward compatibility.
Usage
Use get_logger(__name__) in any module within the datasets library to obtain a properly configured logger. Use the set_verbosity_* functions or the DATASETS_VERBOSITY environment variable to control how much logging output is displayed.
Code Reference
Source Location
- Repository: datasets
- File: src/datasets/utils/logging.py
- Lines: 1-175
Signature
def get_logger(name: Optional[str] = None) -> logging.Logger:
    """Return a logger with the specified name. This function can be used in dataset builders."""
def get_verbosity() -> int:
    """Return the current level for the HuggingFace datasets library's root logger."""
def set_verbosity(verbosity: int) -> None:
    """Set the level for the Hugging Face Datasets library's root logger."""
def set_verbosity_info():
    """Set the level for the root logger to INFO."""
def set_verbosity_warning():
    """Set the level for the root logger to WARNING."""
def set_verbosity_debug():
    """Set the level for the root logger to DEBUG."""
def set_verbosity_error():
    """Set the level for the root logger to ERROR."""
def disable_propagation() -> None:
    """Disable propagation of the library log outputs."""
def enable_propagation() -> None:
    """Enable propagation of the library log outputs."""
Import
from datasets.utils.logging import get_logger, set_verbosity_info
I/O Contract
get_logger
| Name | Type | Required | Description |
|---|---|---|---|
| name | Optional[str] | No | Logger name. If None, returns the library root logger ("datasets"). Typically set to __name__ for module-level loggers. |
Returns: logging.Logger -- A logger in the datasets hierarchy.
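The reason a child logger follows the library-wide verbosity can be illustrated with the standard logging module alone. The sketch below is not the library's code: "demo_lib" is a hypothetical stand-in for the "datasets" root logger, and "demo_lib.builder" stands in for what get_logger(__name__) would return inside the package.

```python
import logging

# "demo_lib" is a hypothetical stand-in for the library root logger name.
root = logging.getLogger("demo_lib")
root.setLevel(logging.WARNING)

# A child logger, as a module inside the package would obtain it.
child = logging.getLogger("demo_lib.builder")

# The child has no level of its own (NOTSET), so it inherits from its parent.
inherited = child.getEffectiveLevel()

# Changing the level on the library root changes what the child emits.
root.setLevel(logging.DEBUG)
debug_enabled = child.isEnabledFor(logging.DEBUG)

print(inherited == logging.WARNING, debug_enabled)
```

Because the level is resolved by walking up the dotted name, a single call on the library root is enough to change the verbosity of every module-level logger beneath it.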
get_verbosity
Returns: int -- The effective logging level of the library root logger (e.g., logging.WARNING).
set_verbosity
| Name | Type | Required | Description |
|---|---|---|---|
| verbosity | int | Yes | Logging level constant (e.g., logging.DEBUG, logging.INFO). |
Environment Variable
| Variable | Valid Values | Description |
|---|---|---|
| DATASETS_VERBOSITY | "debug", "info", "warning", "error", "critical" | Overrides the default logging level at module initialization time. If set to an invalid value, a warning is logged and the default (WARNING) is used. |
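The lookup described above can be sketched with the standard library only. This is an illustration of the documented behavior, not the module's actual source: the names LOG_LEVELS, DEFAULT_LEVEL, and resolve_default_level are hypothetical, and the optional env parameter exists only to make the sketch easy to exercise.

```python
import logging
import os

# Hypothetical mapping that mirrors the valid values listed above.
LOG_LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
    "critical": logging.CRITICAL,
}
DEFAULT_LEVEL = logging.WARNING


def resolve_default_level(env=None) -> int:
    """Return the level named by DATASETS_VERBOSITY, or WARNING as a fallback."""
    env = os.environ if env is None else env
    value = env.get("DATASETS_VERBOSITY")
    if value:
        if value in LOG_LEVELS:
            return LOG_LEVELS[value]
        # Invalid value: warn and fall through to the default.
        logging.getLogger().warning(
            "Unknown option DATASETS_VERBOSITY=%s, has to be one of: %s",
            value,
            ", ".join(LOG_LEVELS),
        )
    return DEFAULT_LEVEL


print(resolve_default_level({"DATASETS_VERBOSITY": "info"}) == logging.INFO)
```

Since the variable is read at module initialization time, it only takes effect if it is set before the library is first imported.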
Usage Examples
Basic Logger Usage in a Dataset Builder
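from datasets.utils.logging import get_logger

logger = get_logger(__name__)
some_variable = {"split": "train"}  # placeholder value for illustration

# INFO and DEBUG messages are not shown at the default WARNING verbosity
logger.info("Processing dataset...")
logger.warning("Missing optional field 'description'")
logger.debug("Detailed debug info: %s", some_variable)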
Setting Verbosity Programmatically
import datasets
# Show all info-level messages
datasets.logging.set_verbosity_info()
# Or use the generic setter with a level constant
datasets.logging.set_verbosity(datasets.logging.DEBUG)
# Check current level
level = datasets.logging.get_verbosity()
print(level) # 10 (DEBUG)
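A getter and setter pair like this lends itself to temporarily raising verbosity and then restoring it. The following is a minimal sketch of that pattern using a plain logging.Logger. The helper temporary_level and the logger name "verbosity_demo" are hypothetical and not part of the library. It is an assumption that the same save-and-restore approach would carry over to get_verbosity and set_verbosity.

```python
import logging
from contextlib import contextmanager


@contextmanager
def temporary_level(logger: logging.Logger, level: int):
    """Set `logger` to `level` inside the block, then restore the previous level."""
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore even if the block raises.
        logger.setLevel(previous)


demo = logging.getLogger("verbosity_demo")  # hypothetical logger name
demo.setLevel(logging.WARNING)

with temporary_level(demo, logging.DEBUG):
    inside = demo.getEffectiveLevel()
after = demo.getEffectiveLevel()

print(inside, after)  # 10 30
```

Restoring in a finally clause keeps a failed operation from leaving the process at a noisier level than intended.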
Using the Environment Variable
# Set before importing datasets
import os
os.environ["DATASETS_VERBOSITY"] = "info"
import datasets
# Now all INFO and above messages will be displayed
ds = datasets.load_dataset("cornell-movie-review-data/rotten_tomatoes")
Controlling Log Propagation
import datasets
# Enable propagation if you have a custom root logger
datasets.logging.enable_propagation()
# Disable propagation to prevent double logging
datasets.logging.disable_propagation()
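The double logging mentioned above comes from a record being handled once by the library's own handler and again by a handler on a parent logger. The sketch below reproduces that effect with the standard logging module only. The logger names "app_demo" and "app_demo.lib" are hypothetical stand-ins for an application logger and the library root logger.

```python
import io
import logging

parent_stream, lib_stream = io.StringIO(), io.StringIO()

# Hypothetical parent logger with its own handler.
parent = logging.getLogger("app_demo")
parent.propagate = False  # keep demo output away from the real root logger
parent.addHandler(logging.StreamHandler(parent_stream))

# Hypothetical library logger that also has its own handler.
lib = logging.getLogger("app_demo.lib")
lib.setLevel(logging.WARNING)
lib.addHandler(logging.StreamHandler(lib_stream))

lib.propagate = False
lib.warning("first")   # handled only by the library's own handler

lib.propagate = True
lib.warning("second")  # handled by both handlers, so it appears twice overall

lib_lines = lib_stream.getvalue().splitlines()
parent_lines = parent_stream.getvalue().splitlines()
print(lib_lines, parent_lines)  # ['first', 'second'] ['second']
```

With propagation enabled, an application that wants a single copy of each message would typically rely on its own handler and formatting for the propagated records.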