Environment:Huggingface Datasets Audio Video Dependencies

From Leeroopedia
Knowledge Sources
Domains: Audio Processing, Video Processing, Media Encoding, Media Decoding
Last Updated: 2026-02-14 19:00 GMT

Overview

Description

The Audio/Video Dependencies environment defines the optional packages required to enable audio and video encoding and decoding within the HuggingFace Datasets library. Audio and video support is not included in the base installation; users must install additional dependencies to work with these media types. The primary dependency is torchcodec, which provides both encoding and decoding capabilities for audio data, and decoding capabilities for video data. The torch (PyTorch) library is also required as a foundational dependency.

Usage

Audio and video features are activated by installing the optional [audio] extra. This unlocks the ability to:

  • Encode audio data into dataset-compatible formats via audio.py
  • Decode audio data from stored representations via audio.py and _torchcodec.py
  • Decode video data from stored representations via video.py

The library checks for the availability of torchcodec at runtime using importlib.util.find_spec("torchcodec") and stores the result in the TORCHCODEC_AVAILABLE flag defined in config.py.
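The availability check described above can be sketched as follows. The helper name is_available is hypothetical and introduced only for illustration; the find_spec call and the TORCHCODEC_AVAILABLE flag come from the description above.

```python
import importlib.util


def is_available(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it.

    Hypothetical helper for illustration; not part of the datasets API.
    """
    return importlib.util.find_spec(module_name) is not None


# Same expression as the flag described above, evaluated once at import time.
TORCHCODEC_AVAILABLE = is_available("torchcodec")

print(f"torchcodec available: {TORCHCODEC_AVAILABLE}")
```

Because find_spec only locates the module, the check stays cheap and does not load torchcodec or torch into the process.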

System Requirements

  • Python: Compatible with the Python versions supported by HuggingFace Datasets
  • Operating System: Linux, macOS, or Windows (subject to PyTorch and torchcodec platform support)
  • Hardware: A PyTorch-compatible environment; GPU is optional but may accelerate encoding/decoding operations

Dependencies

Package | Minimum Version | Purpose | Required By
torchcodec | 0.6.0 | Audio encoding/decoding, video decoding | audio.py, video.py, _torchcodec.py
torch (PyTorch) | 2.8.0 | Tensor operations, backend for torchcodec | _torchcodec.py, audio.py, video.py
numpy | (transitive) | Array operations used by torchcodec internals | _torchcodec.py

As defined in setup.py:

AUDIO_REQUIRE = ["torchcodec>=0.6.0", "torch>=2.8.0"]
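To confirm that an existing environment meets these minimums, a check along the following lines may help. This is a minimal sketch using only the standard library. The names MINIMUMS, release_tuple, and check_minimums are hypothetical, and the comparison looks only at the leading numeric release, so pre-release and local version suffixes are ignored.

```python
import re
from importlib import metadata

# Minimums copied from AUDIO_REQUIRE above.
MINIMUMS = {"torchcodec": "0.6.0", "torch": "2.8.0"}


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric release, e.g. '2.8.0+cu128' -> (2, 8, 0).

    Simplification: suffixes such as '.dev1' or '+cu128' are dropped.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(part) for part in match.group().split(".")]
    parts += [0] * (3 - len(parts))  # pad '0.6' to (0, 6, 0)
    return tuple(parts)


def check_minimums(minimums: dict) -> dict:
    """Map each package name to 'ok', 'too old (found X)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = release_tuple(installed) >= release_tuple(minimum)
        report[name] = "ok" if ok else f"too old (found {installed})"
    return report


for package, status in check_minimums(MINIMUMS).items():
    print(f"{package}: {status}")
```

For strict version ordering that accounts for pre-releases, a dedicated version parser would be the safer choice; the tuple comparison here is only meant to show the idea.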

Credentials

No credentials are required to install or use the audio/video dependencies. All packages are available from public PyPI repositories.

Quick Install

Install the audio/video extras with pip:

pip install "datasets[audio]"

Or install the dependencies directly:

pip install "torchcodec>=0.6.0" "torch>=2.8.0"

Code Evidence

Runtime availability check in config.py:

TORCHCODEC_AVAILABLE = importlib.util.find_spec("torchcodec") is not None

Audio encoding guard in audio.py:

# Raises an error when torchcodec is not installed
"To support encoding audio data, please install 'torchcodec'."

Audio decoding guard in audio.py:

# Raises an error when torchcodec is not installed
"To support decoding audio data, please install 'torchcodec'."

Video decoding guard in video.py:

# Raises an error when torchcodec is not installed
"To support decoding videos, please install 'torchcodec'."
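The three guards above share one pattern: check the availability flag, and raise with a fixed message if the backend is absent. The sketch below illustrates that pattern under stated assumptions. The functions require_torchcodec and decode_video_stub are hypothetical, and the use of ImportError as the exception type is an assumption, since the excerpts above show only the messages.

```python
import importlib.util

TORCHCODEC_AVAILABLE = importlib.util.find_spec("torchcodec") is not None


def require_torchcodec(message: str, available: bool = TORCHCODEC_AVAILABLE) -> None:
    """Raise ImportError with `message` when the optional backend is absent.

    Hypothetical helper; ImportError is an assumed exception type.
    """
    if not available:
        raise ImportError(message)


def decode_video_stub(path: str, available: bool = TORCHCODEC_AVAILABLE) -> str:
    """Stand-in for a decode function: run the guard, then do the work."""
    require_torchcodec(
        "To support decoding videos, please install 'torchcodec'.", available
    )
    return f"would decode {path}"


# Simulate a missing backend to see the guard fire.
try:
    decode_video_stub("clip.mp4", available=False)
except ImportError as err:
    print(err)
```

Placing the guard inside the function that needs the backend, rather than at module import, is what lets the rest of the library load when torchcodec is not installed.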

Direct imports in _torchcodec.py:

from torchcodec.decoders import AudioDecoder
import numpy
import torch

Common Errors

Error Message | Cause | Resolution
To support encoding audio data, please install 'torchcodec'. | torchcodec is not installed and audio encoding was attempted | Run pip install "torchcodec>=0.6.0" "torch>=2.8.0"
To support decoding audio data, please install 'torchcodec'. | torchcodec is not installed and audio decoding was attempted | Run pip install "torchcodec>=0.6.0" "torch>=2.8.0"
To support decoding videos, please install 'torchcodec'. | torchcodec is not installed and video decoding was attempted | Run pip install "torchcodec>=0.6.0" "torch>=2.8.0"
ModuleNotFoundError: No module named 'torchcodec' | torchcodec package is missing from the environment | Run pip install "torchcodec>=0.6.0"
ModuleNotFoundError: No module named 'torch' | PyTorch is missing from the environment | Run pip install "torch>=2.8.0"
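For scripted troubleshooting, the table above can be turned into a lookup from a caught exception to its suggested command. This is an illustrative sketch only. RESOLUTIONS and suggest_resolution are hypothetical names, and matching on message text is a convenience that assumes the messages stay exactly as listed.

```python
# Message fragments and commands copied from the Common Errors table.
BOTH = 'pip install "torchcodec>=0.6.0" "torch>=2.8.0"'
RESOLUTIONS = {
    "To support encoding audio data, please install 'torchcodec'.": BOTH,
    "To support decoding audio data, please install 'torchcodec'.": BOTH,
    "To support decoding videos, please install 'torchcodec'.": BOTH,
    "No module named 'torchcodec'": 'pip install "torchcodec>=0.6.0"',
    "No module named 'torch'": 'pip install "torch>=2.8.0"',
}


def suggest_resolution(error: BaseException):
    """Return the install command for a known error message, else None."""
    text = str(error)
    for fragment, command in RESOLUTIONS.items():
        if fragment in text:
            return command
    return None


# Example: a simulated missing-module error.
print(suggest_resolution(ModuleNotFoundError("No module named 'torch'")))
```

The closing quote in each fragment keeps the 'torch' entry from matching a 'torchcodec' message.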

Compatibility Notes

  • The torchcodec >= 0.6.0 requirement suggests that earlier versions of torchcodec lack APIs used by the datasets library (e.g., AudioDecoder).
  • The torch >= 2.8.0 requirement is presumably set for compatibility with torchcodec 0.6.0 and above; check the torchcodec release notes for the exact pairing of torch and torchcodec versions.
  • The TORCHCODEC_AVAILABLE flag in config.py allows the library to gracefully degrade when these optional dependencies are absent, rather than failing at import time.
  • Audio and video features share the same dependency set, so installing for one automatically enables the other.
