Implementation:Huggingface Datasets ClassLabel

From Leeroopedia
Knowledge Sources
Domains Data_Engineering, NLP
Last Updated 2026-02-14 18:00 GMT

Overview

A concrete tool, provided by the Hugging Face Datasets library, for encoding categorical class labels as integers while keeping a mapping to and from the label names.

Description

ClassLabel is a dataclass feature type for integer class labels. A ClassLabel can be defined in three ways: by passing num_classes (which creates the label names "0" through "num_classes-1"), by passing a list of names, or by passing a names_file (a text file with one label name per line). Under the hood, labels are stored as int64 Arrow values. The str2int() and int2str() methods convert between label names and integer ids in both directions. Negative integers represent unknown or missing labels. The cast_storage method converts both string and integer Arrow arrays to the ClassLabel storage type.

Usage

Use ClassLabel to define label columns in classification datasets. It is the standard feature type for sentiment labels, category tags, entity types, and any finite set of classes.

Code Reference

Source Location

  • Repository: datasets
  • File: src/datasets/features/features.py
  • Lines: 982-1177

Signature

@dataclass
class ClassLabel:
    num_classes: InitVar[Optional[int]] = None
    names: list[str] = None
    names_file: InitVar[Optional[str]] = None
    id: Optional[str] = field(default=None, repr=False)
    # Automatically constructed
    dtype: ClassVar[str] = "int64"
    pa_type: ClassVar[Any] = pa.int64()
    _str2int: ClassVar[dict[str, int]] = None
    _int2str: ClassVar[dict[int, int]] = None
    _type: str = field(default="ClassLabel", init=False, repr=False)

Import

from datasets import ClassLabel

I/O Contract

Inputs

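Name | Type | Required | Description
num_classes | int | No | Number of classes. All labels must be < num_classes. If given together with names, it must equal the number of names.
names | list[str] | No | List of string label names. Order is preserved. Mutually exclusive with names_file.
names_file | str | No | Path to a file with one label name per line. Mutually exclusive with names.
id | str | No | Optional feature identifier.

At least one of num_classes, names, or names_file must be provided.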

Outputs

Name | Type | Description
instance | ClassLabel | A ClassLabel feature with bidirectional str-to-int mapping.

Usage Examples

Basic Usage

from datasets import Features, ClassLabel

features = Features({
    "label": ClassLabel(num_classes=3, names=["bad", "ok", "good"]),
})
print(features)
# {'label': ClassLabel(names=['bad', 'ok', 'good'])}

# Convert between strings and integers
label_feature = features["label"]
print(label_feature.str2int("good"))  # 2
print(label_feature.int2str(0))       # bad

Related Pages

Implements Principle
