Implementation:Huggingface Datasets Dataset Rename Column
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, ML_Preprocessing |
| Last Updated | 2026-02-14 18:00 GMT |
Overview
A concrete tool, provided by the HuggingFace Datasets library, for renaming a single column in a dataset.
Description
The rename_column method returns a copy of the dataset with one column renamed; the original dataset is left unchanged. It updates both the Arrow table column names and the Features metadata. Before renaming, it validates that the original column exists, that the new name does not conflict with an existing column name, and that the new name is not empty, raising a ValueError if any check fails. If the dataset has format columns configured, they are also updated to reflect the rename.
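The checks and the rename described above can be sketched in plain Python. This is a simplified illustration that works on lists of column names only, not the library's actual implementation; the helper names `check_rename` and `apply_rename` are hypothetical.

```python
from typing import List, Optional


def check_rename(columns: List[str], original: str, new: str) -> None:
    """Hypothetical helper: validate a rename request against existing columns."""
    if original not in columns:
        raise ValueError(f"Column {original!r} is not in the dataset: {columns}")
    if new in columns:
        raise ValueError(f"Column {new!r} already exists in the dataset: {columns}")
    if not new:
        raise ValueError("New column name is empty.")


def apply_rename(columns: List[str], original: str, new: str) -> List[str]:
    """Hypothetical helper: return a new list with `original` replaced by `new`."""
    return [new if col == original else col for col in columns]


# Column names of an illustrative dataset, plus an optional format-column subset.
columns = ["text", "label"]
format_columns: Optional[List[str]] = ["label"]

check_rename(columns, "label", "labels")
new_columns = apply_rename(columns, "label", "labels")

# Format columns, when configured, are mapped with the same rename.
new_format_columns = (
    apply_rename(format_columns, "label", "labels")
    if format_columns is not None
    else None
)

print(new_columns)
print(new_format_columns)
```

Because `apply_rename` builds a new list, the input list is untouched, which mirrors the copy semantics of the method.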
Usage
Use Dataset.rename_column when you need to change the name of a single column to match the naming convention expected by a model, Trainer, or downstream pipeline step.
Code Reference
Source Location
- Repository: datasets
- File: src/datasets/arrow_dataset.py
- Lines: L2263-L2327
Signature
@fingerprint_transform(inplace=False)
def rename_column(
    self,
    original_column_name: str,
    new_column_name: str,
    new_fingerprint: Optional[str] = None,
) -> "Dataset":
Import
from datasets import load_dataset
ds = load_dataset("cornell-movie-review-data/rotten_tomatoes", split="validation")
ds = ds.rename_column("label", "label_new")
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| original_column_name | str | Yes | Name of the column to rename. Must exist in the dataset. |
| new_column_name | str | Yes | New name for the column. Must not already exist in the dataset and must not be empty. |
| new_fingerprint | Optional[str] | No | The new fingerprint of the dataset after transform. If None, computed automatically. |
Outputs
| Name | Type | Description |
|---|---|---|
| return | Dataset | A copy of the dataset with the specified column renamed. |
Usage Examples
Basic Usage
from datasets import load_dataset
ds = load_dataset("cornell-movie-review-data/rotten_tomatoes", split="validation")
ds = ds.rename_column("label", "label_new")
print(ds.column_names)
# ['text', 'label_new']