Implementation:Huggingface Datasets Dataset From List
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, NLP |
| Last Updated | 2026-02-14 18:00 GMT |
Overview
A concrete tool, provided by the HuggingFace Datasets library, for creating a Dataset from a list of dictionaries.
Description
Dataset.from_list is a class method that converts a list of dictionaries into a Dataset. Each dictionary in the list represents one row: its keys are column names and its values are the cell values for that row. The keys of the first dictionary determine the dataset columns. Internally, the list is transposed into a columnar dictionary (one list of values per column), which is then passed to Dataset.from_dict. The resulting dataset lives in memory and has no associated cache directory.
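The transposition step described above can be sketched in plain Python. This is only an illustration, not the library's source: the helper name `rows_to_columns` is hypothetical, and filling a key that is missing from a later row with None is an assumption made for the sketch.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Transpose row-oriented records into a column-oriented dictionary."""
    if not rows:
        return {}
    # Column names are taken from the first row only (hypothetical helper;
    # missing keys in later rows are filled with None as an assumption).
    return {key: [row.get(key) for row in rows] for key in rows[0]}


rows = [
    {"text": "Hello world", "label": 1},
    {"text": "Goodbye world", "label": 0},
]
columns = rows_to_columns(rows)
print(columns)
# {'text': ['Hello world', 'Goodbye world'], 'label': [1, 0]}
```

A columnar dictionary of this shape is what Dataset.from_dict accepts, which is why the row-oriented input has to be transposed first.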
Usage
Use Dataset.from_list when your data is organized as a list of records (row-oriented dictionaries), which is common when reading JSON arrays, collecting API responses, or building datasets from query results.
Code Reference
Source Location
- Repository: datasets
- File: src/datasets/arrow_dataset.py
- Lines: 1037-1067
Signature
@classmethod
def from_list(
    cls,
    mapping: list[dict],
    features: Optional[Features] = None,
    info: Optional[DatasetInfo] = None,
    split: Optional[NamedSplit] = None,
) -> "Dataset":
Import
from datasets import Dataset
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| mapping | list[dict] | Yes | A list of dictionaries mapping column names to row values. |
| features | Features | No | Explicit dataset features schema. |
| info | DatasetInfo | No | Dataset metadata (description, citation, etc.). |
| split | NamedSplit | No | Name of the dataset split. |
Outputs
| Name | Type | Description |
|---|---|---|
| return | Dataset | A new in-memory Dataset built from the list of dictionaries. |
Usage Examples
Basic Usage
from datasets import Dataset

# Each dictionary is one row; keys become column names.
data = [
    {"text": "Hello world", "label": 1},
    {"text": "Goodbye world", "label": 0},
]

ds = Dataset.from_list(data)
print(ds)
# Dataset({
#     features: ['text', 'label'],
#     num_rows: 2
# })