Implementation:Unstructured IO Unstructured Elements To Json
| Knowledge Sources | |
|---|---|
| Domains | Document_Processing, Data_Serialization |
| Last Updated | 2026-02-12 00:00 GMT |
Overview
A concrete tool, provided by the Unstructured library, for serializing document elements to JSON.
Description
⚠️ DEPRECATION NOTICE: The module unstructured.staging.base is deprecated as a location. These serde functions will likely relocate to unstructured.documents.elements in a future release. See Heuristic:Unstructured_IO_Unstructured_Warning_Deprecated_Staging_Base.
The elements_to_json function converts a list of Element objects into a JSON string representation. It can optionally write the output to a file. The function delegates to elements_to_dicts for the actual conversion, then applies JSON formatting. Related functions include elements_from_json for deserialization and elements_to_dicts for dictionary conversion.
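The delegation described above can be pictured with a minimal, stdlib-only sketch. `SketchElement`, `sketch_elements_to_dicts`, and `sketch_elements_to_json` are hypothetical stand-ins introduced for illustration; they are not the library's code, and the real function may perform additional steps that are not shown here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Iterable, List, Optional


@dataclass
class SketchElement:
    """Hypothetical stand-in for an Element; not the library's class."""

    type: str
    text: str


def sketch_elements_to_dicts(elements: Iterable[SketchElement]) -> List[Dict[str, Any]]:
    # Step 1: convert each element to a plain dictionary.
    return [asdict(element) for element in elements]


def sketch_elements_to_json(
    elements: Iterable[SketchElement],
    filename: Optional[str] = None,
    indent: int = 4,
    encoding: str = "utf-8",
) -> str:
    # Step 2: apply JSON formatting to the list of dictionaries.
    json_str = json.dumps(sketch_elements_to_dicts(elements), indent=indent)
    # Step 3: write to disk only when a path was given.
    if filename is not None:
        with open(filename, "w", encoding=encoding) as f:
            f.write(json_str)
    # The JSON string is returned whether or not a file was written.
    return json_str
```

Keeping the dictionary conversion separate from the formatting step is what allows the same intermediate representation to serve both the JSON serializer and callers that only need dictionaries.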
Usage
Import this function when you need to persist partitioned elements to a JSON file or transmit them as a JSON string. It is the standard output serializer used by the unstructured-ingest pipeline and by applications that store partition results.
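On the receiving end, a transmitted JSON string can be parsed with the standard `json` module. The sketch below uses a hand-written sample payload rather than real library output; the `type` and `text` keys and the helper `texts_of_type` are assumptions made for illustration, so inspect actual output from `elements_to_json` before relying on a particular shape.

```python
import json
from typing import List

# Hypothetical sample standing in for a string returned by elements_to_json.
sample_json = """
[
    {"type": "Title", "text": "Quarterly Report"},
    {"type": "NarrativeText", "text": "Revenue grew in the third quarter."}
]
"""


def texts_of_type(json_str: str, element_type: str) -> List[str]:
    """Return the text of every serialized element whose type matches."""
    records = json.loads(json_str)  # a JSON array parses to a list of dicts
    return [r.get("text", "") for r in records if r.get("type") == element_type]


titles = texts_of_type(sample_json, "Title")
print(titles)
```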
Code Reference
Source Location
- Repository: unstructured
- File: unstructured/staging/base.py
- Lines: 185-206
Signature
def elements_to_json(
    elements: Iterable[Element],
    filename: Optional[str] = None,
    indent: int = 4,
    encoding: str = "utf-8",
) -> str:
    """Serialize elements to JSON string, optionally writing to a file.

    Args:
        elements: Iterable of Element objects to serialize.
        filename: Optional output file path. If provided, writes JSON to this file.
        indent: JSON indentation level (default 4).
        encoding: File encoding (default utf-8).

    Returns:
        JSON string of serialized elements.
    """
Import
from unstructured.staging.base import elements_to_json
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| elements | Iterable[Element] | Yes | Elements to serialize |
| filename | Optional[str] | No | Output file path (writes to file if provided; default None) |
| indent | int | No | JSON indentation (default 4) |
| encoding | str | No | File encoding (default "utf-8") |
Outputs
| Name | Type | Description |
|---|---|---|
| return | str | JSON string representation of all elements |
| file (side effect) | JSON file | If filename is provided, writes JSON to the specified file |
Usage Examples
Serialize to JSON String
from unstructured.partition.auto import partition
from unstructured.staging.base import elements_to_json
elements = partition(filename="report.pdf")
# Get JSON string
json_str = elements_to_json(elements)
print(json_str[:200])
Write to JSON File
from pathlib import Path
from unstructured.partition.auto import partition
from unstructured.staging.base import elements_to_json
elements = partition(filename="report.pdf")
# Make sure the target directory exists before writing
Path("output").mkdir(parents=True, exist_ok=True)
# Write to file and get JSON string
json_str = elements_to_json(elements, filename="output/report.json", indent=2)
Round-Trip Serialization
from unstructured.partition.auto import partition
from unstructured.staging.base import elements_to_json, elements_from_json
elements = partition(filename="report.pdf")
# Serialize
elements_to_json(elements, filename="elements.json")
# Deserialize
restored_elements = elements_from_json(filename="elements.json")