Implementation:Duckdb Duckdb Yyjson
| Knowledge Sources | |
|---|---|
| Domains | JSON_Processing, Third_Party |
| Last Updated | 2026-02-07 12:00 GMT |
Overview
DuckDB embeds yyjson, a high-performance C JSON library by YaoYuan (ibireme), to provide fast JSON parsing, mutation, and serialization within the duckdb_yyjson namespace.
Description
The yyjson integration provides DuckDB with an extremely fast JSON reader and writer. The library distinguishes between two document models:
- Immutable documents (yyjson_doc, yyjson_val): Created by the reader, these are compact, read-only representations optimized for traversal. Each value occupies 16 bytes. Documents are freed with yyjson_doc_free().
- Mutable documents (yyjson_mut_doc, yyjson_mut_val): Used for building and modifying JSON. Mutable documents manage their own memory pools and are freed with yyjson_mut_doc_free().
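To make the 16-byte figure concrete, the sketch below models an immutable value as an 8-byte tag followed by an 8-byte payload union. It assumes the general shape of upstream yyjson's value struct; the names `sketch_payload`, `sketch_val`, `pack_tag`, `tag_type`, and `tag_len`, and the 8-bit type field, are illustrative stand-ins and not DuckDB's declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical payload: exactly one of these is meaningful per value.
union sketch_payload {
    uint64_t u64;
    int64_t i64;
    double f64;
    const char *str;
    size_t ofs;
};

// Hypothetical stand-in for an immutable JSON value: 8 + 8 = 16 bytes.
struct sketch_val {
    uint64_t tag;       // assumed layout: low 8 bits type info, high 56 bits length
    sketch_payload uni; // number, string pointer, or container offset
};

static_assert(sizeof(sketch_val) == 16, "value should occupy 16 bytes");

// Pack a type code and a length into one tag word.
inline uint64_t pack_tag(uint8_t type, uint64_t len) {
    return (len << 8) | type;
}

// Recover the type code from the low 8 bits.
inline uint8_t tag_type(uint64_t tag) {
    return static_cast<uint8_t>(tag & 0xFF);
}

// Recover the length from the high 56 bits.
inline uint64_t tag_len(uint64_t tag) {
    return tag >> 8;
}
```

Keeping every value the same fixed size is what lets a reader lay values out contiguously and step through them with simple pointer arithmetic.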
The reader API (yyjson_read, yyjson_read_opts, yyjson_read_file) parses JSON text into immutable documents. It supports configurable flags for relaxed parsing (trailing commas, comments, inf/nan literals, raw number strings, and invalid unicode tolerance). An in-situ mode (YYJSON_READ_INSITU) allows the reader to modify the input buffer directly for faster string handling.
The writer API (yyjson_write_opts, yyjson_write_file, yyjson_mut_write_opts) serializes documents back to JSON strings with options for pretty printing, unicode escaping, slash escaping, and newline-at-end formatting.
The library also supports JSON Pointer operations (yyjson_ptr_xxx, yyjson_mut_ptr_xxx) for targeted value access and modification, as well as JSON Patch and JSON Merge Patch for document-level transformations.
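As a rough illustration of what a JSON Pointer path encodes, the sketch below splits an RFC 6901 pointer into its unescaped reference tokens (`~1` becomes `/`, `~0` becomes `~`). The helper `split_json_pointer` is hypothetical and is not how yyjson implements its `yyjson_ptr_xxx` functions; it only shows the path syntax those functions accept.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split an RFC 6901 JSON Pointer into unescaped reference tokens.
// Hypothetical helper for illustration; not part of yyjson or DuckDB.
// Returns false if a non-empty pointer does not start with '/' or if it
// contains a malformed '~' escape.
inline bool split_json_pointer(const std::string &ptr,
                               std::vector<std::string> &tokens) {
    tokens.clear();
    if (ptr.empty()) {
        return true; // the empty pointer refers to the whole document
    }
    if (ptr[0] != '/') {
        return false;
    }
    std::string current;
    for (std::size_t i = 1; i < ptr.size(); i++) {
        char c = ptr[i];
        if (c == '/') {
            // A slash ends the current token and starts the next one.
            tokens.push_back(current);
            current.clear();
        } else if (c == '~') {
            if (i + 1 >= ptr.size()) {
                return false; // dangling '~'
            }
            char next = ptr[++i];
            if (next == '0') {
                current += '~';
            } else if (next == '1') {
                current += '/';
            } else {
                return false; // only ~0 and ~1 are valid escapes
            }
        } else {
            current += c;
        }
    }
    tokens.push_back(current);
    return true;
}
```

Each token then selects either an object key or, when the current value is an array, an index.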
DuckDB's fork includes integration with duckdb/common/fast_mem.hpp for optimized memory operations and wraps the entire library in the duckdb_yyjson namespace to avoid symbol conflicts.
Usage
DuckDB uses yyjson for all JSON-related operations, including parsing JSON columns and JSON file data, extracting values from JSON documents via JSON Pointer paths, constructing JSON output for to_json() and related functions, and serializing internal metadata structures to JSON format. The JSON extension also uses the library to read NDJSON (newline-delimited JSON) files with the YYJSON_READ_STOP_WHEN_DONE flag.
Code Reference
Source Location
- Repository: Duckdb_Duckdb
- Files:
  - third_party/yyjson/include/yyjson.hpp -- yyjson C++ header with all API declarations (7930 lines)
  - third_party/yyjson/yyjson.cpp -- yyjson implementation (9490 lines)
Signature
namespace duckdb_yyjson {

//--- Core type definitions ---
typedef struct yyjson_doc yyjson_doc;         // Immutable document (read-only)
typedef struct yyjson_val yyjson_val;         // Immutable value (16 bytes each)
typedef struct yyjson_mut_doc yyjson_mut_doc; // Mutable document (read-write)
typedef struct yyjson_mut_val yyjson_mut_val; // Mutable value

//--- Reader API ---
// Read JSON with full options
yyjson_api yyjson_doc *yyjson_read_opts(char *dat,
                                        size_t len,
                                        yyjson_read_flag flg,
                                        const yyjson_alc *alc,
                                        yyjson_read_err *err);

// Read JSON from a file
yyjson_api yyjson_doc *yyjson_read_file(const char *path,
                                        yyjson_read_flag flg,
                                        const yyjson_alc *alc,
                                        yyjson_read_err *err);

// Convenience: read from const string (disables in-situ)
yyjson_api_inline yyjson_doc *yyjson_read(const char *dat,
                                          size_t len,
                                          yyjson_read_flag flg);

//--- Writer API ---
// Write document to JSON string with options
yyjson_api char *yyjson_write_opts(const yyjson_doc *doc,
                                   yyjson_write_flag flg,
                                   const yyjson_alc *alc,
                                   size_t *len,
                                   yyjson_write_err *err);

// Write document to file with options
yyjson_api bool yyjson_write_file(const char *path,
                                  const yyjson_doc *doc,
                                  yyjson_write_flag flg,
                                  const yyjson_alc *alc,
                                  yyjson_write_err *err);

//--- Version ---
uint32_t yyjson_version(void);

} // namespace duckdb_yyjson
Read Flags
| Flag | Value | Description |
|---|---|---|
| YYJSON_READ_NOFLAG | 0 | Default RFC 8259 compliant parsing |
| YYJSON_READ_INSITU | 1 << 0 | Modify input buffer in-place for faster string handling |
| YYJSON_READ_STOP_WHEN_DONE | 1 << 1 | Stop after first complete document (useful for NDJSON) |
| YYJSON_READ_ALLOW_TRAILING_COMMAS | 1 << 2 | Allow trailing commas in objects and arrays |
| YYJSON_READ_ALLOW_COMMENTS | 1 << 3 | Allow C-style single and multi-line comments |
| YYJSON_READ_ALLOW_INF_AND_NAN | 1 << 4 | Allow inf/nan number literals (case-insensitive) |
| YYJSON_READ_NUMBER_AS_RAW | 1 << 5 | Read all numbers as raw strings |
| YYJSON_READ_ALLOW_INVALID_UNICODE | 1 << 6 | Allow invalid unicode in string values |
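Because each read flag occupies its own bit, options are combined with bitwise OR and tested with bitwise AND. The sketch below shows this with local stand-in constants that copy the values from the table; the `SKETCH_` names, the `sketch_read_flag` alias, and `has_flag` are illustrative, and real code would use the `YYJSON_READ_*` constants from yyjson.hpp instead.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins mirroring the table values (assumption: real code uses
// the YYJSON_READ_* constants from the duckdb_yyjson namespace).
using sketch_read_flag = uint32_t;
constexpr sketch_read_flag SKETCH_READ_NOFLAG = 0;
constexpr sketch_read_flag SKETCH_READ_INSITU = 1u << 0;
constexpr sketch_read_flag SKETCH_READ_STOP_WHEN_DONE = 1u << 1;
constexpr sketch_read_flag SKETCH_READ_ALLOW_TRAILING_COMMAS = 1u << 2;
constexpr sketch_read_flag SKETCH_READ_ALLOW_COMMENTS = 1u << 3;

// True if every bit of 'flag' is set in 'flags'.
constexpr bool has_flag(sketch_read_flag flags, sketch_read_flag flag) {
    return (flags & flag) == flag && flag != 0;
}

// Relaxed parsing: trailing commas and comments together.
constexpr sketch_read_flag SKETCH_RELAXED =
    SKETCH_READ_ALLOW_TRAILING_COMMAS | SKETCH_READ_ALLOW_COMMENTS;
```

Passing 0 (no flags) leaves every relaxation off, which corresponds to the strict RFC 8259 default.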
Write Flags
| Flag | Value | Description |
|---|---|---|
| YYJSON_WRITE_NOFLAG | 0 | Default minified output |
| YYJSON_WRITE_PRETTY | 1 << 0 | Pretty print with 4-space indentation |
| YYJSON_WRITE_ESCAPE_UNICODE | 1 << 1 | Escape unicode as \uXXXX sequences |
| YYJSON_WRITE_ESCAPE_SLASHES | 1 << 2 | Escape forward slashes as \/ |
| YYJSON_WRITE_ALLOW_INF_AND_NAN | 1 << 3 | Write inf/nan as literals |
| YYJSON_WRITE_INF_AND_NAN_AS_NULL | 1 << 4 | Write inf/nan as null |
| YYJSON_WRITE_ALLOW_INVALID_UNICODE | 1 << 5 | Allow invalid unicode in output |
| YYJSON_WRITE_PRETTY_TWO_SPACES | 1 << 6 | Pretty print with 2-space indentation (overrides PRETTY) |
| YYJSON_WRITE_NEWLINE_AT_END | 1 << 7 | Append newline character at end of output |
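The precedence between the two pretty-printing flags can be read off the table as a small decision function. The sketch below is only an interpretation of the table's descriptions, not yyjson's writer code; the `SKETCH_WRITE_*` constants and `indent_width` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins mirroring the table values (assumption: real code uses
// the YYJSON_WRITE_* constants from yyjson.hpp).
constexpr uint32_t SKETCH_WRITE_NOFLAG = 0;
constexpr uint32_t SKETCH_WRITE_PRETTY = 1u << 0;
constexpr uint32_t SKETCH_WRITE_PRETTY_TWO_SPACES = 1u << 6;
constexpr uint32_t SKETCH_WRITE_NEWLINE_AT_END = 1u << 7;

// Indentation width implied by a flag set: 0 means minified output.
// The two-space flag wins when both pretty flags are present.
constexpr int indent_width(uint32_t flags) {
    return (flags & SKETCH_WRITE_PRETTY_TWO_SPACES) ? 2
         : (flags & SKETCH_WRITE_PRETTY)            ? 4
                                                    : 0;
}
```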
Import
#include "yyjson.hpp"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| dat | const char * (or char * for in-situ) | Yes | UTF-8 encoded JSON data; null-terminator not required |
| len | size_t | Yes | Length of the JSON data in bytes; must be > 0 |
| flg | yyjson_read_flag | No | Bitwise OR of read option flags; 0 for RFC 8259 defaults |
| alc | const yyjson_alc * | No | Custom memory allocator; NULL to use libc default |
| err | yyjson_read_err * | No | Pointer to receive error details; NULL if error details are not needed |
Outputs
| Name | Type | Description |
|---|---|---|
| doc | yyjson_doc * | Parsed immutable JSON document, or NULL on error; must be freed with yyjson_doc_free() |
| json_str | char * | Serialized JSON string from yyjson_write_opts(); must be freed with free() or alc->free() |
| len (out) | size_t | Length of the written JSON string in bytes (excluding null-terminator) |
| err (out) | yyjson_read_err / yyjson_write_err | Error code and message if the operation failed |
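The ownership rule for the serialized string (caller frees it with free() when the default allocator is used) maps naturally onto a `std::unique_ptr` with a custom deleter. The sketch below demonstrates the pattern with a hypothetical stand-in producer, `sketch_write`, which simply returns a malloc'ed copy of its input; `free_deleter` and `json_string` are also illustrative names. It does not cover the custom-allocator case, where `alc->free()` would have to be called instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Deleter matching the "freed with free()" rule for the default allocator.
struct free_deleter {
    void operator()(char *p) const { std::free(p); }
};
using json_string = std::unique_ptr<char, free_deleter>;

// Hypothetical stand-in for a writer call: returns a malloc'ed,
// null-terminated string and reports its length, or nullptr on failure.
inline char *sketch_write(const char *text, std::size_t *len) {
    std::size_t n = std::strlen(text);
    char *out = static_cast<char *>(std::malloc(n + 1));
    if (!out) {
        return nullptr;
    }
    std::memcpy(out, text, n + 1); // copy including the terminator
    if (len) {
        *len = n; // length excludes the terminator
    }
    return out;
}
```

Wrapping the returned pointer immediately, for example `json_string s(sketch_write(text, &len));`, releases the buffer on every exit path without an explicit free() call.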
Usage Examples
#include <cstdlib> // free
#include <cstring> // strlen
#include "yyjson.hpp"
using namespace duckdb_yyjson;

// Wrapper function so the statements below form a valid translation unit.
void yyjson_usage_example() {
    // --- Reading JSON ---
    const char *json = "{\"name\": \"DuckDB\", \"version\": 1}";
    yyjson_doc *doc = yyjson_read(json, strlen(json), YYJSON_READ_NOFLAG);
    if (doc) {
        yyjson_val *root = yyjson_doc_get_root(doc);
        yyjson_val *name = yyjson_obj_get(root, "name");
        // yyjson_get_str(name) returns "DuckDB"; the pointer is valid
        // only while doc is alive, so doc is not freed here.
        (void)name;
    }

    // --- Reading with relaxed options (trailing commas + comments) ---
    const char *relaxed = "[1, 2, 3, /* trailing */ ]";
    yyjson_read_err err;
    // Casting away const is safe only because YYJSON_READ_INSITU is not set.
    yyjson_doc *doc2 = yyjson_read_opts(
        (char *)relaxed, strlen(relaxed),
        YYJSON_READ_ALLOW_TRAILING_COMMAS | YYJSON_READ_ALLOW_COMMENTS,
        NULL, &err);
    if (!doc2) {
        // err.code and err.msg describe the failure
    } else {
        yyjson_doc_free(doc2);
    }

    // --- Writing JSON ---
    if (doc) {
        size_t len;
        char *output = yyjson_write_opts(doc, YYJSON_WRITE_PRETTY, NULL, &len, NULL);
        if (output) {
            // output contains pretty-printed JSON of length 'len'
            free(output);
        }
        yyjson_doc_free(doc); // free only after the last use of doc
    }
}