Implementation:ClickHouse ClickHouse JSON Parser
| Knowledge Sources | |
|---|---|
| Domains | Parsing, JSON |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A lightweight, zero-copy JSON parser. A `JSON` object is a view over existing JSON data: creating or navigating it neither copies the input nor builds a full parse tree.
Description
This code implements a minimal JSON parser that operates on raw memory buffers containing JSON data. Unlike traditional JSON parsers that build complete DOM trees, this parser provides a reference-based view that parses only what is needed when methods are called. The `JSON` class represents a pointer to a JSON fragment and provides methods to navigate and extract values lazily. It supports truncated JSON, works with non-zero-terminated strings, and can extract elements from arrays and objects without parsing the entire structure. The implementation is optimized for extracting a few values from many small JSON documents.
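The lazy, reference-based lookup described above can be illustrated with a small standalone sketch. It is not the ClickHouse implementation: `Fragment`, `skipSpaces`, `valueLength`, and `findField` are hypothetical names invented for this example, keys are compared without decoding escapes, and C++17 is assumed. The point is only that a field lookup can return a window onto the original buffer, scan no further than the requested field, and tolerate a truncated document.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-in for a JSON view: just a window onto the caller's buffer.
using Fragment = std::string_view;

inline bool isSpace(char c) { return c == ' ' || c == '\t' || c == '\n' || c == '\r'; }

inline Fragment skipSpaces(Fragment s)
{
    while (!s.empty() && isSpace(s.front()))
        s.remove_prefix(1);
    return s;
}

// Length of the value starting at s[0], measured without decoding it.
// If the buffer ends early, the length of what is available is returned.
inline std::size_t valueLength(Fragment s)
{
    assert(!s.empty());
    if (s[0] == '"')
    {
        for (std::size_t i = 1; i < s.size(); ++i)
        {
            if (s[i] == '\\')
                ++i;              // skip the escaped character
            else if (s[i] == '"')
                return i + 1;     // include the closing quote
        }
        return s.size();
    }
    if (s[0] == '{' || s[0] == '[')
    {
        int depth = 0;
        for (std::size_t i = 0; i < s.size(); ++i)
        {
            const char c = s[i];
            if (c == '"')
                i += valueLength(s.substr(i)) - 1;  // jump to the closing quote
            else if (c == '{' || c == '[')
                ++depth;
            else if ((c == '}' || c == ']') && --depth == 0)
                return i + 1;
        }
        return s.size();
    }
    // Number, true, false, or null: runs until a delimiter.
    std::size_t i = 0;
    while (i < s.size() && s[i] != ',' && s[i] != '}' && s[i] != ']' && !isSpace(s[i]))
        ++i;
    return i;
}

// Scan the top-level fields of an object and return a view of the named value.
// Scanning stops at the first match, so later (possibly missing) bytes are never read.
inline std::optional<Fragment> findField(Fragment object, std::string_view name)
{
    Fragment s = skipSpaces(object);
    if (s.empty() || s.front() != '{')
        return std::nullopt;
    s.remove_prefix(1);
    while (true)
    {
        s = skipSpaces(s);
        if (s.empty() || s.front() != '"')
            return std::nullopt;  // end of object, truncated input, or malformed input
        const std::size_t key_len = valueLength(s);
        if (key_len < 2)
            return std::nullopt;
        const Fragment key = s.substr(1, key_len - 2);  // raw key, escapes not decoded
        s = skipSpaces(s.substr(key_len));
        if (s.empty() || s.front() != ':')
            return std::nullopt;
        s = skipSpaces(s.substr(1));
        if (s.empty())
            return std::nullopt;
        const std::size_t len = valueLength(s);
        if (key == name)
            return s.substr(0, len);
        s = skipSpaces(s.substr(len));
        if (s.empty() || s.front() != ',')
            return std::nullopt;
        s.remove_prefix(1);
    }
}
```

With the truncated input `{"a":1,"b":{"x":[1,2]},"c":"hi"`, `findField(doc, "b")` yields a view of `{"x":[1,2]}` that points into `doc` itself, and a lookup of a missing key returns `std::nullopt` instead of failing on the absent closing brace.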
Usage
Use this parser when you need to extract a few specific values from a large number of small JSON documents (such as visit parameters or event logs), where building a full DOM for each document would be wasteful. It suits high-throughput workloads that read only a few fields per JSON object. It is not optimal for extensive processing of a single large JSON document, because each access parses the raw bytes again rather than reusing a prebuilt tree.
Code Reference
Source Location
- Repository: ClickHouse
- File: base/base/JSON.h and base/base/JSON.cpp
- Lines: 1-208 (header), 1-818 (implementation)
Signature
class JSON
{
public:
    JSON(const char * ptr_begin, const char * ptr_end, unsigned level = 0);
    explicit JSON(std::string_view s);

    enum ElementType
    {
        TYPE_OBJECT, TYPE_ARRAY, TYPE_NUMBER, TYPE_STRING,
        TYPE_BOOL, TYPE_NULL, TYPE_NAME_VALUE_PAIR, TYPE_NOTYPE
    };

    ElementType getType() const;
    bool isObject() const;
    bool isArray() const;
    bool isNumber() const;
    bool isString() const;

    size_t size() const;
    bool empty() const;

    JSON operator[](size_t n) const;                  // Array access
    JSON operator[](const std::string & name) const;  // Object access
    bool has(const std::string & name) const;

    double getDouble() const;
    Int64 getInt() const;
    UInt64 getUInt() const;
    std::string getString() const;
    bool getBool() const;
    std::string toString() const;

    // Iterator support
    iterator begin() const;
    iterator end() const;
    iterator & operator++();
};
Import
#include <base/JSON.h>
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| ptr_begin | const char * | Yes | Start of JSON data |
| ptr_end | const char * | Yes | End of JSON data (non-inclusive) |
| level | unsigned | No | Recursion depth (default: 0) |
| name | std::string | No | Object field name, passed to `operator[]` or `has()` |
| n | size_t | No | Array index, passed to `operator[]` |
Outputs
| Name | Type | Description |
|---|---|---|
| JSON | JSON | A view into a JSON fragment |
| value | double/Int64/UInt64/std::string/bool | Extracted primitive value |
| size | size_t | Number of elements in array/object |
| type | ElementType | The type of the JSON element |
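As an illustration of how a view-based parser can report the `type` output without parsing the whole element, the sketch below classifies a fragment from its leading bytes. It is a simplified assumption about the general technique, not a copy of `JSON::getType()`: the `ElementType` enum class and the `classify` function are hypothetical names, and the sketch assumes no whitespace between a quoted name and the following colon.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical mirror of the element kinds listed in the Signature section.
enum class ElementType { Object, Array, Number, String, Bool, Null, NameValuePair, NoType };

// Classify a fragment by looking only at its first bytes.
inline ElementType classify(std::string_view s)
{
    if (s.empty())
        return ElementType::NoType;
    switch (s.front())
    {
        case '{': return ElementType::Object;
        case '[': return ElementType::Array;
        case 't': case 'f': return ElementType::Bool;
        case 'n': return ElementType::Null;
        case '-':
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            return ElementType::Number;
        case '"':
        {
            // Find the closing quote, stepping over backslash escapes.
            std::size_t i = 1;
            while (i < s.size() && s[i] != '"')
                i += (s[i] == '\\') ? 2 : 1;
            // A quoted string followed by ':' starts a name-value pair.
            if (i + 1 < s.size() && s[i + 1] == ':')
                return ElementType::NameValuePair;
            return ElementType::String;
        }
        default: return ElementType::NoType;
    }
}

// Small self-check showing the intended behaviour.
inline void classifyExamples()
{
    assert(classify(R"({"a":1})") == ElementType::Object);
    assert(classify(R"("a":1)") == ElementType::NameValuePair);
    assert(classify(R"("hi")") == ElementType::String);
}
```

Only strings need more than one byte of lookahead, which is why type checks stay cheap even on large fragments.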
Usage Examples
// Parse JSON from a string view; the buffer must outlive every JSON view into it
std::string_view json_data = R"({"name": "Alice", "age": 30, "scores": [95, 87, 92]})";
JSON json(json_data);

// Check type and extract values
if (json.isObject()) {
    std::string name = json["name"].getString();  // "Alice"
    Int64 age = json["age"].getInt();              // 30

    // Check that a field exists before reading it
    if (json.has("email")) {
        std::string email = json["email"].getString();
    }
}

// Navigate arrays
JSON scores = json["scores"];
if (scores.isArray()) {
    for (size_t i = 0; i < scores.size(); ++i) {
        Int64 score = scores[i].getInt();
    }
}

// Iterate over object fields (each iterator position is a name-value pair)
for (auto it = json.begin(); it != json.end(); ++it) {
    std::string key = it->getName();
    JSON value = it->getValue();
}

// Work with truncated JSON (only the accessed part is parsed)
std::string_view partial = R"({"a":1,"b":2)";  // Missing closing brace
JSON partial_json(partial);
Int64 a_value = partial_json["a"].getInt();  // Works fine

// Fall back to a default when the field is missing
// (the value type is given explicitly so that 25 is not deduced as int)
Int64 age_with_default = json.getWithDefault<Int64>("missing_field", 25);
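The default-value lookup in the last example follows a pattern that can be sketched independently of the parser: try to read a typed value from a fragment and return a fallback if that fails. The helper below is hypothetical (`intOrDefault` is not part of `base/JSON.h`, and the real method's matching rules may differ); it uses `std::from_chars` from C++17 to parse the digits in place, without copying them into a temporary string.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical helper: interpret a raw fragment as a signed 64-bit integer,
// or return `fallback` if the fragment is not exactly one integer.
inline std::int64_t intOrDefault(std::string_view fragment, std::int64_t fallback)
{
    std::int64_t value = 0;
    const char * first = fragment.data();
    const char * last = first + fragment.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    // Reject parse errors and trailing bytes such as "12abc" or "3.5".
    if (ec != std::errc() || ptr != last)
        return fallback;
    return value;
}

// Small self-check showing the intended behaviour.
inline void intOrDefaultExamples()
{
    assert(intOrDefault("30", 25) == 30);      // well-formed integer
    assert(intOrDefault("\"x\"", 25) == 25);   // wrong type: falls back
}
```

Returning a fallback instead of throwing keeps the hot path free of exception handling when optional fields are frequently absent, which matches the many-small-documents workload described in the Usage section.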