Principle:ClickHouse ClickHouse Base Library Compilation
| Knowledge Sources | |
|---|---|
| Domains | Build_System, C++, Systems_Programming |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Foundation utility libraries form the base layer of a database engine, providing core type definitions, SIMD operations, memory management primitives, glibc compatibility shims, and fundamental networking/utility frameworks upon which all higher-level components depend.
Description
ClickHouse's base layer consists of two key libraries compiled from the base/ directory, plus the vendored Poco framework:
1. The common library (defined in base/base/CMakeLists.txt) contains 26 source files that provide the lowest-level utilities used throughout the codebase. These include:
- Hashing and encoding: Integration with CityHash for fast non-cryptographic hashing.
- String and number conversion: Custom itoa implementation, int8_to_string, wide integer to string conversion.
- System information: Functions for getting available memory, page size, thread ID, FQDN, and cgroup v2 information.
- Terminal and debugging: Terminal color support, demangling of C++ symbols, DWARF-based stack traces.
- Memory management: Custom mremap implementation, PHDR cache for fast dl_iterate_phdr.
- Miscellaneous: Decimal type support, precise exp10 and shift10 math, safe process exit, NUMA awareness.
The common library links against several vendored third-party libraries: CityHash, Boost (headers and system), Poco (Net, SSL, Util, Foundation), replxx (line editing), cctz (time zones), fmt (string formatting), and magic_enum (enum reflection).
2. The glibc-compatibility library (defined in base/glibc-compatibility/CMakeLists.txt) provides musl-derived shims that replace newer glibc symbols with portable implementations. This allows ClickHouse binaries compiled on a modern system to run on older Linux distributions with older glibc versions. The library is only built when the GLIBC_COMPATIBILITY option is enabled (which is the default on Linux amd64 and aarch64).
The glibc-compatibility library includes:
- musl libc implementations of various system functions.
- Architecture-specific assembly for syscalls and longjmp (separate implementations for amd64 and aarch64).
- Frame pointer omission (-fomit-frame-pointer) to match glibc's performance characteristics.
3. The Poco framework is compiled as 7 separate components (Foundation, XML, JSON, Util, Net, Net::SSL, and Crypto), providing ClickHouse with HTTP server/client capabilities, XML parsing, configuration management, and SSL/TLS support.
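To give a feel for the kind of low-level utility the common library provides, here is a minimal sketch of a "precise" power-of-ten function. It is an illustration only: the name exp10Sketch and the table-based approach are assumptions, not ClickHouse's actual exp10 implementation. The idea is that every power of ten from 10^0 to 10^22 is exactly representable as an IEEE 754 double, so returning a literal from a table avoids the rounding error a general-purpose pow call can introduce.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper for illustration; not the ClickHouse function.
// Returns 10^n as a double.
double exp10Sketch(int n)
{
    // Keep the sketch within the finite range of a double.
    assert(n >= -300 && n <= 300);

    // 10^0 .. 10^22 are exactly representable in IEEE 754 double precision,
    // so these literals carry no rounding error.
    static const double exact_powers[] = {
        1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
        1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
        1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};
    const int table_size = sizeof(exact_powers) / sizeof(exact_powers[0]);

    if (n >= 0 && n < table_size)
        return exact_powers[n];

    // Outside the exact range, fall back to the standard library.
    return std::pow(10.0, n);
}
```

A caller formatting decimals could use exp10Sketch(scale) to obtain the scale multiplier without accumulating error from repeated multiplication.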
Usage
These base libraries are used when:
- Building a database engine that requires portable binaries across Linux distributions.
- Implementing system-level utilities (memory introspection, thread management, signal handling) that must work consistently across platforms.
- Providing a stable foundation layer that higher-level components (query processing, storage engines, networking) can depend on without pulling in the full application logic.
The common library is linked by virtually every other target in the ClickHouse build. The glibc-compatibility library is linked via the global-libs interface target, making it implicitly available to all executables.
Theoretical Basis
The base library compilation follows a layered architecture pattern:
Layer 3: Application (programs/server, programs/client, etc.)
Layer 2: Database Engine (src/: Storages, Interpreters, Processors, etc.)
Layer 1: Common I/O (src/Common/, src/IO/)
Layer 0: Base Libraries (base/base/common, base/glibc-compatibility, Poco)
+ Third-party (contrib/: ~90 vendored libraries)
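The layering above is usually read as a rule that a component may depend only on its own layer or on layers below it. The following sketch encodes that reading as a compile-time check. The Layer enum and isAllowedDependency are hypothetical names introduced for illustration; the ClickHouse build expresses these relationships through CMake link dependencies, not through code like this.

```cpp
#include <cassert>

// Hypothetical model of the four layers in the diagram above.
enum class Layer : int
{
    Base = 0,        // base libraries, Poco, third-party
    CommonIO = 1,    // src/Common/, src/IO/
    Engine = 2,      // src/: Storages, Interpreters, Processors
    Application = 3  // programs/server, programs/client
};

// A dependency is allowed when the dependent component sits at the same
// layer as, or a higher layer than, the component it depends on.
constexpr bool isAllowedDependency(Layer from, Layer to)
{
    return static_cast<int>(from) >= static_cast<int>(to);
}

// The engine may use the base libraries, but not the other way round.
static_assert(isAllowedDependency(Layer::Engine, Layer::Base),
              "higher layers may depend on lower layers");
static_assert(!isAllowedDependency(Layer::Base, Layer::Engine),
              "base libraries must not depend on the engine");
```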
The glibc-compatibility approach works by providing alternative implementations of functions that were introduced in newer versions of glibc. When statically linked, these implementations are used instead of the glibc versions, allowing the binary to run on systems with older glibc. Key techniques include:
- Symbol interposition: Providing implementations of functions like __cxa_thread_atexit_impl that may not exist in older glibc.
- Architecture-specific assembly: Optimized syscall wrappers and setjmp/longjmp implementations for amd64 and aarch64.
- Conditional compilation: The library is only built for supported architectures (amd64, aarch64) on Linux.
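The general idea behind such shims can be sketched on Linux with a function whose glibc wrapper is relatively recent. The gettid() wrapper was only added in glibc 2.30, so a binary that calls it directly will not load against older glibc. A shim can issue the raw system call through syscall() instead, which avoids referencing the newer symbol. The name shimGetTid is hypothetical and is not claimed to exist in the glibc-compatibility library; this is a simplified illustration, not the library's actual code.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical shim for illustration (Linux only).
// Returns the kernel thread ID of the calling thread without referencing
// the gettid() wrapper symbol that only newer glibc versions export.
pid_t shimGetTid()
{
    // syscall() is a long-standing glibc entry point, so the resulting
    // binary does not require a recent glibc symbol version to load.
    return static_cast<pid_t>(syscall(SYS_gettid));
}
```

The trade-off is that the shim must track the kernel interface itself, which is one reason such code tends to be architecture-specific.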