Heuristic:ClickHouse ClickHouse Banned Functions Thread Safety
| Knowledge Sources | |
|---|---|
| Domains | Thread_Safety, Code_Quality |
| Last Updated | 2026-02-08 18:00 GMT |
Overview
ClickHouse traps ~200 non-thread-safe libc functions at runtime in debug and sanitizer builds, forcing developers to use thread-safe alternatives (`_r` variants) or C++ standard library facilities.
Description
The `harmful` library (`base/harmful/harmful.c`) defines `TRAP` macros for approximately 200 unsafe libc functions. In debug or sanitizer builds, calling any trapped function writes the function name to stderr and triggers `__builtin_trap()`, immediately terminating the process. This aggressive approach prevents subtle concurrency bugs from entering the codebase. The library specifically targets non-thread-safe functions (those with `_r` thread-safe variants), global-state-modifying functions, and functions with non-reentrant internal buffers. Some functions are explicitly exempted with comments explaining why.
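The trap mechanism described above can be illustrated in isolation. This is a minimal sketch, not the ClickHouse code: `DEMO_TRAP`, `demo_unsafe_call`, and `trappedCallKillsProcess` are hypothetical names invented for the example. The real macro defines functions under the libc names themselves, which this sketch deliberately avoids. The trapped call runs in a forked child so that the caller survives and can observe the result.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the TRAP macro: defines a function with the given
// name that prints its own name to stderr and then executes __builtin_trap().
#define DEMO_TRAP(func) \
    extern "C" void func() \
    { \
        ssize_t ignored = write(2, #func "\n", __builtin_strlen(#func) + 1); \
        (void)ignored; \
        __builtin_trap(); \
    }

// `demo_unsafe_call` is an invented name, not a real libc function.
DEMO_TRAP(demo_unsafe_call)

// Calls the trapped function in a forked child and reports whether the child
// was terminated by a signal (the exact signal depends on the architecture).
bool trappedCallKillsProcess()
{
    pid_t pid = fork();
    if (pid < 0)
        return false; // fork failed; nothing was demonstrated
    if (pid == 0)
    {
        demo_unsafe_call(); // prints "demo_unsafe_call" and never returns
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return false;
    return WIFSIGNALED(status);
}
```

Calling `trappedCallKillsProcess()` should print `demo_unsafe_call` to stderr from the child and return `true` in the parent, which mirrors what a developer sees when a debug build hits a trapped function.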
Usage
Apply this heuristic when writing new C++ code in ClickHouse or when debugging crashes in debug/sanitizer builds. If a process dies right after printing a function name to stderr, that function is trapped: replace the call with its `_r` variant or a C++ equivalent. The traps are enforced only on Linux AMD64 in debug/sanitizer builds, not in release builds.
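As an illustration of replacing a trapped call, the sketch below formats a timestamp with `gmtime_r` and `strftime` instead of `gmtime` or `ctime`. The helper name `formatUtc` and the output format are assumptions made for the example, not something taken from the ClickHouse codebase.

```cpp
#include <ctime>
#include <string>
#include <time.h>

// Formats a UTC timestamp without using the static buffer behind gmtime()
// and ctime(); all intermediate state lives on the caller's stack.
std::string formatUtc(std::time_t timestamp)
{
    std::tm parts{};
    if (gmtime_r(&timestamp, &parts) == nullptr)
        return {}; // conversion failed
    char buffer[32];
    std::size_t written = std::strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", &parts);
    return std::string(buffer, written);
}
```

For example, `formatUtc(0)` yields `1970-01-01 00:00:00`. Because each call works on its own `std::tm` and buffer, two threads can format timestamps concurrently without interfering with each other.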
The Insight (Rule of Thumb)
- Action: Always use thread-safe alternatives for libc functions. Check `base/harmful/harmful.c` when unsure.
- Value: Prevents data races from non-reentrant functions that use global static buffers.
- Key replacements:
  - `strtok` -> `strtok_r`
  - `gmtime` / `localtime` -> `gmtime_r` / `localtime_r`
  - `getpwuid` / `getpwnam` -> `getpwuid_r` / `getpwnam_r`
  - `rand` / `srand` -> `std::mt19937` or `random_r`
  - `strerror` -> `strerror_r` (`strerror` itself is not trapped, because RocksDB calls it)
  - `tmpnam` -> `mkstemp`
  - `ctime` -> `ctime_r` or `strftime` with `localtime_r`
- Trade-off: The trap only fires in debug/sanitizer builds. Release builds link without the `harmful` library, so a violation that was never exercised in a debug or sanitizer build goes undetected in production, where it can still cause a data race.
- Exemptions: Some `TRAP` lines are commented out with a justification (e.g., `exit`, `setenv` acceptable at startup, `readdir` thread-safe in modern glibc, `dlerror` needed by TSan).
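A few of the replacements listed above can be sketched as follows. The function names (`splitTokens`, `rollDie`, `createTempFile`) and the temporary-file prefix are hypothetical and exist only for this example. The code assumes C++17 (for the non-const `std::string::data()`) and a POSIX system.

```cpp
#include <cstdlib>
#include <random>
#include <string>
#include <vector>
#include <string.h>
#include <unistd.h>

// strtok -> strtok_r: the parser position is kept in `state`, which the
// caller owns, instead of in a hidden static variable.
std::vector<std::string> splitTokens(std::string text, const char * delimiters)
{
    std::vector<std::string> tokens;
    char * state = nullptr;
    for (char * token = strtok_r(text.data(), delimiters, &state); token != nullptr;
         token = strtok_r(nullptr, delimiters, &state))
        tokens.emplace_back(token);
    return tokens;
}

// rand / srand -> std::mt19937: each caller owns its generator, so there is
// no process-wide seed to race on.
int rollDie(std::mt19937 & generator)
{
    std::uniform_int_distribution<int> distribution(1, 6);
    return distribution(generator);
}

// tmpnam -> mkstemp: the file is created and opened in one step, and the name
// is written into a caller-owned buffer.
// Returns the created path, or an empty string on failure.
std::string createTempFile()
{
    std::string path = "/tmp/harmful_demo_XXXXXX"; // hypothetical prefix
    int fd = mkstemp(path.data());
    if (fd < 0)
        return {};
    close(fd);
    return path;
}
```

Like `strtok`, `strtok_r` skips empty fields, so `splitTokens("a,b,,c", ",")` produces three tokens: `a`, `b`, and `c`. The generator passed to `rollDie` should be per-thread or otherwise not shared without synchronization, since `std::mt19937` itself is not safe for concurrent use.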
Reasoning
TRAP macro definition from `base/harmful/harmful.c:14-15`:

    long write(int, const void *, unsigned long);
    #define TRAP(func) void func() { write(2, #func "\n", __builtin_strlen(#func) + 1); __builtin_trap(); }

Discovery method from `base/harmful/harmful.c:17-19`:

    /// Trap all non thread-safe functions:
    /// nm -D /lib/x86_64-linux-gnu/{libc.so.6,...} | grep -P '_r@?$' | awk '{ print $3 }'
    /// | sed -r -e 's/_r//' | grep -vP '^_'

Exemption example from `base/harmful/harmful.c:164`:

    /// strerror is not thread-safe, but it is used by RocksDB, so we can't trap it.
    // TRAP(strerror)

Platform limitation from `programs/CMakeLists.txt:135-137`:

    if (ARCH_AMD64 AND OS_LINUX AND NOT OS_ANDROID)
        set (HARMFUL_LIB harmful)
    endif ()

Musl compatibility from `base/harmful/harmful.c:289-298`:

    #ifndef USE_MUSL
    /// These produce duplicate symbol errors when statically linking with musl.
    TRAP(getopt)
    TRAP(putenv)
    TRAP(rand)
    #endif