Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Principle:ClickHouse ClickHouse Code Review Process

From Leeroopedia


Knowledge Sources
Domains Development_Process, Code_Quality
Last Updated 2026-02-08 00:00 GMT

Overview

Structured code review with severity-based finding classification ensures that pull requests are evaluated systematically using a prioritized risk checklist covering memory safety, concurrency, performance, and user-facing changes, with a strict convention of adding new commits (never rebasing or amending) to address feedback.

Description

The ClickHouse code review process is a formal, multi-dimensional evaluation of every pull request before merge. It is designed for a large C++ codebase where bugs in core areas (query execution, storage engines, replication, MergeTree internals) can cause data corruption, data loss, or security vulnerabilities. The review process goes beyond surface-level inspection by employing a structured risk checklist and a severity model that classifies findings into actionable categories.

The review process has several distinguishing characteristics:

1. Priority-ordered evaluation: Reviewers assess changes in a strict priority order:

  1. Correctness and safety: Logic errors, data corruption, missing checks, undefined behavior.
  2. Resource management: Memory leaks, file descriptor leaks, lifetime issues, ownership confusion.
  3. Concurrency and robustness: Data races, deadlocks, misuse of atomics/locks, unsafe shared state.
  4. Performance characteristics: Hot-path regressions, pathological complexity, unbounded allocations.
  5. Maintainability and simplicity: Over-engineering, duplicated logic, fragile patterns.
  6. User-facing quality: Wrong or misleading messages, missing observability for failure modes.
  7. ClickHouse-specific compliance: Deletion logging, serialization versioning, backward compatibility, experimental gates.

2. Risk checklist: Reviewers systematically scan diffs for specific classes of bugs across six risk categories: memory and lifetime issues, resource management problems, concurrency and threading hazards, error handling gaps, data correctness and serialization concerns, and performance and algorithmic behavior problems.

3. Severity-based finding classification: Each finding is classified into one of three severity levels that determine the required action:

  • Blockers: Must be fixed before merge. Include incorrectness, data loss, memory/resource leaks, undefined behavior, new races or deadlocks, missing serialization versioning, unlogged deletion events, missing experimental gates, significant performance regressions, and security issues.
  • Majors: Serious but not catastrophic. Include under-tested edge cases, fragile code, hidden magic constants, confusing user-visible behavior, and missing comments in complex logic.
  • Nits: Only mentioned if they materially improve robustness or clarity. Minor refactors that reduce bug risk or documentation improvements that prevent user confusion.

4. High-precision feedback: The review philosophy explicitly prioritizes precision over recall. False positives (flagging non-issues) are considered worse than missed nits. Reviewers are expected to refrain from commenting on style, typos, pure formatting, or micro-optimizations unless they indicate actual bugs. This keeps review feedback focused and actionable.

5. New commits only: A firm project convention requires that all review feedback be addressed by adding new commits to the branch. Contributors must never rebase or amend existing commits. This preserves the complete review history, makes it easy to see what changed between review rounds, and avoids the risk of lost work from destructive git operations.

6. CLA requirement: Every contributor must have signed the ClickHouse Individual Contributor License Agreement before their pull request can be merged. This is enforced automatically by a bot on the first pull request from each contributor.

Usage

This principle applies to all pull requests to the ClickHouse repository:

  • Reviewers: Follow the priority-ordered evaluation, apply the risk checklist, and classify findings by severity. Approve only when there are no Blockers or Majors. Include the structured output format: Summary, Missing Context, Findings, Tests and Evidence, Compliance Checklist, Performance and Safety Notes, User-Lens Review, and Final Verdict.
  • Contributors: Address review feedback by pushing new commits (never rebase or amend). Respond to Blockers and Majors before requesting re-review. Nits may be addressed at the contributor's discretion.
  • Maintainers: Apply stricter scrutiny to changes in core areas (query execution, storage engines, replication, Keeper/coordination, system tables, MergeTree internals). Verify ClickHouse-specific compliance requirements.

Theoretical Basis

The ClickHouse code review process is grounded in several established software engineering principles adapted for the specific needs of a high-performance, safety-critical database engine.

1. Severity-based triage: Not all code issues carry equal risk. By classifying findings into Blockers, Majors, and Nits, the review process creates a clear contract between reviewer and contributor:

  • Blockers represent defects that, if shipped, could cause data loss, corruption, crashes, or security vulnerabilities. These must be fixed before merge, full stop.
  • Majors represent fragility or incompleteness that, while not immediately dangerous, increases the likelihood of future bugs or user confusion. These should normally be fixed.
  • Nits represent improvements that reduce future risk but are not urgent. The explicit instruction to only mention nits that "materially improve robustness or clarity" prevents review comment inflation.

This triage model ensures that review effort is concentrated on the highest-impact findings.

2. Risk-oriented scanning: Rather than reviewing code linearly from top to bottom, the risk checklist directs reviewers to scan for specific bug patterns. This is analogous to a security audit approach where known vulnerability classes guide the inspection. The six risk categories (memory/lifetime, resource management, concurrency, error handling, data correctness, performance) cover the most common and dangerous bug classes in C++ database engine code.

3. Precision over recall: The explicit statement that "false positives are worse than missed nits" reflects a deliberate design choice. In open-source projects with many contributors, noisy reviews create friction and discourage contribution. By requiring reviewers to be "reasonably confident" before flagging an issue, the process maintains trust between reviewers and contributors.

4. Immutable commit history: The convention of never rebasing or amending serves several purposes:

  • Auditability: The complete sequence of changes, including review-driven corrections, is preserved in the git history.
  • Safety: Rebase and amend are destructive operations that can lose work or inadvertently alter previously reviewed code.
  • Review efficiency: Reviewers can compare successive commits to see exactly what changed in response to feedback, rather than re-reviewing the entire diff.

5. ClickHouse-specific compliance gates: The review checklist includes domain-specific requirements that reflect hard-won lessons from operating a distributed database at scale:

  • Deletion logging: All data deletion events must be logged to enable post-incident forensics.
  • Serialization versioning: Any format change must be versioned to ensure upgrade/downgrade compatibility across cluster nodes.
  • Experimental gates: New features must be behind experimental settings to enable incremental rollout and safe rollback.
  • Backward compatibility: New versions must be configurable to behave like older versions via `compatibility` settings.
  • Settings exposure: Important thresholds and alternative behaviors must be exposed as settings rather than hardcoded magic constants.

These compliance gates transform code review from a purely technical exercise into a holistic quality assurance process that considers operational, compatibility, and user-facing concerns.

Related Pages

Implemented By

Uses Heuristic

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment