Implementation:ClickHouse ClickHouse Update Submodules Script

From Leeroopedia


Knowledge Sources
Domains: Build_System, Version_Control
Last Updated: 2026-02-08 00:00 GMT

Overview

A concrete tool, provided by the contrib/update-submodules.sh shell script, for initializing and populating all Git submodules in the ClickHouse repository.

Description

The update-submodules.sh script performs a complete initialization of all vendored third-party dependencies in the ClickHouse source tree. It does three things in sequence:

  1. Initializes and synchronizes submodule metadata by running git submodule init and git submodule sync, which registers all submodule paths from .gitmodules and ensures remote URLs are up to date.
  2. Fetches submodule contents in parallel using xargs --max-procs to invoke git submodule update --depth=1 --single-branch for each submodule. The --depth=1 flag performs a shallow clone (single commit), and --single-branch avoids fetching unnecessary branch refs.
  3. Deletes third-party CMake files from all submodule directories to prevent conflicts with ClickHouse's custom CMake wrappers. Three submodules are exempted from deletion: llvm-project (used to build LLVM), corrosion (used for Rust integration), and rust_vendor (Rust dependency crates). Files matching *.h.cmake are also preserved as they are template header files, not build system files.
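The three steps above can be sketched as shell functions. This is a minimal illustration under stated assumptions, not the script's actual text: the function names, the way paths are read from .gitmodules, and the contrib/<name> form of the exempt paths are all assumptions, and --max-procs is the GNU xargs spelling of the option.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical and the code is not
# copied from contrib/update-submodules.sh.

# Step 1: register submodules from .gitmodules and refresh their remote URLs.
init_and_sync() {
    git submodule init
    git submodule sync
}

# Print one submodule path per line, read from a .gitmodules-style file.
list_submodule_paths() {
    git config --file "${1:-.gitmodules}" --get-regexp '\.path$' | sed 's/^[^ ]* //'
}

# Step 2: shallow, single-branch update of every submodule, $1 at a time
# (64 if no argument is given). Requires GNU xargs for --max-procs.
fetch_all() {
    list_submodule_paths | xargs -I{} --max-procs "${1:-64}" \
        git submodule update --depth=1 --single-branch {}
}

# Drop the three exempt submodules from a list of paths on stdin
# (assumes they live at contrib/<name>).
skip_exempt() {
    grep -v -E '^contrib/(llvm-project|corrosion|rust_vendor)$'
}

# Delete CMake build files under directory $1, keeping *.h.cmake templates.
remove_cmake_files() {
    find "$1" \( -name 'CMakeLists.txt' -o -name '*.cmake' \) \
        ! -name '*.h.cmake' -type f -delete
}

# Step 3: apply the cleanup to every non-exempt submodule.
clean_submodules() {
    list_submodule_paths | skip_exempt |
        while IFS= read -r path; do remove_cmake_files "$path"; done
}
```

Run from the repository root, the sequence would be init_and_sync, then fetch_all, then clean_submodules. Keeping the exemption filter and the deletion in separate functions makes each one easy to try on a scratch directory before touching a real checkout.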

The script is designed to be run from any directory within the repository; it resolves its own location and navigates to the repository root automatically.
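One common way to get this behavior is to resolve the directory that holds the script and then ask Git for the top level of the enclosing work tree. The sketch below shows that pattern; repo_root_for is a hypothetical helper, and the real script may resolve its location differently.

```shell
#!/bin/sh
# Sketch only: repo_root_for is a hypothetical helper, not taken from the script.

# Print the top-level directory of the Git work tree that contains file $1.
repo_root_for() {
    file_dir=$(cd "$(dirname "$1")" && pwd) || return 1
    git -C "$file_dir" rev-parse --show-toplevel
}

# Inside a script, "$0" is the script's own path, so this would enter the root:
#   cd "$(repo_root_for "$0")"
```

Because the lookup starts from the script's own path rather than the caller's working directory, the result is the same whether the script is invoked from the repository root or from a subdirectory.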

Usage

Run this script after cloning the ClickHouse repository and before invoking CMake. It is also useful when switching branches that change submodule commit references, or after a git pull that updates .gitmodules.
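To decide whether a re-run is needed after a branch switch or pull, the output of git submodule status can be inspected: a leading - marks a submodule that is not initialized, and a leading + marks one whose checked-out commit differs from the commit recorded in the superproject. The helper below is a hypothetical sketch of that check and is not part of the script.

```shell
#!/bin/sh
# Sketch only: submodules_need_update is a hypothetical helper.

# Read `git submodule status` output on stdin; succeed (exit 0) if any
# submodule is uninitialized ('-') or checked out at a different commit ('+').
submodules_need_update() {
    grep -q '^[-+]'
}

# Possible use from the repository root:
#   if git submodule status | submodules_need_update; then
#       ./contrib/update-submodules.sh
#   fi
```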

Code Reference

Source Location

  • Repository: ClickHouse
  • File: contrib/update-submodules.sh
  • Lines: 1-44

Signature

./contrib/update-submodules.sh [--max-procs NUM]

Import

# Run from any directory within the ClickHouse repository:
./contrib/update-submodules.sh

# Or with a custom parallelism level:
./contrib/update-submodules.sh --max-procs 16

I/O Contract

Inputs

Name | Type | Required | Description
--max-procs | Integer | No (default: 64) | Maximum number of parallel git submodule update processes run via xargs
.gitmodules | File | Yes | Git submodules configuration file in the repository root, listing all submodule paths and URLs
Cloned repository | Directory | Yes | A Git clone of the ClickHouse repository (shallow or full)
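For illustration, an optional --max-procs argument with a default of 64 could be handled along the following lines. The script's own option handling is not reproduced on this page, so parse_max_procs and its error messages are hypothetical.

```shell
#!/bin/sh
# Sketch only: parse_max_procs is hypothetical; the real script's option
# handling is not reproduced on this page.

# Print the parallelism level: the value after --max-procs, or 64 by default.
parse_max_procs() {
    max_procs=64
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --max-procs)
                [ "$#" -ge 2 ] || { echo "--max-procs needs a value" >&2; return 1; }
                max_procs=$2
                shift 2
                ;;
            *)
                echo "unknown argument: $1" >&2
                return 1
                ;;
        esac
    done
    # Reject empty, non-numeric, or zero values.
    case "$max_procs" in
        ''|*[!0-9]*|0) echo "invalid --max-procs: $max_procs" >&2; return 1 ;;
    esac
    echo "$max_procs"
}
```

Validating the value before handing it to xargs gives a clearer error than letting xargs reject it partway through the run.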

Outputs

Name | Type | Description
contrib/*/ | Directories | Populated submodule directories containing third-party source code at their pinned commit revisions
Cleaned CMake files | Side effect | All CMakeLists.txt and *.cmake files removed from submodule directories (except llvm-project, corrosion, rust_vendor, and *.h.cmake files)

Usage Examples

Basic Submodule Initialization

git clone https://github.com/ClickHouse/ClickHouse.git
cd ClickHouse
./contrib/update-submodules.sh

Reduced Parallelism for Resource-Constrained Environments

# Use only 8 parallel fetches (e.g., on a low-bandwidth connection)
./contrib/update-submodules.sh --max-procs 8

Re-synchronize After Branch Switch

git checkout release/26.2
./contrib/update-submodules.sh
