Implementation:ClickHouse ClickHouse Update Submodules Script
| Knowledge Sources | |
|---|---|
| Domains | Build_System, Version_Control |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool for initializing and populating all Git submodules in the ClickHouse repository, provided by the contrib/update-submodules.sh shell script.
Description
The update-submodules.sh script performs a complete initialization of all vendored third-party dependencies in the ClickHouse source tree. It does three things in sequence:
- Initializes and synchronizes submodule metadata by running git submodule init and git submodule sync, which registers all submodule paths from .gitmodules and ensures remote URLs are up to date.
- Fetches submodule contents in parallel using xargs --max-procs to invoke git submodule update --depth=1 --single-branch for each submodule. The --depth=1 flag performs a shallow clone (single commit), and --single-branch avoids fetching unnecessary branch refs.
- Deletes third-party CMake files from all submodule directories to prevent conflicts with ClickHouse's custom CMake wrappers. Three submodules are exempted from deletion: llvm-project (used to build LLVM), corrosion (used for Rust integration), and rust_vendor (Rust dependency crates). Files matching *.h.cmake are also preserved as they are template header files, not build system files.
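The three steps above can be sketched in shell roughly as follows. This is a minimal illustration under stated assumptions, not the script's actual contents: the function names list_submodule_paths, fetch_submodules, and clean_submodule_cmake are hypothetical, and reading .gitmodules with sed is an assumed simplification.

```shell
#!/bin/sh
# Illustrative sketch only; the function names are hypothetical and
# are not taken from contrib/update-submodules.sh.

# Print every submodule path declared in a .gitmodules-style file ($1).
list_submodule_paths() {
    sed -n 's/^[[:space:]]*path[[:space:]]*=[[:space:]]*//p' "$1"
}

# Steps 1 and 2: register submodules, sync their URLs, then fetch each one
# shallowly in parallel. $1 is the parallelism level (defaults to 64 here).
# Assumes GNU xargs and that the current directory is the repository root.
fetch_submodules() {
    git submodule init
    git submodule sync
    list_submodule_paths .gitmodules |
        xargs --max-procs="${1:-64}" -I{} \
            git submodule update --depth=1 --single-branch {}
}

# Step 3: delete CMakeLists.txt and *.cmake files from the given submodule
# directories, skipping the exempt submodules and keeping *.h.cmake files.
clean_submodule_cmake() {
    for dir in "$@"; do
        case "$(basename "$dir")" in
            llvm-project|corrosion|rust_vendor) continue ;;  # exempt
        esac
        [ -d "$dir" ] || continue
        find "$dir" -type f \
            \( -name 'CMakeLists.txt' -o -name '*.cmake' \) \
            ! -name '*.h.cmake' \
            -exec rm -f {} +
    done
}
```

Restricting the deletion to directories passed in explicitly, rather than sweeping a whole parent directory, keeps the sketch from touching CMake files that live outside the submodules.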
The script is designed to be run from any directory within the repository; it resolves its own location and navigates to the repository root automatically.
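One common way for a script to find the repository root regardless of the caller's working directory is sketched below. It illustrates the general technique and is not a copy of the script; repo_root_for is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the top-level directory of the Git work tree
# that contains the script path given as $1.
repo_root_for() {
    script_dir=$(cd "$(dirname "$1")" && pwd) || return 1
    git -C "$script_dir" rev-parse --show-toplevel
}

# Typical use at the top of a script:
#   cd "$(repo_root_for "$0")" || exit 1
```

Resolving the script's own directory first means the lookup works even when the script is invoked from outside the repository by an absolute or relative path.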
Usage
Run this script after cloning the ClickHouse repository and before invoking CMake. It is also useful when switching branches that change submodule commit references, or after a git pull that updates .gitmodules.
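To check whether a rerun is needed, one option is to inspect the output of git submodule status, which prefixes a line with - when the submodule is not initialized and with + when the checked-out commit differs from the one recorded in the superproject. The filter below is a sketch; stale_submodules is a hypothetical name, and it assumes submodule paths contain no spaces.

```shell
#!/bin/sh
# Hypothetical filter: read `git submodule status` output on stdin and print
# the paths of submodules that are uninitialized (-) or at a different
# commit than the superproject records (+).
stale_submodules() {
    awk '/^[-+]/ { print $2 }'
}

# Example (run from the repository root):
#   git submodule status | stale_submodules
```

An empty result suggests the submodules already match the current branch, in which case rerunning the update script should have nothing new to fetch.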
Code Reference
Source Location
- Repository: ClickHouse
- File: contrib/update-submodules.sh
- Lines: 1-44
Signature
./contrib/update-submodules.sh [--max-procs NUM]
Import
# Run from any directory within the ClickHouse repository:
./contrib/update-submodules.sh
# Or with a custom parallelism level:
./contrib/update-submodules.sh --max-procs 16
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| --max-procs | Integer | No (default: 64) | Maximum number of parallel git submodule update processes run via xargs |
| .gitmodules | File | Yes | Git submodules configuration file in the repository root, listing all submodule paths and URLs |
| Cloned repository | Directory | Yes | A Git clone of the ClickHouse repository (shallow or full) |
Outputs
| Name | Type | Description |
|---|---|---|
| contrib/*/ | Directories | Populated submodule directories containing third-party source code at their pinned commit revisions |
| Cleaned CMake files | Side effect | All CMakeLists.txt and *.cmake files removed from submodule directories (except llvm-project, corrosion, rust_vendor, and *.h.cmake files) |
Usage Examples
Basic Submodule Initialization
git clone https://github.com/ClickHouse/ClickHouse.git
cd ClickHouse
./contrib/update-submodules.sh
Reduced Parallelism for Resource-Constrained Environments
# Use only 8 parallel fetches (e.g., on a low-bandwidth connection)
./contrib/update-submodules.sh --max-procs 8
Re-synchronize After Branch Switch
git checkout release/26.2
./contrib/update-submodules.sh