Implementation:Rapidsai Cuml Clang Tidy Runner
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Code_Quality |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
A Python script that runs clang-tidy static analysis on the cuML C++ codebase, handling CUDA-specific compilation flags and parallelized execution.
Description
run-clang-tidy.py orchestrates clang-tidy execution across all compilation units in the cuML C++ project. It reads the CMake-generated compile_commands.json database, transforms compiler commands from nvcc/g++ format to clang-compatible format, and runs clang-tidy in parallel.
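The compilation database follows Clang's JSON format: an array of entries, each with a "directory", a "file", and either a "command" string or an "arguments" list. A minimal sketch of reading it might look like the following; load_compile_commands is a hypothetical helper written for illustration and is not taken from the script, whose get_all_commands may be organized differently.

```python
import json
import shlex
import tempfile
from pathlib import Path


def load_compile_commands(cdb_path):
    """Return a list of (source_file, argv) pairs from a compilation database.

    Hypothetical helper for illustration only.
    """
    with open(cdb_path, encoding="utf-8") as fp:
        entries = json.load(fp)
    result = []
    for entry in entries:
        # An entry carries either a shell-style "command" string or an
        # already-split "arguments" list.
        if "arguments" in entry:
            argv = list(entry["arguments"])
        else:
            argv = shlex.split(entry["command"])
        result.append((entry["file"], argv))
    return result


# Demo with a tiny, made-up database written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    cdb = Path(tmp) / "compile_commands.json"
    cdb.write_text(json.dumps([
        {"directory": tmp, "file": "a.cpp", "command": "c++ -O2 -c a.cpp -o a.o"},
        {"directory": tmp, "file": "b.cu", "arguments": ["nvcc", "-x", "cu", "-c", "b.cu"]},
    ]))
    commands = load_compile_commands(cdb)

for source, argv in commands:
    print(source, argv[0], len(argv))
```

Splitting each command into a token list up front makes the later flag rewriting a matter of list manipulation rather than string surgery.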
Key features include:
- Version Enforcement -- Requires clang-tidy version 20.1.8 exactly, ensuring consistent analysis results across environments.
- Configuration File Support -- Reads additional settings from a TOML configuration file (default: pyproject.toml under the [tool.run-clang-tidy] section).
- Command Translation -- Converts nvcc compilation commands to clang-compatible form by:
  - Replacing compiler invocations (c++ to clang-cpp, cc to clang)
  - Converting -gencode flags to --cuda-gpu-arch flags
  - Replacing CUDA language flags (-x cu to -x cuda)
  - Removing nvcc-specific flags (--expt-extended-lambda, --diag_suppress, -ccbin)
  - Adding clang include directories
- CUDA Handling -- For .cu files, runs clang-tidy twice: once with --cuda-device-only and once with --cuda-host-only to analyze both device and host code paths.
- Parallel Execution -- Uses Python multiprocessing to run analysis on multiple files simultaneously, with configurable job count (-j flag, defaulting to CPU count).
- File Filtering -- Supports regex-based file selection (-select) and ignore patterns (-ignore, defaulting to [.]cu$).
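The flag rewriting described under Command Translation can be sketched on a tokenized argument list. This is an illustration under stated assumptions, not the script's code: translate_args is a hypothetical name, the split between flags that take a value and flags that do not is assumed, and compiler-name replacement and include-directory handling are left out.

```python
import re

# Flags named in the list above as nvcc-specific. Whether each one takes a
# separate value is an assumption made for this sketch.
NVCC_FLAGS_NO_VALUE = {"--expt-extended-lambda"}
NVCC_FLAGS_WITH_VALUE = {"--diag_suppress", "-ccbin"}


def translate_args(argv):
    """Rewrite a tokenized nvcc-style argument list into clang-style flags.

    Hypothetical helper for illustration only.
    """
    out = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "-gencode" and i + 1 < len(argv):
            # e.g. "arch=compute_70,code=sm_70" -> "--cuda-gpu-arch=sm_70".
            # A -gencode value without an sm_ target is dropped entirely.
            match = re.search(r"code=\[?(sm_\w+)", argv[i + 1])
            if match:
                out.append(f"--cuda-gpu-arch={match.group(1)}")
            i += 2
        elif arg == "-x" and i + 1 < len(argv) and argv[i + 1] == "cu":
            out.extend(["-x", "cuda"])
            i += 2
        elif arg in NVCC_FLAGS_NO_VALUE:
            i += 1
        elif arg in NVCC_FLAGS_WITH_VALUE:
            i += 2  # drop the flag and the value that follows it
        elif arg.split("=", 1)[0] in NVCC_FLAGS_WITH_VALUE:
            i += 1  # drop the "--flag=value" form
        else:
            out.append(arg)
            i += 1
    return out


nvcc_argv = [
    "nvcc", "-x", "cu",
    "-gencode", "arch=compute_70,code=sm_70",
    "--expt-extended-lambda",
    "-ccbin", "/usr/bin/g++",
    "-c", "kernel.cu",
]
clang_argv = translate_args(nvcc_argv)
print(clang_argv)
```

For a .cu file, the translated list could then be extended once with --cuda-device-only and once with --cuda-host-only to produce the two analysis passes.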
The header filter is set to match files under cuml/cpp/(src|include|bench|comms).
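To make the regex behaviour concrete, the sketch below applies the header filter and the default ignore pattern to a few made-up paths. should_analyze is a hypothetical helper, and applying the ignore pattern before the select pattern is an assumption about ordering, not something the source confirms.

```python
import re

# Patterns quoted in the text above; the sample paths below are made up.
HEADER_FILTER = re.compile(r"cuml/cpp/(src|include|bench|comms)")
DEFAULT_IGNORE = re.compile(r"[.]cu$")


def should_analyze(path, select=None, ignore=DEFAULT_IGNORE):
    """Decide whether a source file is analyzed (hypothetical helper)."""
    if ignore is not None and ignore.search(path):
        return False
    if select is not None and not select.search(path):
        return False
    return True


select_svm = re.compile("svm")
sources = [
    "/work/cuml/cpp/src/svm/svc.cpp",
    "/work/cuml/cpp/src/svm/smo.cu",
    "/work/cuml/cpp/src/glm/ols.cpp",
]
for path in sources:
    print(path, should_analyze(path, select=select_svm))

headers = [
    "/work/cuml/cpp/include/cuml/svm/svc.hpp",
    "/usr/include/c++/13/vector",
]
for path in headers:
    print(path, bool(HEADER_FILTER.search(path)))
```

With the default ignore pattern, .cu files are skipped unless a different -ignore value is supplied, which is why the device and host passes only run when the default is overridden.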
Usage
Run this script from the repository root after a successful CMake configuration to perform static analysis on the C++ codebase. It is used in CI via the ci/run_clang_tidy.sh script.
Code Reference
Source Location
- Repository: Rapidsai_Cuml
- File:
cpp/scripts/run-clang-tidy.py
Signature
def main():
def parse_args():
def get_all_commands(cdb):
def get_tidy_args(cmd, exe):
def run_clang_tidy(cmd, args):
def run_tidy_for_all_files(args, all_files):
def run_clang_tidy_command(tidy_cmd):
def get_clang_includes(exe):
def get_gpu_archs(command):
Import
# Run from the repository root:
python cpp/scripts/run-clang-tidy.py [-cdb PATH] [-exe PATH] [-ignore REGEX] [-select REGEX] [-j N]
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| -cdb | string | No | Path to compile_commands.json (default: cpp/build/compile_commands.json) |
| -exe | string | No | Path to clang-tidy executable (default: clang-tidy) |
| -ignore | string | No | Regex pattern for files to ignore (default: [.]cu$) |
| -select | string | No | Regex pattern for files to select (default: all files) |
| -j | int | No | Number of parallel jobs (default: CPU count) |
| -c / --config | string | No | Path to TOML config file (default: pyproject.toml) |
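The inputs above map naturally onto an argparse parser. The following is a sketch that mirrors the table, assuming os.cpu_count() as the source of the CPU count and using paraphrased help strings; build_parser is a hypothetical name, and the script's own parse_args may differ in detail.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser mirroring the inputs table."""
    parser = argparse.ArgumentParser(
        description="Run clang-tidy over a compilation database")
    parser.add_argument("-cdb", type=str,
                        default="cpp/build/compile_commands.json",
                        help="path to compile_commands.json")
    parser.add_argument("-exe", type=str, default="clang-tidy",
                        help="path to the clang-tidy executable")
    parser.add_argument("-ignore", type=str, default="[.]cu$",
                        help="regex for files to ignore")
    parser.add_argument("-select", type=str, default=None,
                        help="regex for files to select (default: all files)")
    parser.add_argument("-j", type=int, default=os.cpu_count(),
                        help="number of parallel jobs")
    parser.add_argument("-c", "--config", type=str, default="pyproject.toml",
                        help="path to the TOML config file")
    return parser


# Parse an explicit argument list instead of sys.argv for the demo.
args = build_parser().parse_args(["-j", "4", "-select", "svm"])
print(args.j, args.select, args.cdb, args.config)
```

argparse resolves an exact option match before trying prefixes, so -cdb and -c can coexist without ambiguity.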
Outputs
| Name | Type | Description |
|---|---|---|
| PASSED/FAILED per file | stdout | Status and diagnostics for each analyzed file |
| Exit code | int | 0 on success; raises an exception on failure (an uncaught Python exception exits with status 1) |
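The per-file status reporting can be sketched as below. Several things here are assumptions rather than the script's behaviour: the exact format of the status line, the helper names run_one and run_all, and the use of multiprocessing.pool.ThreadPool (chosen to keep the example self-contained; the script itself may use process-based workers). The Python interpreter stands in for clang-tidy so the example runs anywhere.

```python
import subprocess
import sys
from multiprocessing.pool import ThreadPool


def run_one(job):
    """Run one command and return (name, passed, combined output)."""
    name, cmd = job
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    return name, proc.returncode == 0, proc.stdout


def run_all(jobs, workers):
    """Run jobs concurrently, print one status line per job, return overall success."""
    with ThreadPool(workers) as pool:
        results = pool.map(run_one, jobs)
    for name, passed, output in results:
        print(f"{name}: {'PASSED' if passed else 'FAILED'}")
        if not passed and output:
            # Show diagnostics only for failing jobs.
            print(output, end="")
    return all(passed for _, passed, _ in results)


# Stand-in commands: the Python interpreter plays the role of clang-tidy.
jobs = [
    ("ok.cpp", [sys.executable, "-c", "print('no warnings')"]),
    ("bad.cpp", [sys.executable, "-c",
                 "import sys; print('warning: x'); sys.exit(1)"]),
]
all_passed = run_all(jobs, workers=2)
```

A caller could raise an exception when all_passed is false, which would give the failure behaviour listed in the outputs table.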
Usage Examples
# Run clang-tidy on all C++ files (excluding .cu files by default)
python cpp/scripts/run-clang-tidy.py
# Run on specific files with 4 parallel jobs
python cpp/scripts/run-clang-tidy.py -j 4 -select "svm"
# Use a custom compilation database path
python cpp/scripts/run-clang-tidy.py -cdb cpp/build_debug/compile_commands.json