Implementation:Rapidsai Cuml Clang Tidy Runner


Knowledge Sources
Domains Machine_Learning, Code_Quality
Last Updated 2026-02-08 12:00 GMT

Overview

A Python script that runs clang-tidy static analysis on the cuML C++ codebase, handling CUDA-specific compilation flags and parallelized execution.

Description

run-clang-tidy.py orchestrates clang-tidy execution across all compilation units in the cuML C++ project. It reads the CMake-generated compile_commands.json database, transforms compiler commands from nvcc/g++ format to clang-compatible format, and runs clang-tidy in parallel.
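As a rough illustration of the first step, the sketch below loads a compilation database and lists the distinct source files it names. The helper names load_compile_commands and source_files are hypothetical and are not taken from the script; only the "file" key, which the compilation database format requires for every entry, is relied upon.

```python
import json


def load_compile_commands(cdb_path):
    """Return the entries of a compile_commands.json file as a list of dicts."""
    with open(cdb_path, encoding="utf-8") as fp:
        entries = json.load(fp)
    # Every entry names its source file under "file"; the compiler invocation
    # is stored either as a "command" string or as an "arguments" list.
    return [entry for entry in entries if "file" in entry]


def source_files(entries):
    """Collect the distinct source files named by the entries, in order."""
    seen = []
    for entry in entries:
        if entry["file"] not in seen:
            seen.append(entry["file"])
    return seen
```

A file can appear more than once in the database (for example when it is compiled with different flags), which is why the second helper de-duplicates.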

Key features include:

  1. Version Enforcement -- Requires clang-tidy version 20.1.8 exactly, ensuring consistent analysis results across environments.
  2. Configuration File Support -- Reads additional settings from a TOML configuration file (default: pyproject.toml under the [tool.run-clang-tidy] section).
  3. Command Translation -- Converts nvcc compilation commands to clang-compatible form by:
    • Replacing compiler invocations (c++ to clang-cpp, cc to clang)
    • Converting -gencode flags to --cuda-gpu-arch flags
    • Replacing CUDA language flags (-x cu to -x cuda)
    • Removing nvcc-specific flags (--expt-extended-lambda, --diag_suppress, -ccbin)
    • Adding clang include directories
  4. CUDA Handling -- For .cu files, runs clang-tidy twice: once with --cuda-device-only and once with --cuda-host-only to analyze both device and host code paths.
  5. Parallel Execution -- Uses Python multiprocessing to run analysis on multiple files simultaneously, with configurable job count (-j flag, defaulting to CPU count).
  6. File Filtering -- Supports regex-based file selection (-select) and ignore patterns (-ignore, defaulting to [.]cu$).
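The command translation and CUDA handling described in this list can be sketched as follows. This is a minimal illustration, not the script's own code: the function names translate, gpu_arch_flags, and tidy_passes are hypothetical, the replacement compiler name is supplied by the caller, and the assumption that --diag_suppress and -ccbin take a value (either as the next token or after "=") is made only for this sketch.

```python
import re
import shlex

# nvcc-specific flags named in the list above. Which of them consume a
# following value is an assumption made for this sketch.
NVCC_ONLY_FLAGS = {"--expt-extended-lambda"}
NVCC_ONLY_FLAGS_WITH_VALUE = {"--diag_suppress", "-ccbin"}


def gpu_arch_flags(tokens):
    """Map each '-gencode arch=compute_XX,...' pair to '--cuda-gpu-arch=sm_XX'."""
    flags = []
    for flag, value in zip(tokens, tokens[1:]):
        if flag != "-gencode":
            continue
        match = re.search(r"arch=compute_(\d+\w*)", value)
        if match:
            arch = "--cuda-gpu-arch=sm_" + match.group(1)
            if arch not in flags:
                flags.append(arch)
    return flags


def translate(command, compiler):
    """Rewrite an nvcc-style command line into a clang-style token list."""
    tokens = shlex.split(command)
    out = [compiler]  # the original compiler invocation is replaced
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if tok == "-gencode" or tok in NVCC_ONLY_FLAGS_WITH_VALUE:
            i += 2  # drop the flag together with its separate value
        elif tok.split("=", 1)[0] in NVCC_ONLY_FLAGS | NVCC_ONLY_FLAGS_WITH_VALUE:
            i += 1  # drop a bare flag or a 'flag=value' token
        elif tok == "-x" and tokens[i + 1:i + 2] == ["cu"]:
            out += ["-x", "cuda"]  # clang spells the CUDA language 'cuda'
            i += 2
        else:
            out.append(tok)
            i += 1
    return out + gpu_arch_flags(tokens)


def tidy_passes(tokens, filename):
    """Return one argument list per clang-tidy run for the given file."""
    if filename.endswith(".cu"):
        return [tokens + ["--cuda-device-only"], tokens + ["--cuda-host-only"]]
    return [tokens]
```

For a .cu file, tidy_passes yields two argument lists, one per analysis pass; for any other file it yields the translated command unchanged. Adding the clang include directories is left out of the sketch.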

The header filter is set to match files under cuml/cpp/(src|include|bench|comms).
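To see which headers such a filter would cover, the pattern can be tried against sample paths. The sketch below approximates the behavior with Python's re module; clang-tidy evaluates the filter with its own regular-expression engine, so this is only an illustration, and the helper name header_is_reported and the sample paths are made up.

```python
import re

# Pattern quoted from the text above.
HEADER_FILTER = r"cuml/cpp/(src|include|bench|comms)"


def header_is_reported(path):
    """Approximate the header filter with an unanchored regex search."""
    return re.search(HEADER_FILTER, path) is not None


# Hypothetical paths: a project header matches, while a system header and a
# file under cpp/test do not.
print(header_is_reported("/work/cuml/cpp/include/cuml/svm/svc.hpp"))
print(header_is_reported("/usr/include/c++/13/vector"))
print(header_is_reported("/work/cuml/cpp/test/sg/svc_test.cu"))
```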

Usage

Run this script from the repository root after a successful CMake configuration to perform static analysis on the C++ codebase. It is used in CI via the ci/run_clang_tidy.sh script.

Code Reference

Source Location

Signature

def main():
def parse_args():
def get_all_commands(cdb):
def get_tidy_args(cmd, exe):
def run_clang_tidy(cmd, args):
def run_tidy_for_all_files(args, all_files):
def run_clang_tidy_command(tidy_cmd):
def get_clang_includes(exe):
def get_gpu_archs(command):

Import

# Run from the repository root:
python cpp/scripts/run-clang-tidy.py [-cdb PATH] [-exe PATH] [-ignore REGEX] [-select REGEX] [-j N] [-c PATH]

I/O Contract

Inputs

Name | Type | Required | Description
-cdb | string | No | Path to compile_commands.json (default: cpp/build/compile_commands.json)
-exe | string | No | Path to clang-tidy executable (default: clang-tidy)
-ignore | string | No | Regex pattern for files to ignore (default: [.]cu$)
-select | string | No | Regex pattern for files to select (default: all files)
-j | int | No | Number of parallel jobs (default: CPU count)
-c / --config | string | No | Path to TOML config file (default: pyproject.toml)
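The options in this table could be declared with argparse roughly as shown below. This is a sketch that mirrors the documented names and defaults; the function name build_parser and the help strings are assumptions, and the script's own parse_args may be organized differently.

```python
import argparse
import os


def build_parser():
    """Build a parser mirroring the options and defaults listed in the table."""
    parser = argparse.ArgumentParser(
        description="Run clang-tidy over a compilation database")
    parser.add_argument("-cdb", default="cpp/build/compile_commands.json",
                        help="path to compile_commands.json")
    parser.add_argument("-exe", default="clang-tidy",
                        help="path to the clang-tidy executable")
    parser.add_argument("-ignore", default="[.]cu$",
                        help="regex of files to skip")
    parser.add_argument("-select", default=None,
                        help="regex of files to analyze (default: all files)")
    parser.add_argument("-j", type=int, default=os.cpu_count(),
                        help="number of parallel jobs")
    parser.add_argument("-c", "--config", default="pyproject.toml",
                        help="path to the TOML config file")
    return parser


args = build_parser().parse_args(["-j", "4", "-select", "svm"])
print(args.j, args.select, args.ignore)
```

argparse accepts single-dash long options such as -cdb, and an exact option name takes precedence, so -cdb is not confused with -c.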

Outputs

Name | Type | Description
PASSED/FAILED per file | stdout | Status and diagnostics for each analyzed file
Exit code | int | 0 on success; on failure the script raises an exception, so the process exits with a non-zero status
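The reporting contract above (a per-file status line, then an exception if anything failed) can be sketched as follows. The names run_one and run_all are hypothetical. The sketch uses ThreadPool from the multiprocessing package because each job only waits on a subprocess; the script itself may use a process-based pool, and its exact output format is not reproduced here.

```python
import subprocess
import sys
from multiprocessing.pool import ThreadPool


def run_one(job):
    """Run a single command and return (name, passed, combined output)."""
    name, argv = job
    proc = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    return name, proc.returncode == 0, proc.stdout


def run_all(jobs, workers):
    """Run the jobs with up to `workers` in flight; raise if any job failed."""
    with ThreadPool(workers) as pool:
        results = pool.map(run_one, jobs)
    for name, passed, output in results:
        print(f"{'PASSED' if passed else 'FAILED'}: {name}")
        if output:
            print(output, end="")
    failed = [name for name, passed, _ in results if not passed]
    if failed:
        # An uncaught exception makes the Python process exit non-zero.
        raise RuntimeError(f"analysis failed for {len(failed)} file(s)")
    return results


if __name__ == "__main__":
    # Stand-in command instead of a real clang-tidy invocation.
    run_all([("a.cpp", [sys.executable, "-c", "pass"])], workers=2)
```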

Usage Examples

# Run clang-tidy on all C++ files (excluding .cu files by default)
python cpp/scripts/run-clang-tidy.py

# Run on specific files with 4 parallel jobs
python cpp/scripts/run-clang-tidy.py -j 4 -select "svm"

# Use a custom compilation database path
python cpp/scripts/run-clang-tidy.py -cdb cpp/build_debug/compile_commands.json
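How the -select and -ignore patterns might interact is sketched below. The helper keep_file is hypothetical, and two details are assumptions rather than documented behavior: that the patterns are applied as unanchored searches, and that the ignore pattern is checked before the select pattern.

```python
import re


def keep_file(path, select=None, ignore="[.]cu$"):
    """Decide whether a source file should be analyzed."""
    if ignore and re.search(ignore, path):
        return False  # skipped by the ignore pattern (default: .cu files)
    if select and not re.search(select, path):
        return False  # a select pattern was given and does not match
    return True


files = ["cpp/src/svm/svc.cpp", "cpp/src/svm/smo.cu", "cpp/src/glm/ols.cpp"]
print([f for f in files if keep_file(f, select="svm")])
```

Under these assumptions, -select "svm" with the default ignore pattern keeps only the non-.cu files whose path contains "svm".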
