Implementation:InternLM Lmdeploy Gemm TunerParams

From Leeroopedia
Revision as of 15:15, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/InternLM_Lmdeploy_Gemm_TunerParams.md)


Knowledge Sources
Domains GPU_Kernels, GEMM
Last Updated 2026-02-07 15:00 GMT

Overview

Defines the TuningParams structure and parsing/generation functions for configuring the GEMM autotuner's search strategy, including split-K limits, swizzle factors, and hierarchical batch-size tuning sequences.

Description

This module (header + implementation) manages the tuning parameter space:

TuningParams struct:

  • max_splits: Maximum split-K factor (default 8)
  • max_waves: Maximum wave count (default 10)
  • swizzle: Allowed swizzle factors (default {0, 3})
  • top_k: Top-K selection for hierarchical kernel sampling (default 0)
  • clusters: Number of kernel clusters (default 5)
  • min_iter / max_iter: Iteration bounds for measurement (defaults 1 and 10)
  • max_time: Time budget in seconds (default 1.0)
  • seq: Batch-size tuning sequence
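
As an illustration of how iteration bounds and a time budget can interact, the following is a minimal sketch, not the tuner's actual measurement loop. It assumes that at least min_iter runs are always performed, that measurement stops once max_time seconds have accumulated, and that max_iter is a hard cap. The function name MeasureIterations and the run_once callback are hypothetical.

```cpp
#include <functional>

// Hypothetical sketch: returns how many timed runs would be performed.
// Assumption: `run_once` executes one kernel launch and returns its
// elapsed time in seconds.
int MeasureIterations(int min_iter, int max_iter, float max_time,
                      const std::function<float()>& run_once)
{
    float elapsed = 0.f;
    int   iters   = 0;
    while (iters < max_iter) {  // max_iter is a hard upper bound
        elapsed += run_once();
        ++iters;
        // Stop early only after the minimum number of runs is done
        // and the time budget has been used up.
        if (iters >= min_iter && elapsed >= max_time) {
            break;
        }
    }
    return iters;
}
```

Under these assumptions, fast kernels are measured up to max_iter times, while slow kernels stop as soon as both the minimum run count and the time budget are satisfied.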

Parsing (params.cc):

  • ParseTuningParams: Parses a comma-separated key=value string (e.g., "max_splits=8,top_k=10")
  • ParseTuningSequence: Parses a tuning sequence from a comma-separated list of generators (e.g., "16-16-128,256-128-1024,8192"), where each start-next-step triplet defines a geometric progression and the trailing single value marks the end of the sequence
  • GenerateTuningSequence: Expands generator triplets into a flat sequence of batch sizes, with a sentinel max value
  • GetDefaultTuningGenerators: Returns default generators {{8,16,8}, {16,64,16}, {65536}}
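
The key=value format accepted by ParseTuningParams can be illustrated with a short sketch. This is not the code in params.cc: SplitKeyValues, DemoParams, and ApplyDemoParams are hypothetical names, DemoParams mirrors only three of the fields, and skipping unknown keys or entries without '=' is an assumption of the sketch.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: split "k1=v1,k2=v2" into a key -> value map.
std::map<std::string, std::string> SplitKeyValues(const std::string& str)
{
    std::map<std::string, std::string> kv;
    std::stringstream ss(str);
    std::string       item;
    while (std::getline(ss, item, ',')) {
        const auto pos = item.find('=');
        if (pos == std::string::npos) {
            continue;  // assumption: entries without '=' are skipped
        }
        kv[item.substr(0, pos)] = item.substr(pos + 1);
    }
    return kv;
}

// Simplified stand-in for TuningParams (only three fields shown).
struct DemoParams {
    int   max_splits = 8;
    float top_k      = 0;
    int   max_iter   = 10;
};

// Overwrite only the fields named in the string; others keep defaults.
void ApplyDemoParams(DemoParams& p, const std::string& str)
{
    for (const auto& [key, value] : SplitKeyValues(str)) {
        if (key == "max_splits") {
            p.max_splits = std::stoi(value);
        }
        else if (key == "top_k") {
            p.top_k = std::stof(value);
        }
        else if (key == "max_iter") {
            p.max_iter = std::stoi(value);
        }
        // assumption: unknown keys are ignored in this sketch
    }
}
```

Fields that are not mentioned in the string keep their default values, which matches the pattern of default-initialized members in the struct.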

Usage

Configured via environment variables or command-line arguments to control the GEMM tuner's search behavior.

Code Reference

Source Location

Signature

struct TuningParams {
    int max_splits = 8;
    int max_waves  = 10;
    std::vector<int> swizzle{0, 3};
    float top_k    = 0;
    int   clusters = 5;
    int   min_iter = 1;
    int   max_iter = 10;
    float max_time = 1.f;
    std::vector<int> seq;
};

void ParseTuningParams(TuningParams& params, const std::string& str);
std::vector<int> ParseTuningSequence(const std::string& str);
std::vector<int> GenerateTuningSequence(const std::vector<std::array<int,3>>& generators);
std::vector<std::array<int,3>> GetDefaultTuningGenerators();

Import

#include "src/turbomind/kernels/gemm/tuner/params.h"

I/O Contract

Inputs

Name | Type | Required | Description
str | std::string | Yes | Comma-separated key=value tuning parameter string
generators | vector<array<int,3>> | Only for sequence generation | Triplet generators (start, next, step)

Outputs

Name | Type | Description
params | TuningParams& | Populated tuning configuration
sequence | vector<int> | Generated batch-size tuning sequence

Usage Examples

TuningParams params;
ParseTuningParams(params, "max_splits=4,top_k=5,max_iter=20");

// Generate tuning sequence
auto seq = ParseTuningSequence("8-16-128,256-128-1024,4096");
// seq = {8, 16, 24, 32, ..., 128, 256, 384, ..., 1024, ..., 4096}
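
To make the generator string format concrete, here is a minimal sketch that only tokenizes such a string into triplets. It does not expand them into batch sizes, since the expansion rule lives in GenerateTuningSequence. SplitGenerators is a hypothetical helper, and filling missing fields with 0 (as for the trailing single value) is an assumption of the sketch.

```cpp
#include <array>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turn "a-b-c,d-e-f,g" into groups of three ints.
// Assumption: fields missing from a group are left as 0.
std::vector<std::array<int, 3>> SplitGenerators(const std::string& str)
{
    std::vector<std::array<int, 3>> out;
    std::stringstream groups(str);
    std::string       group;
    while (std::getline(groups, group, ',')) {
        std::array<int, 3> g{};  // zero-initialized
        std::stringstream  fields(group);
        std::string        field;
        int                i = 0;
        while (i < 3 && std::getline(fields, field, '-')) {
            g[i++] = std::stoi(field);
        }
        out.push_back(g);
    }
    return out;
}
```

For the string "8-16-128,256-128-1024,4096" this yields three groups, the last of which carries only its first field.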

Related Pages
