Implementation:InternLM Lmdeploy Gemm TunerParams
| Knowledge Sources | |
|---|---|
| Domains | GPU_Kernels, GEMM |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
Defines the TuningParams structure and parsing/generation functions for configuring the GEMM autotuner's search strategy, including split-K limits, swizzle factors, and hierarchical batch-size tuning sequences.
Description
This module (header + implementation) manages the tuning parameter space:
TuningParams struct:
- max_splits: Maximum split-K factor (default 8)
- max_waves: Maximum wave count (default 10)
- swizzle: Allowed swizzle factors (default {0, 3})
- top_k: Top-K selection for hierarchical kernel sampling (default 0)
- clusters: Number of kernel clusters (default 5)
- min_iter / max_iter: Iteration bounds for measurement (defaults 1 and 10)
- max_time: Time budget in seconds (default 1.0)
- seq: Batch-size tuning sequence
Parsing (params.cc):
- ParseTuningParams: Parses a comma-separated key=value string (e.g., "max_splits=8,top_k=10")
- ParseTuningSequence: Parses a tuning sequence from triplet generators (e.g., "16-16-128,256-128-1024,8192"), where each triplet start-next-step defines a geometric progression
- GenerateTuningSequence: Expands generator triplets into a flat sequence of batch sizes, with a sentinel max value
- GetDefaultTuningGenerators: Returns default generators {{8,16,8}, {16,64,16}, {65536}}
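To make the key=value format concrete, here is a minimal sketch of how such a string could be applied to a parameter struct. It is not the code in params.cc: `SketchParams`, `SplitOn`, and `ParseSketchParams` are hypothetical names, only the scalar fields are handled, and the choice to skip unknown keys and malformed tokens is an assumption.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Simplified stand-in for TuningParams (scalar fields only, same defaults).
struct SketchParams {
    int   max_splits = 8;
    int   max_waves  = 10;
    float top_k      = 0;
    int   clusters   = 5;
    int   min_iter   = 1;
    int   max_iter   = 10;
    float max_time   = 1.f;
};

// Hypothetical helper: split `str` on `delim`, skipping empty tokens.
inline std::vector<std::string> SplitOn(const std::string& str, char delim)
{
    std::vector<std::string> out;
    std::stringstream        ss(str);
    std::string              token;
    while (std::getline(ss, token, delim)) {
        if (!token.empty()) {
            out.push_back(token);
        }
    }
    return out;
}

// Hypothetical parser: applies each "key=value" pair to `params`.
// Assumption: unknown keys and tokens without '=' are silently skipped.
// std::stoi / std::stof throw if a value is not numeric.
inline void ParseSketchParams(SketchParams& params, const std::string& str)
{
    for (const auto& token : SplitOn(str, ',')) {
        const auto eq = token.find('=');
        if (eq == std::string::npos) {
            continue;
        }
        const std::string key   = token.substr(0, eq);
        const std::string value = token.substr(eq + 1);
        if (key == "max_splits")     params.max_splits = std::stoi(value);
        else if (key == "max_waves") params.max_waves  = std::stoi(value);
        else if (key == "top_k")     params.top_k      = std::stof(value);
        else if (key == "clusters")  params.clusters   = std::stoi(value);
        else if (key == "min_iter")  params.min_iter   = std::stoi(value);
        else if (key == "max_iter")  params.max_iter   = std::stoi(value);
        else if (key == "max_time")  params.max_time   = std::stof(value);
    }
}
```

In this sketch, calling `ParseSketchParams(p, "max_splits=4,top_k=5,max_iter=20")` on a default-constructed `p` changes only those three fields; the rest keep their defaults.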
Usage
The tuning parameters are supplied through environment variables or command-line arguments and control the GEMM tuner's search behavior.
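As an illustration of the environment-variable route, the sketch below reads a parameter string from the process environment and falls back to a default when the variable is unset. The helper name `ReadTuningSpec` is hypothetical, and this page does not state which variable name the tuner actually reads, so the name is left as an argument.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the value of environment variable `name`,
// or `fallback` when the variable is not set.
inline std::string ReadTuningSpec(const char* name, const std::string& fallback = "")
{
    const char* value = std::getenv(name);  // nullptr when the variable is unset
    return value ? std::string(value) : fallback;
}
```

The returned string could then be handed to a parser such as ParseTuningParams.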
Code Reference
Source Location
- Repository: InternLM_Lmdeploy
- File: src/turbomind/kernels/gemm/tuner/params.h
- File: src/turbomind/kernels/gemm/tuner/params.cc
Signature
struct TuningParams {
    int max_splits = 8;
    int max_waves  = 10;
    std::vector<int> swizzle{0, 3};
    float top_k    = 0;
    int clusters   = 5;
    int min_iter   = 1;
    int max_iter   = 10;
    float max_time = 1.f;
    std::vector<int> seq;
};

void ParseTuningParams(TuningParams& params, const std::string& str);
std::vector<int> ParseTuningSequence(const std::string& str);
std::vector<int> GenerateTuningSequence(const std::vector<std::array<int,3>>& generators);
std::vector<std::array<int,3>> GetDefaultTuningGenerators();
Import
#include "src/turbomind/kernels/gemm/tuner/params.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| str | std::string | Yes | Comma-separated key=value parameter string (ParseTuningParams) or comma-separated triplet generator string (ParseTuningSequence) |
| generators | vector<array<int,3>> | Only for GenerateTuningSequence | Triplet generators (start, next, step) |
Outputs
| Name | Type | Description |
|---|---|---|
| params | TuningParams& | Populated tuning configuration |
| sequence | vector<int> | Generated batch-size tuning sequence |
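The relationship between the triplet string and the `generators` input can be sketched as follows. This is an illustrative tokenizer only, under the assumption that a bare integer token maps to a triplet whose remaining slots are zero, which matches how `std::array<int,3>{65536}` value-initializes in the default generators. `ParseGeneratorsSketch` is a hypothetical name, and the expansion of generators into batch sizes is deliberately left out.

```cpp
#include <array>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turns "a-b-c,d-e-f,g" into {{a,b,c},{d,e,f},{g,0,0}}.
// Tokens with fewer than three '-'-separated fields leave the rest at 0.
// std::stoi throws if a field is not numeric.
inline std::vector<std::array<int, 3>> ParseGeneratorsSketch(const std::string& str)
{
    std::vector<std::array<int, 3>> generators;
    std::stringstream               tokens(str);
    std::string                     token;
    while (std::getline(tokens, token, ',')) {
        if (token.empty()) {
            continue;
        }
        std::array<int, 3> g{0, 0, 0};
        std::stringstream  fields(token);
        std::string        field;
        for (int i = 0; i < 3 && std::getline(fields, field, '-'); ++i) {
            g[i] = std::stoi(field);
        }
        generators.push_back(g);
    }
    return generators;
}
```

For the string "16-16-128,256-128-1024,8192" this yields three generators, the last being the sentinel {8192, 0, 0}.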
Usage Examples
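#include "src/turbomind/kernels/gemm/tuner/params.h"

// Parse overrides into a default-initialized TuningParams
TuningParams params;
ParseTuningParams(params, "max_splits=4,top_k=5,max_iter=20");

// Generate a tuning sequence from triplet generators and a final max value
auto seq = ParseTuningSequence("8-16-128,256-128-1024,4096");
// seq = {8, 16, 24, 32, ..., 128, 256, 384, ..., 1024, ..., 4096}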