Implementation:InternLM Lmdeploy Gemm DispatchCache
| Knowledge Sources | |
|---|---|
| Domains | GPU_Kernels, GEMM |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
Implements a cache that maps GEMM problem descriptors (GemmDesc) to previously-tuned kernel launch specifications (LaunchSpec), with serialization support for persisting tuning results.
Description
The DispatchCache uses a PIMPL (pointer-to-implementation) pattern to store a sorted mapping from GemmDesc to LaunchSpec:
- Find: Exact-match lookup for a given GemmDesc
- LowerBound: Finds the closest match with dimensions less than or equal to the query (used for interpolation when an exact size hasn't been tuned)
- Insert: Adds a new tuning result to the cache
- Export: Serializes the cache to an output stream for persistence
- Import: Deserializes from an input stream, resolving kernel pointers from the provided kernel list
The cache is constructed with the available kernel list to enable pointer resolution during import.
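The lookup semantics described above can be illustrated with a sorted map. The following is a minimal sketch under stated assumptions, not the lmdeploy implementation: `ToyDesc`, `ToySpec`, and `ToyCache` are hypothetical stand-ins, the descriptor is reduced to three integers (m, n, k), and "less than or equal to the query" is interpreted as the greatest key that does not exceed the query in lexicographic order. The real cache may compare descriptors differently.

```cpp
#include <iterator>
#include <map>
#include <optional>
#include <tuple>

// Hypothetical stand-ins for GemmDesc and LaunchSpec (not the lmdeploy types).
struct ToyDesc {
    int m, n, k;
};
struct ToySpec {
    int kernel_id;  // index of the tuned kernel
    int splits;     // illustrative launch parameter
};

// Lexicographic ordering on (m, n, k) so ToyDesc can serve as a std::map key.
inline bool operator<(const ToyDesc& a, const ToyDesc& b)
{
    return std::tie(a.m, a.n, a.k) < std::tie(b.m, b.n, b.k);
}

class ToyCache {
public:
    // Exact-match lookup.
    std::optional<ToySpec> Find(const ToyDesc& desc) const
    {
        auto it = map_.find(desc);
        if (it == map_.end()) return std::nullopt;
        return it->second;
    }

    // Greatest entry whose key is <= desc in the ordering above.
    std::optional<ToySpec> LowerBound(const ToyDesc& desc) const
    {
        auto it = map_.upper_bound(desc);             // first key strictly greater than desc
        if (it == map_.begin()) return std::nullopt;  // every stored key is greater
        return std::prev(it)->second;
    }

    // Returns true if the entry was newly added, false if the key already existed.
    bool Insert(const ToyDesc& desc, const ToySpec& spec)
    {
        return map_.emplace(desc, spec).second;
    }

private:
    std::map<ToyDesc, ToySpec> map_;
};
```

With entries stored for m = 16 and m = 64, a query for m = 32 misses in `Find` but falls back to the m = 16 entry in `LowerBound`; a query smaller than every stored key returns an empty optional from both.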
Usage
Used by the Gemm class to avoid re-measuring kernels for previously-seen problem sizes. Tuning results persist across sessions via Export/Import.
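Persisting results across sessions requires that kernel pointers be re-established on load, since a raw pointer is meaningless in a new process. The sketch below shows one way this could work, assuming each kernel has a unique, whitespace-free name that is written to the stream and matched against the kernel list on import. `ToyKernel`, `ToyPersistentCache`, and the one-line-per-entry text format are all hypothetical; the actual lmdeploy serialization format is not described here.

```cpp
#include <array>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical kernel type; only a name is needed for this sketch.
struct ToyKernel {
    std::string name;
};

using Key = std::array<int, 3>;  // (m, n, k), a simplified problem descriptor

class ToyPersistentCache {
public:
    explicit ToyPersistentCache(std::vector<const ToyKernel*> kernels): kernels_(std::move(kernels)) {}

    bool Insert(const Key& key, const ToyKernel* kernel)
    {
        return map_.emplace(key, kernel).second;
    }

    const ToyKernel* Find(const Key& key) const
    {
        auto it = map_.find(key);
        return it == map_.end() ? nullptr : it->second;
    }

    // Writes one line per entry: "m n k kernel_name". Returns the number written.
    int Export(std::ostream& os) const
    {
        int count = 0;
        for (const auto& [key, kernel] : map_) {
            os << key[0] << ' ' << key[1] << ' ' << key[2] << ' ' << kernel->name << '\n';
            ++count;
        }
        return count;
    }

    // Reads lines written by Export. Each kernel name is resolved against the
    // list given at construction; entries naming an unknown kernel are skipped.
    // Returns the number of entries actually added.
    int Import(std::istream& is)
    {
        int count = 0;
        int m = 0, n = 0, k = 0;
        std::string name;
        while (is >> m >> n >> k >> name) {
            for (const ToyKernel* kernel : kernels_) {
                if (kernel->name == name) {
                    if (Insert({m, n, k}, kernel)) ++count;
                    break;
                }
            }
        }
        return count;
    }

private:
    std::vector<const ToyKernel*>   kernels_;
    std::map<Key, const ToyKernel*> map_;
};
```

Skipping entries whose kernel is no longer available is a design choice of this sketch: it keeps a stale cache file from producing dangling or null kernel pointers when the kernel set changes between sessions.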
Code Reference
Source Location
- Repository: InternLM_Lmdeploy
- File: src/turbomind/kernels/gemm/dispatch_cache.h
Signature
class DispatchCache {
public:
    DispatchCache(std::vector<Kernel*> kernels);
    ~DispatchCache();
    std::optional<LaunchSpec> LowerBound(const GemmDesc& desc) const;
    std::optional<LaunchSpec> Find(const GemmDesc& desc) const;
    bool Insert(const GemmDesc& desc, const LaunchSpec& spec);
    int Export(std::ostream& os) const;
    int Import(std::istream& is);
};
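The user-declared destructor in the signature is consistent with the PIMPL pattern mentioned in the description. As a rough illustration of that pattern only, the sketch below hides a sorted container behind a forward-declared `Impl` struct. The class name `ToyDispatchCache`, the `int` key and value types, and the use of `std::unique_ptr` are assumptions made for the example and are not taken from the lmdeploy source.

```cpp
#include <map>
#include <memory>
#include <optional>

// --- What would live in the header (hypothetical names) ---
class ToyDispatchCache {
public:
    ToyDispatchCache();
    ~ToyDispatchCache();  // declared here, defined where Impl is a complete type

    bool Insert(int key, int value);
    std::optional<int> Find(int key) const;

private:
    struct Impl;                  // forward declaration only
    std::unique_ptr<Impl> impl_;  // owning pointer to the hidden state
};

// --- What would live in the source file ---
struct ToyDispatchCache::Impl {
    std::map<int, int> entries;  // sorted container kept out of the header
};

ToyDispatchCache::ToyDispatchCache(): impl_(std::make_unique<Impl>()) {}

// Defaulted here so that std::unique_ptr<Impl> sees the complete Impl type.
ToyDispatchCache::~ToyDispatchCache() = default;

bool ToyDispatchCache::Insert(int key, int value)
{
    return impl_->entries.emplace(key, value).second;
}

std::optional<int> ToyDispatchCache::Find(int key) const
{
    auto it = impl_->entries.find(key);
    if (it == impl_->entries.end()) return std::nullopt;
    return it->second;
}
```

The benefit of this layout is that code including the header does not need to see the container type or its comparator, so the storage details can change without recompiling callers.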
Import
#include "src/turbomind/kernels/gemm/dispatch_cache.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| kernels | vector<Kernel*> | Yes | Available kernels for pointer resolution |
| desc | GemmDesc | Yes | Problem descriptor to look up or insert |
| spec | LaunchSpec | For Insert | Tuned launch configuration |
Outputs
| Name | Type | Description |
|---|---|---|
| Find, LowerBound | std::optional<LaunchSpec> | Cached launch specification, or an empty optional if no entry matches |
| Export, Import | int | Number of entries serialized or deserialized |
Usage Examples
DispatchCache cache(kernels);
cache.Import(input_stream);  // Load previous tuning results (std::istream)
if (auto spec = cache.Find(desc)) {
    // Use the cached launch specification in *spec
} else {
    // Tune, then record the measured result
    cache.Insert(desc, measured_spec);
}
cache.Export(output_stream);  // Persist results (std::ostream, separate from the input stream)