Implementation:InternLM Lmdeploy Gemm DispatchCache


Knowledge Sources
Domains GPU_Kernels, GEMM
Last Updated 2026-02-07 15:00 GMT

Overview

Implements a cache that maps GEMM problem descriptors (GemmDesc) to previously-tuned kernel launch specifications (LaunchSpec), with serialization support for persisting tuning results.

Description

DispatchCache uses the PIMPL (pointer-to-implementation) pattern to hide a sorted mapping from GemmDesc to LaunchSpec behind a small public interface. It exposes the following operations:

  • Find: Exact match lookup for a given GemmDesc
  • LowerBound: Finds the closest match with dimensions less than or equal to the query (used for interpolation when an exact size hasn't been tuned)
  • Insert: Adds a new tuning result to the cache
  • Export: Serializes the cache to an output stream for persistence
  • Import: Deserializes from an input stream, resolving kernel pointers from the provided kernel list

The cache is constructed with the available kernel list to enable pointer resolution during import.
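The difference between Find and LowerBound can be illustrated with a minimal sketch over std::map. Everything here is an assumption made for illustration: ToyDesc, ToySpec, ToyCache, and the (n, k, m) ordering are hypothetical stand-ins, and the real GemmDesc ordering and compatibility rules are not described on this page.

```cpp
#include <map>
#include <optional>
#include <tuple>

// Hypothetical, simplified stand-ins for GemmDesc and LaunchSpec.
struct ToyDesc {
    int m, n, k;
};

struct ToySpec {
    int kernel_id;  // which kernel to launch
    int splits;     // split-k factor
};

// Assumed ordering: by (n, k, m), so entries sharing n and k are adjacent
// and sorted by m.
struct ToyDescLess {
    bool operator()(const ToyDesc& a, const ToyDesc& b) const {
        return std::tie(a.n, a.k, a.m) < std::tie(b.n, b.k, b.m);
    }
};

class ToyCache {
public:
    // Returns false if an entry for desc already exists.
    bool Insert(const ToyDesc& desc, const ToySpec& spec) {
        return map_.emplace(desc, spec).second;
    }

    // Exact-match lookup.
    std::optional<ToySpec> Find(const ToyDesc& desc) const {
        auto it = map_.find(desc);
        if (it == map_.end()) return std::nullopt;
        return it->second;
    }

    // Entry with the largest tuned m <= desc.m and the same n and k, if any.
    std::optional<ToySpec> LowerBound(const ToyDesc& desc) const {
        auto it = map_.upper_bound(desc);  // first entry strictly greater than desc
        if (it == map_.begin()) return std::nullopt;
        --it;  // last entry less than or equal to desc
        if (it->first.n != desc.n || it->first.k != desc.k) return std::nullopt;
        return it->second;
    }

private:
    std::map<ToyDesc, ToySpec, ToyDescLess> map_;
};
```

With entries tuned for m = 16 and m = 64 (same n and k), a query for m = 32 misses in Find but LowerBound returns the m = 16 entry.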

Usage

The Gemm class uses DispatchCache to avoid re-measuring kernels for problem sizes it has already seen. Tuning results persist across sessions via Export/Import.

Code Reference

Source Location

Signature

class DispatchCache {
public:
    DispatchCache(std::vector<Kernel*> kernels);
    ~DispatchCache();

    std::optional<LaunchSpec> LowerBound(const GemmDesc& desc) const;
    std::optional<LaunchSpec> Find(const GemmDesc& desc) const;
    bool Insert(const GemmDesc& desc, const LaunchSpec& spec);
    int Export(std::ostream& os) const;
    int Import(std::istream& is);
};
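The signature above lists only public members, which fits the PIMPL layout mentioned in the description. As a rough sketch of that pattern (not the actual lmdeploy source), a class can forward-declare a private Impl struct and hold it through std::unique_ptr. ToyDispatchCache and its int-to-int mapping are hypothetical placeholders.

```cpp
#include <map>
#include <memory>
#include <optional>

// --- what would go in the header ---
class ToyDispatchCache {
public:
    ToyDispatchCache();
    ~ToyDispatchCache();  // declared here, defined where Impl is a complete type

    bool Insert(int key, int value);
    std::optional<int> Find(int key) const;

private:
    struct Impl;                  // defined only in the implementation file
    std::unique_ptr<Impl> impl_;  // the header never needs to include <map>
};

// --- what would go in the implementation file ---
struct ToyDispatchCache::Impl {
    std::map<int, int> entries;  // sorted mapping hidden from users of the header
};

ToyDispatchCache::ToyDispatchCache() : impl_(std::make_unique<Impl>()) {}

// Must be defined after Impl is complete so unique_ptr can delete it.
ToyDispatchCache::~ToyDispatchCache() = default;

bool ToyDispatchCache::Insert(int key, int value) {
    return impl_->entries.emplace(key, value).second;
}

std::optional<int> ToyDispatchCache::Find(int key) const {
    auto it = impl_->entries.find(key);
    if (it == impl_->entries.end()) return std::nullopt;
    return it->second;
}
```

The out-of-line destructor is the one detail that is easy to miss: with std::unique_ptr to an incomplete type, a destructor defaulted in the header fails to compile.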

Import

#include "src/turbomind/kernels/gemm/dispatch_cache.h"

I/O Contract

Inputs

Name    | Type                 | Required    | Description
kernels | std::vector<Kernel*> | Yes         | Available kernels for pointer resolution
desc    | GemmDesc             | Yes         | Problem descriptor to look up or insert
spec    | LaunchSpec           | Insert only | Tuned launch configuration

Outputs

Name              | Type                      | Description
Find / LowerBound | std::optional<LaunchSpec> | Cached launch specification; empty if no match is found
Export / Import   | int                       | Number of entries serialized / deserialized

Usage Examples

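#include <fstream>

// `kernels`, `desc`, and `measured_spec` are assumed to be supplied by the caller;
// "tuning_cache.txt" is a placeholder path.
DispatchCache cache(kernels);
std::ifstream ifs("tuning_cache.txt");
if (ifs) {
    cache.Import(ifs);  // Load previous tuning results, if the file exists
}
if (auto spec = cache.Find(desc)) {
    // Use cached kernel
} else {
    // Tune and insert
    cache.Insert(desc, measured_spec);
}
std::ofstream ofs("tuning_cache.txt");  // Export needs an output stream, not the input stream
cache.Export(ofs);  // Persist results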
