Implementation:InternLM Lmdeploy Core Array

From Leeroopedia


Knowledge Sources
Domains GPU_Kernels, Data_Structures
Last Updated 2026-02-07 15:00 GMT

Overview

Fixed-size array container for CUDA host and device code, providing an STL-like interface with specializations for the sub-byte types uint4_t and fp4_e2m1_t.

Description

The Array<T, N> template struct is a lightweight, fixed-size array usable in both host and device code within the TurboMind kernel library. It mirrors the interface of std::array: iterators, element access (operator[], front(), back()), and pointer access through data().

Two partial specializations handle sub-byte types: Array<uint4_t, N> and Array<fp4_e2m1_t, N>. Each packs eight 4-bit elements into one underlying storage unit. Because a single 4-bit element is not byte-addressable, data() on these specializations returns a SubBytePtr instead of a raw pointer. Static assertions verify that the sub-byte specializations have the expected packed size.
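The std::array-like interface described above can be sketched in a few lines. This is a minimal illustration, not the TurboMind source: MiniArray, SKETCH_HOST_DEVICE, the member name a_, and the helpers sum and mini_array_demo are all hypothetical names chosen for this example, and only the generic (non-sub-byte) case is shown.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for TM_HOST_DEVICE: CUDA qualifiers only when compiled by nvcc.
#if defined(__CUDACC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

// MiniArray is an illustrative name, not the TurboMind Array. Requires C++14 or later.
template<typename T, int N>
struct MiniArray {
    static_assert(N > 0, "N must be positive");

    using size_type       = int;
    using reference       = T&;
    using const_reference = const T&;
    using pointer         = T*;
    using iterator        = T*;
    using const_iterator  = const T*;

    T a_[N];  // public storage keeps the struct an aggregate, so brace-init works

    SKETCH_HOST_DEVICE constexpr reference operator[](size_type i) noexcept { return a_[i]; }
    SKETCH_HOST_DEVICE constexpr const_reference operator[](size_type i) const noexcept { return a_[i]; }

    SKETCH_HOST_DEVICE constexpr reference front() noexcept { return a_[0]; }
    SKETCH_HOST_DEVICE constexpr reference back() noexcept { return a_[N - 1]; }
    SKETCH_HOST_DEVICE constexpr pointer data() noexcept { return a_; }

    SKETCH_HOST_DEVICE constexpr iterator begin() noexcept { return a_; }
    SKETCH_HOST_DEVICE constexpr iterator end() noexcept { return a_ + N; }
    SKETCH_HOST_DEVICE constexpr const_iterator begin() const noexcept { return a_; }
    SKETCH_HOST_DEVICE constexpr const_iterator end() const noexcept { return a_ + N; }

    // Returning a type rather than a plain int keeps N available at compile time.
    SKETCH_HOST_DEVICE static constexpr std::integral_constant<int, N> size() noexcept { return {}; }
};

// Sum the elements through the iterator interface, as one would with std::array.
template<typename T, int N>
SKETCH_HOST_DEVICE T sum(const MiniArray<T, N>& x)
{
    T s{};
    for (const T& v : x) {
        s += v;
    }
    return s;
}

// Small host-side check of the interface.
inline void mini_array_demo()
{
    MiniArray<float, 4> frag{{1.f, 2.f, 3.f, 4.f}};
    static_assert(decltype(frag.size())::value == 4, "size() is a compile-time constant");
    assert(frag.front() == 1.f && frag.back() == 4.f);
    assert(sum(frag) == 10.f);
}
```

Keeping the storage member public and avoiding user-declared constructors leaves the struct an aggregate, which is what allows brace initialization and trivial copies in this sketch.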

Usage

Use Array<T, N> as the fundamental register-resident data container in CUDA kernels. It serves as the building block for vectorized loads, stores, MMA fragment storage, and element-wise arithmetic throughout the TurboMind core kernel library.
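The load, compute, store pattern mentioned above might look roughly like the following host-side sketch. It uses std::array as a stand-in so that it compiles without the TurboMind headers; Frag, load, store, and axpy are hypothetical names for this illustration and are not taken from the library. In a real kernel the copy in load and store would typically be a single wide memory transaction rather than std::memcpy.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for Array<T, N>: std::array exposes the same operator[] and data() shape.
template<typename T, std::size_t N>
using Frag = std::array<T, N>;

// Load N contiguous elements from memory into the fragment in one copy.
template<typename T, std::size_t N>
inline void load(Frag<T, N>& dst, const T* src)
{
    assert(src != nullptr);
    std::memcpy(dst.data(), src, sizeof(T) * N);
}

// Store the fragment back to N contiguous elements in memory.
template<typename T, std::size_t N>
inline void store(T* dst, const Frag<T, N>& src)
{
    assert(dst != nullptr);
    std::memcpy(dst, src.data(), sizeof(T) * N);
}

// Element-wise a * x + y. The trip count is a compile-time constant,
// so the compiler is free to unroll the loop completely.
template<typename T, std::size_t N>
inline Frag<T, N> axpy(T a, const Frag<T, N>& x, const Frag<T, N>& y)
{
    Frag<T, N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        r[i] = a * x[i] + y[i];
    }
    return r;
}
```

A fixed N known at compile time is what makes this style practical: every index in the unrolled loop becomes a constant, which is the condition under which a small array can stay in registers in device code.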

Code Reference

Source Location

Signature

template<typename T, int N>
struct Array {
    T __a[N];  // element storage
    TM_HOST_DEVICE constexpr reference operator[](size_type i) noexcept;  // element access
    TM_HOST_DEVICE constexpr pointer data() noexcept;                     // pointer to the first element
    TM_HOST_DEVICE constexpr iterator begin() noexcept;
    TM_HOST_DEVICE constexpr iterator end() noexcept;
    // The size is returned as a type, so N stays usable as a compile-time constant
    TM_HOST_DEVICE static constexpr std::integral_constant<int, N> size() noexcept;
};

// Sub-byte specializations
template<int N> struct Array<uint4_t, N>;
template<int N> struct Array<fp4_e2m1_t, N>;
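To make the packing in the sub-byte specializations concrete, here is a minimal sketch of storing eight 4-bit values per 32-bit word. PackedU4, words, get, and set are hypothetical names for this illustration; the real specializations expose access through SubBytePtr instead, and the nibble order used here (element 0 in the lowest four bits) is an assumption, not something the page states.

```cpp
#include <cassert>
#include <cstdint>

// PackedU4 is a hypothetical illustration, not the TurboMind Array<uint4_t, N>.
template<int N>
struct PackedU4 {
    static_assert(N > 0 && N % 8 == 0, "N must be a positive multiple of 8");

    std::uint32_t words[N / 8];  // eight 4-bit elements per 32-bit word

    // Read element i. Assumed layout: element 0 sits in the lowest nibble.
    constexpr unsigned get(int i) const noexcept
    {
        assert(0 <= i && i < N);
        return (words[i / 8] >> ((i % 8) * 4)) & 0xFu;
    }

    // Write the low 4 bits of v into element i, leaving its neighbours untouched.
    constexpr void set(int i, unsigned v) noexcept
    {
        assert(0 <= i && i < N);
        const int      shift = (i % 8) * 4;
        std::uint32_t& w     = words[i / 8];
        w = (w & ~(std::uint32_t{0xF} << shift)) | (static_cast<std::uint32_t>(v & 0xFu) << shift);
    }
};

// Size checks in the spirit of the static assertions mentioned in the description.
static_assert(sizeof(PackedU4<8>) == 4, "8 x 4 bits = 4 bytes");
static_assert(sizeof(PackedU4<16>) == 8, "16 x 4 bits = 8 bytes");
```

The read-modify-write in set shows why a plain T* cannot address one element: the smallest addressable unit holds several elements, so a proxy pointer type is needed to carry both the word address and the position inside it.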

Import

#include "src/turbomind/kernels/core/array.h"

I/O Contract

Inputs

Name | Type | Required | Description
T | typename | Yes | Element type (e.g., float, half, uint4_t, fp4_e2m1_t)
N | int | Yes | Number of elements (must be > 0; must be a multiple of 8 for sub-byte types)

Outputs

Name | Type | Description
Array<T,N> | struct | Fixed-size array of N elements of type T; in device code the compiler can typically keep it in registers when it is indexed with compile-time constants

Usage Examples

// Declare and access a register array of 4 floats
Array<float, 4> frag;
frag[0] = 1.0f;
frag[1] = 2.0f;

// Sub-byte 4-bit quantized weights (16 x 4-bit packed into 8 bytes)
Array<uint4_t, 16> quant_weights;
auto ptr = quant_weights.data();  // returns SubBytePtr<uint4_t>
