Implementation:Triton inference server Server MemoryAllocTest

From Leeroopedia
Knowledge Sources
Domains Memory_Management, Testing
Last Updated 2026-02-13 17:00 GMT

Overview

Test executable for validating Triton's GPU and CPU memory allocation behavior during in-process inference.

Description

memory_alloc.cc is a standalone test executable that creates an in-process Triton server instance, loads a model, and runs inference requests with configurable input and output memory types (system, pinned, or GPU memory, including a specific GPU device). It implements custom response allocator callbacks for device memory allocation, validates outputs against expected values, and tests host policy configurations for multi-GPU setups.
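The output validation step can be illustrated with a small host-side comparison. This is only a sketch, not the code in memory_alloc.cc: the function name ValidateOutput and the int32 element type are assumptions, and it presumes any output that lives in GPU memory has already been copied to a host vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not from memory_alloc.cc): compares one output tensor,
// already copied to host memory, against its expected values.
// Returns an empty string on success, otherwise a description of the first
// problem found.
inline std::string ValidateOutput(
    const std::string& name, const std::vector<int32_t>& actual,
    const std::vector<int32_t>& expected)
{
  // A size mismatch is reported before any element is compared.
  if (actual.size() != expected.size()) {
    return name + ": expected " + std::to_string(expected.size()) +
           " elements, got " + std::to_string(actual.size());
  }
  // Report only the first differing element to keep the message short.
  for (std::size_t i = 0; i < expected.size(); ++i) {
    if (actual[i] != expected[i]) {
      return name + "[" + std::to_string(i) + "]: expected " +
             std::to_string(expected[i]) + ", got " +
             std::to_string(actual[i]);
    }
  }
  return "";
}
```

A caller would treat any non-empty return value as a test failure, print it, and exit with a non-zero code.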

Usage

Used as a QA test executable to verify that Triton correctly allocates and transfers data across CPU and GPU memory, especially in multi-GPU environments. It is a standalone program, not a library, so there is nothing to import.

Code Reference

Source Location

Signature

// Custom allocator callbacks
TRITONSERVER_Error* ResponseAlloc(
    TRITONSERVER_ResponseAllocator* allocator,
    const char* tensor_name, size_t byte_size,
    TRITONSERVER_MemoryType preferred_memory_type,
    int64_t preferred_memory_type_id,
    void* userp, void** buffer,
    void** buffer_userp,
    TRITONSERVER_MemoryType* actual_memory_type,
    int64_t* actual_memory_type_id);

TRITONSERVER_Error* ResponseRelease(
    TRITONSERVER_ResponseAllocator* allocator,
    void* buffer, void* buffer_userp,
    size_t byte_size,
    TRITONSERVER_MemoryType memory_type,
    int64_t memory_type_id);

int main(int argc, char** argv);
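The contract behind these two callbacks can be modeled without the Triton headers or CUDA. The sketch below is a host-only simplification under stated assumptions: MemType, Allocation, SketchResponseAlloc, and SketchResponseRelease are hypothetical stand-ins, not Triton types or the functions in memory_alloc.cc. It shows three points: a zero-byte request yields a null buffer, the allocator reports the memory type it actually used (which may differ from the preferred one), and release frees according to the reported type. A GPU-capable allocator would presumably call the CUDA runtime where this sketch falls back to malloc.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for TRITONSERVER_MemoryType; not the real enum.
enum class MemType { kCpu, kCpuPinned, kGpu };

// What the allocator reports back: the buffer plus where it actually lives.
struct Allocation {
  void* buffer = nullptr;
  MemType actual_type = MemType::kCpu;
  int64_t actual_type_id = 0;
};

// Host-only model of a response-allocation callback. The preferred type is
// only a hint; this sketch always satisfies the request from system memory
// and says so through actual_type / actual_type_id.
inline Allocation SketchResponseAlloc(
    std::size_t byte_size, MemType preferred_type, int64_t preferred_type_id)
{
  (void)preferred_type;     // assumption: no GPU or pinned allocator here
  (void)preferred_type_id;
  Allocation out;
  if (byte_size == 0) {
    return out;  // zero-size tensors get a null buffer, not an allocation
  }
  out.buffer = std::malloc(byte_size);
  return out;
}

// Matching release: frees according to the type reported at allocation time.
// Returns false for a type this sketch never hands out.
inline bool SketchResponseRelease(void* buffer, MemType memory_type)
{
  if (buffer == nullptr) {
    return true;  // nothing was allocated for a zero-size tensor
  }
  if (memory_type != MemType::kCpu) {
    return false;
  }
  std::free(buffer);
  return true;
}
```

Keeping the reported type alongside the buffer is what lets the release path pick the matching deallocator once several memory types are in play.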

Import

// Standalone executable - no import needed
// Build via CMakeLists.txt target

I/O Contract

Inputs

Name      Type     Required   Description
argv[1]   string   Yes        Path to model repository
argv[2]   string   Yes        Input memory type (system/pinned/gpu)
argv[3]   string   Yes        Output memory type (system/pinned/gpu)
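The memory type arguments listed above could be mapped to an internal value along these lines. This is a minimal sketch: MemoryKind and ParseMemoryKind are hypothetical names, and the case-sensitive matching is an assumption rather than documented behavior of the executable.

```cpp
#include <string>

// Hypothetical internal representation of the three accepted strings.
enum class MemoryKind { kSystem, kPinned, kGpu };

// Parses "system", "pinned", or "gpu" (case-sensitive, by assumption).
// On success writes the result to *out and returns true; on failure leaves
// *out untouched and returns false.
inline bool ParseMemoryKind(const std::string& text, MemoryKind* out)
{
  if (text == "system") {
    *out = MemoryKind::kSystem;
    return true;
  }
  if (text == "pinned") {
    *out = MemoryKind::kPinned;
    return true;
  }
  if (text == "gpu") {
    *out = MemoryKind::kGpu;
    return true;
  }
  return false;
}
```

Rejecting unknown strings up front lets the program exit with a non-zero code before a server instance is created.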

Outputs

Name        Type   Description
exit code   int    0 on success, non-zero on failure
stdout      text   Validation results and error messages

Usage Examples

Running Memory Allocation Test

# Test with GPU input and CPU output
./memory_alloc /path/to/model_repository gpu system

# Test with pinned memory
./memory_alloc /path/to/model_repository pinned pinned
