Implementation:Triton inference server Server MemoryAllocTest
| Knowledge Sources | |
|---|---|
| Domains | Memory_Management, Testing |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Test executable for validating Triton's GPU and CPU memory allocation behavior during in-process inference.
Description
memory_alloc.cc is a standalone test executable that creates an in-process Triton server instance, loads a model, and runs inference requests with configurable input/output memory types (CPU, GPU, or specific GPU device). It implements custom response allocator callbacks for device memory allocation, validates outputs against expected values, and tests host policy configurations for multi-GPU setups.
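The allocate/release pairing described above can be sketched without any Triton or CUDA dependency. This is a minimal illustration, not the code in memory_alloc.cc: `MemType`, `Allocation`, `SketchAlloc`, and `SketchRelease` are hypothetical names, and the sketch always serves host memory where a GPU-capable allocator would allocate on the requested device. It shows the core of the contract: the caller states a preferred memory type, and the callback reports the type it actually provided.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for TRITONSERVER_MemoryType, so the sketch
// compiles without tritonserver.h.
enum class MemType { kCpu, kCpuPinned, kGpu };

// What an allocation callback reports back to its caller.
struct Allocation {
  void* buffer = nullptr;
  MemType actual_type = MemType::kCpu;
  int64_t actual_type_id = 0;
};

// Allocation step: the preference is only a hint. This sketch ignores it
// and always returns host memory; a GPU-capable allocator would branch on
// preferred_type and allocate on device preferred_type_id instead.
bool SketchAlloc(
    size_t byte_size, MemType preferred_type, int64_t preferred_type_id,
    Allocation* out)
{
  assert(out != nullptr);
  (void)preferred_type;
  (void)preferred_type_id;
  out->actual_type = MemType::kCpu;
  out->actual_type_id = 0;
  // Zero-size outputs need no buffer at all.
  out->buffer = (byte_size == 0) ? nullptr : std::malloc(byte_size);
  return byte_size == 0 || out->buffer != nullptr;
}

// Release step: free with the routine that matches the actual type.
void SketchRelease(Allocation* alloc)
{
  assert(alloc != nullptr);
  if (alloc->actual_type == MemType::kCpu) {
    std::free(alloc->buffer);
  }
  alloc->buffer = nullptr;
}
```

Reporting the actual type separately from the preferred one lets the caller detect that a GPU request was satisfied from host memory and copy the data accordingly.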
Usage
Used as a QA test executable to verify that Triton correctly allocates and transfers data across CPU and GPU memory, especially in multi-GPU environments. It is a standalone binary, not a library to link against or import.
Code Reference
Source Location
- Repository: Triton Inference Server
- File: src/memory_alloc.cc
- Lines: 1-968
Signature
// Custom allocator callbacks
TRITONSERVER_Error* ResponseAlloc(
    TRITONSERVER_ResponseAllocator* allocator,
    const char* tensor_name, size_t byte_size,
    TRITONSERVER_MemoryType preferred_memory_type,
    int64_t preferred_memory_type_id,
    void* userp, void** buffer,
    void** buffer_userp,
    TRITONSERVER_MemoryType* actual_memory_type,
    int64_t* actual_memory_type_id);

TRITONSERVER_Error* ResponseRelease(
    TRITONSERVER_ResponseAllocator* allocator,
    void* buffer, void* buffer_userp,
    size_t byte_size,
    TRITONSERVER_MemoryType memory_type,
    int64_t memory_type_id);

int main(int argc, char** argv);
Import
// Standalone executable - no import needed
// Build via CMakeLists.txt target
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| argv[1] | string | Yes | Path to model repository |
| argv[2] | string | Yes | Input memory type (system/pinned/gpu) |
| argv[3] | string | Yes | Output memory type (system/pinned/gpu) |
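One way the memory-type arguments in the table could be mapped to an internal value is sketched here. It assumes the three spellings listed above (`system`, `pinned`, `gpu`) and exact, case-sensitive matching. `MemType` and `ParseMemType` are hypothetical names for illustration and are not taken from memory_alloc.cc.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum mirroring the three spellings in the Inputs table.
enum class MemType { kSystem, kPinned, kGpu };

// Maps a command-line spelling to a MemType. Returns false (and leaves
// *out untouched) for any unrecognized string, so the caller can print
// a usage message and exit with a non-zero code.
bool ParseMemType(const std::string& arg, MemType* out)
{
  assert(out != nullptr);
  if (arg == "system") {
    *out = MemType::kSystem;
  } else if (arg == "pinned") {
    *out = MemType::kPinned;
  } else if (arg == "gpu") {
    *out = MemType::kGpu;
  } else {
    return false;
  }
  return true;
}
```

Rejecting unknown spellings up front keeps a mistyped argument from silently falling back to a default memory type, which would make the test pass without exercising the intended path.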
Outputs
| Name | Type | Description |
|---|---|---|
| exit code | int | 0 on success, non-zero on failure |
| stdout | text | Validation results and error messages |
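The relationship between output validation, stdout messages, and the exit code can be sketched as follows. This is an illustrative sketch under stated assumptions: `OutputsMatch` and `ExitCodeFor` are hypothetical helpers, the `int32_t` element type is assumed, and the outputs are assumed to be in host memory already (a GPU-resident output would first have to be copied to the host).

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Compares host-resident output values against expected values and
// prints one line per problem to stdout. Returns true only when the
// sizes and every element match.
bool OutputsMatch(
    const std::vector<int32_t>& actual, const std::vector<int32_t>& expected)
{
  if (actual.size() != expected.size()) {
    std::cout << "error: expected " << expected.size() << " elements, got "
              << actual.size() << std::endl;
    return false;
  }
  bool ok = true;
  for (size_t i = 0; i < actual.size(); ++i) {
    if (actual[i] != expected[i]) {
      std::cout << "error: element " << i << " expected " << expected[i]
                << ", got " << actual[i] << std::endl;
      ok = false;
    }
  }
  return ok;
}

// Maps the validation result to the process exit code: 0 on success,
// non-zero on failure.
int ExitCodeFor(bool all_passed)
{
  return all_passed ? 0 : 1;
}
```

Checking every element instead of stopping at the first mismatch makes the stdout log more useful when diagnosing a partially corrupted transfer.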
Usage Examples
Running Memory Allocation Test
# Test with GPU input and CPU output
./memory_alloc /path/to/model_repository gpu system
# Test with pinned memory
./memory_alloc /path/to/model_repository pinned pinned