Implementation:LMCache LMCache Mem Alloc
| Knowledge Sources | |
|---|---|
| Domains | Memory Management, CUDA, NUMA |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Provides low-level C++ functions for allocating and freeing CUDA-pinned and NUMA-aware host memory.
Description
This module implements the memory allocation routines that LMCache uses for high-performance data transfers between host and GPU. It supports three allocation strategies: CUDA pinned memory via cudaHostAlloc; NUMA-bound memory via mmap followed by mbind; and a combined pinned-NUMA allocation that maps memory on a specific NUMA node and then registers it with CUDA for DMA access. Each allocation function returns the buffer address as a uintptr_t integer rather than a typed pointer. Each has a matching free function that performs the corresponding cleanup, including CUDA unregistration and munmap where applicable.
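The NUMA-bound strategy can be sketched as follows. This is a minimal, Linux-only illustration under stated assumptions, not the code in csrc/mem_alloc.cpp: the names alloc_numa_sketch, free_numa_sketch, and single_node_mask are hypothetical, mbind is invoked through the raw syscall so the sketch does not depend on libnuma's numaif.h, and the node mask is limited to a single machine word.

```cpp
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Value of MPOL_BIND from <linux/mempolicy.h>, repeated here so the sketch
// builds without extra headers.
constexpr unsigned long kMpolBind = 2;
// Number of bits in the one-word node mask handed to the kernel.
constexpr unsigned long kMaskBits = 8 * sizeof(unsigned long);

// Hypothetical helper: a one-word mask with only bit `node` set.
// Conservatively restricted to nodes 0 .. kMaskBits - 2, because Linux
// treats `maxnode` as one more than the number of usable mask bits.
inline unsigned long single_node_mask(int node) {
  if (node < 0 || static_cast<unsigned long>(node) >= kMaskBits - 1) {
    throw std::invalid_argument("NUMA node out of range for one-word mask");
  }
  return 1UL << node;
}

// Hypothetical stand-in for alloc_numa_ptr: map anonymous memory, then bind
// the mapping to `node` before any page is touched.
inline std::uintptr_t alloc_numa_sketch(std::size_t size, int node) {
  const unsigned long mask = single_node_mask(node);
  void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) {
    throw std::runtime_error(std::string("mmap failed: ") +
                             std::strerror(errno));
  }
  // Raw mbind(2) arguments: addr, len, mode, nodemask, maxnode, flags.
  const long rc = syscall(SYS_mbind, p, static_cast<unsigned long>(size),
                          kMpolBind, &mask, kMaskBits, 0UL);
  if (rc != 0) {
    const int saved = errno;
    munmap(p, size);  // do not leak the mapping when binding fails
    throw std::runtime_error(std::string("mbind failed: ") +
                             std::strerror(saved));
  }
  return reinterpret_cast<std::uintptr_t>(p);
}

// Hypothetical stand-in for free_numa_ptr: munmap needs the original length.
inline void free_numa_sketch(std::uintptr_t ptr, std::size_t size) {
  if (munmap(reinterpret_cast<void*>(ptr), size) != 0) {
    throw std::runtime_error(std::string("munmap failed: ") +
                             std::strerror(errno));
  }
}
```

A pinned-NUMA variant would additionally register the mapped range with CUDA before returning it and unregister it before munmap; that step is left out here because it requires the CUDA runtime. Note that mbind may be refused in restricted environments such as some containers, in which case this sketch throws.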
Usage
Use these functions when LMCache needs to allocate host memory that participates in GPU DMA transfers, especially in multi-socket NUMA systems where memory locality affects transfer bandwidth. The pinned-NUMA variant is particularly useful for ensuring that host buffers reside on the NUMA node closest to the target GPU.
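Choosing the node for the pinned-NUMA variant is left to the caller. One possible approach on Linux, shown here as a hedged sketch rather than anything LMCache is known to do, is to read the numa_node attribute that sysfs exposes for the GPU's PCI device. The helper names numa_node_path, parse_numa_node, and numa_node_for_pci_device are hypothetical, as is the choice to fall back to a default node when the kernel reports no locality.

```cpp
#include <algorithm>
#include <cctype>
#include <exception>
#include <fstream>
#include <string>

// Hypothetical helper: sysfs path holding the NUMA node of a PCI device.
// `pci_bus_id` looks like "0000:3b:00.0"; sysfs uses lowercase hex digits.
inline std::string numa_node_path(std::string pci_bus_id) {
  std::transform(pci_bus_id.begin(), pci_bus_id.end(), pci_bus_id.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  return "/sys/bus/pci/devices/" + pci_bus_id + "/numa_node";
}

// Hypothetical helper: parse the attribute contents. The kernel reports -1
// when it has no locality information, so fall back to `fallback_node`.
inline int parse_numa_node(const std::string& text, int fallback_node = 0) {
  try {
    const int node = std::stoi(text);
    return node >= 0 ? node : fallback_node;
  } catch (const std::exception&) {
    return fallback_node;  // empty or malformed contents
  }
}

// Hypothetical helper: read and parse the attribute for one device.
inline int numa_node_for_pci_device(const std::string& pci_bus_id,
                                    int fallback_node = 0) {
  std::ifstream in(numa_node_path(pci_bus_id));
  std::string text;
  if (!in || !std::getline(in, text)) {
    return fallback_node;  // attribute missing or unreadable
  }
  return parse_numa_node(text, fallback_node);
}
```

The returned value could then be passed as the node argument of alloc_pinned_numa_ptr. The PCI bus ID itself would have to come from elsewhere, for example from the CUDA runtime's device queries.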
Code Reference
Source Location
- Repository: LMCache
- File: csrc/mem_alloc.cpp
- Lines: 1-96
Signature
uintptr_t alloc_pinned_ptr(size_t size, unsigned int flags);
void free_pinned_ptr(uintptr_t ptr);
uintptr_t alloc_numa_ptr(size_t size, int node);
void free_numa_ptr(uintptr_t ptr, size_t size);
uintptr_t alloc_pinned_numa_ptr(size_t size, int node);
void free_pinned_numa_ptr(uintptr_t ptr, size_t size);
Import
#include "mem_alloc.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
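| size | size_t | Yes | Number of bytes to allocate; free_numa_ptr and free_pinned_numa_ptr take the same value that was passed to the matching alloc call |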
| flags | unsigned int | Yes (alloc_pinned_ptr) | Flags passed to cudaHostAlloc (e.g., cudaHostAllocDefault) |
| node | int | Yes (NUMA variants) | NUMA node index to bind memory to |
| ptr | uintptr_t | Yes (free functions) | Pointer returned by the corresponding alloc function |
Outputs
| Name | Type | Description |
|---|---|---|
| ptr | uintptr_t | Integer representation of the allocated memory pointer (from alloc functions) |
| (void) | void | Free functions return nothing; throw std::runtime_error on failure |
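Because the NUMA free functions need the original size and the contract above allows failures to surface as exceptions, callers may find it convenient to keep the address, the size, and the matching free function together. The guard below is a minimal sketch of that idea; SizedHostBuffer is a hypothetical class and is not part of mem_alloc.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical RAII guard: remembers the size and the free function that
// belong to an address returned by one of the alloc functions.
class SizedHostBuffer {
 public:
  using FreeFn = std::function<void(std::uintptr_t, std::size_t)>;

  SizedHostBuffer(std::uintptr_t ptr, std::size_t size, FreeFn free_fn)
      : ptr_(ptr), size_(size), free_fn_(std::move(free_fn)) {}

  // Non-copyable: exactly one owner may free the buffer.
  SizedHostBuffer(const SizedHostBuffer&) = delete;
  SizedHostBuffer& operator=(const SizedHostBuffer&) = delete;
  SizedHostBuffer& operator=(SizedHostBuffer&&) = delete;

  // Moving transfers ownership and leaves the source empty.
  SizedHostBuffer(SizedHostBuffer&& other)
      : ptr_(std::exchange(other.ptr_, 0)),
        size_(std::exchange(other.size_, 0)),
        free_fn_(std::move(other.free_fn_)) {}

  ~SizedHostBuffer() {
    if (ptr_ == 0 || !free_fn_) return;
    try {
      free_fn_(ptr_, size_);
    } catch (...) {
      // Destructors must not throw; a real wrapper might log the failure.
    }
  }

  std::uintptr_t get() const { return ptr_; }
  std::size_t size() const { return size_; }

 private:
  std::uintptr_t ptr_;
  std::size_t size_;
  FreeFn free_fn_;
};
```

With the real functions, free_pinned_numa_ptr and free_numa_ptr could be passed directly as the free function, while free_pinned_ptr, which takes no size, would need a small lambda adapter that ignores the second argument.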
Usage Examples
// Allocate 1 GiB of CUDA-pinned memory
const size_t pinned_size = size_t{1} << 30;
uintptr_t pinned = alloc_pinned_ptr(pinned_size, cudaHostAllocDefault);
// ... use pinned memory for GPU transfers ...
free_pinned_ptr(pinned);
// Allocate 512 MiB on NUMA node 0 with CUDA pinning
const size_t numa_size = size_t{512} << 20;
uintptr_t numa_pinned = alloc_pinned_numa_ptr(numa_size, 0);
// ... use for DMA transfers to or from the GPU nearest to NUMA node 0 ...
// Free with the same size that was used for the allocation
free_pinned_numa_ptr(numa_pinned, numa_size);
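The examples above treat the returned uintptr_t as opaque. To read or write the buffer from C++, the integer has to be converted back to a typed pointer. The following is a small sketch of one way to do that with basic checks; HostView and view_as are hypothetical names, not part of mem_alloc.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical typed view over a buffer identified by an integer address.
template <typename T>
struct HostView {
  T* data;            // first element
  std::size_t count;  // number of whole T elements that fit in the buffer
};

// Convert an address and byte size into a typed view, rejecting null or
// misaligned addresses instead of silently producing a bad pointer.
template <typename T>
HostView<T> view_as(std::uintptr_t ptr, std::size_t size_bytes) {
  if (ptr == 0) {
    throw std::invalid_argument("null address");
  }
  if (ptr % alignof(T) != 0) {
    throw std::invalid_argument("address is misaligned for T");
  }
  return HostView<T>{reinterpret_cast<T*>(ptr), size_bytes / sizeof(T)};
}
```

For instance, view_as<float>(numa_pinned, 512UL << 20) would yield a float pointer and element count covering the 512 MiB buffer from the second example.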