Implementation:InternLM Lmdeploy Buffer
| Knowledge Sources | |
|---|---|
| Domains | Memory_Management, Core_Infrastructure |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
Provides a reference-counted, type-aware flat memory buffer abstraction with support for shared ownership, slicing, type-view conversion, and CUDA memory copy/clear operations.
Description
The Buffer class is a 1D memory container that pairs a shared_ptr<void> data pointer with metadata: element count (size_), base offset (base_), data type (dtype_), and device location (device_). It supports multiple construction modes: empty, typed-reference (non-owning), shared-ownership, and allocator-backed.
Three methods derive a new Buffer from an existing one. view(dtype) reinterprets the buffer as a different data type, adjusting the element count and base offset accordingly. slice(base, size) creates a sub-buffer sharing the same underlying memory. borrow() creates a non-owning reference.
The typed subclass Buffer_<T> provides compile-time type safety with begin()/end() iterators, operator[], and bounds-checked at(). The free functions Copy() and Clear() perform cudaMemcpyAsync and cudaMemsetAsync operations, respectively. Serialization is supported through the save/load template functions.
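The bookkeeping behind slicing and type-view conversion can be illustrated with a small host-only model. This is a sketch under stated assumptions, not the TurboMind implementation: MiniBuffer, DType, byte_width, and make_buffer are hypothetical names, the base offset is assumed to be counted in elements of the current dtype, and the divisibility checks in view() are this sketch's own choice.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for DataType; only the byte width matters here.
enum class DType { kUint8, kFloat16, kFloat32 };

inline std::ptrdiff_t byte_width(DType t) {
    switch (t) {
        case DType::kUint8:   return 1;
        case DType::kFloat16: return 2;
        case DType::kFloat32: return 4;
    }
    return 0;
}

// Toy model of the metadata described above (NOT the real Buffer class).
struct MiniBuffer {
    std::shared_ptr<void> data;  // shared ownership of the allocation
    std::ptrdiff_t base;         // offset, in elements of `dtype` (assumption)
    std::ptrdiff_t size;         // element count
    DType dtype;

    std::ptrdiff_t byte_size() const { return size * byte_width(dtype); }

    // Address of the first element, after applying the base offset.
    void* raw_data() const {
        return static_cast<char*>(data.get()) + base * byte_width(dtype);
    }

    // Sub-range that shares the same allocation; no bytes are copied.
    MiniBuffer slice(std::ptrdiff_t offset, std::ptrdiff_t count) const {
        assert(offset >= 0 && count >= 0 && offset + count <= size);
        return MiniBuffer{data, base + offset, count, dtype};
    }

    // Reinterpret the same bytes as another dtype, rescaling base and size.
    MiniBuffer view(DType to) const {
        const std::ptrdiff_t from_w = byte_width(dtype);
        const std::ptrdiff_t to_w   = byte_width(to);
        assert((size * from_w) % to_w == 0 && (base * from_w) % to_w == 0);
        return MiniBuffer{data, base * from_w / to_w, size * from_w / to_w, to};
    }
};

// Allocate zero-initialized host memory owned by the returned buffer.
inline MiniBuffer make_buffer(std::ptrdiff_t size, DType dtype) {
    const std::size_t bytes = static_cast<std::size_t>(size * byte_width(dtype));
    std::shared_ptr<void> mem(new unsigned char[bytes](),
                              [](void* p) { delete[] static_cast<unsigned char*>(p); });
    return MiniBuffer{std::move(mem), 0, size, dtype};
}
```

In this model, slicing and viewing never change the starting address beyond the requested offset, and every derived buffer keeps the allocation alive through the shared pointer.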
Usage
Used as the primary flat memory abstraction in TurboMind. Buffers are the storage backing for Tensor objects and are used directly for 1D data such as token IDs, attention masks, and intermediate results.
Code Reference
Source Location
- Repository: InternLM_Lmdeploy
- File (header): src/turbomind/core/buffer.h
- File (impl): src/turbomind/core/buffer.cc
- Lines: buffer.h 1-390, buffer.cc 1-98
Signature
namespace turbomind::core {
class Buffer {
public:
Buffer();
explicit Buffer(DataType dtype);
template<class T> Buffer(T* data, ssize_t size, Device device);
Buffer(void* data, ssize_t size, DataType dtype, Device device);
Buffer(shared_ptr<void> data, ssize_t size, DataType dtype, Device device);
Buffer(ssize_t size, DataType dtype, Allocator& alloc);
Buffer(ssize_t size, DataType dtype, Device device);
template<class T> T* data();
template<class T> const T* data() const;
void* raw_data(ssize_t offset = 0);
DataType dtype() const;
Device device() const;
ssize_t size() const;
ssize_t byte_size() const;
explicit operator bool() const noexcept;
Buffer view(DataType dtype) const;
Buffer slice(ssize_t base, ssize_t size) const;
Buffer borrow() const;
};
template<class T>
struct Buffer_ : public Buffer { /* typed wrapper with iterators */ };
void Copy(const Buffer& a, ssize_t n, Ref<Buffer> b_, const Stream& stream);
void Copy(const Buffer& a, Ref<Buffer> b_);
void Clear(Ref<Buffer> b_, const Stream& stream);
void Clear(Ref<Buffer> b_);
Buffer empty_like(const Buffer& buffer);
Buffer empty_like(const Buffer& buffer, Device device);
Buffer empty_like(const Buffer& buffer, DataType dtype);
} // namespace turbomind::core
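The typed-wrapper surface summarized in the signature (iterators, operator[], and bounds-checked at()) can be sketched over a plain host array. TypedSpan is a hypothetical name, and throwing std::out_of_range from at() is an assumption of this sketch; how Buffer_<T>::at() reports a failed bounds check is not stated here.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical typed view over a flat array. It illustrates the kind of
// interface a typed buffer wrapper exposes; it is NOT the real Buffer_<T>.
template<class T>
class TypedSpan {
public:
    TypedSpan(T* data, std::ptrdiff_t size): data_(data), size_(size) {}

    // Iterator pair so the span works with range-based for loops.
    T* begin() const { return data_; }
    T* end() const { return data_ + size_; }

    std::ptrdiff_t size() const { return size_; }

    // Unchecked access: the caller guarantees 0 <= i < size().
    T& operator[](std::ptrdiff_t i) const { return data_[i]; }

    // Checked access: rejects out-of-range indices instead of reading them.
    T& at(std::ptrdiff_t i) const {
        if (i < 0 || i >= size_) {
            throw std::out_of_range("TypedSpan::at: index out of range");
        }
        return data_[i];
    }

private:
    T*             data_;
    std::ptrdiff_t size_;
};
```

Element access like this is only meaningful for memory the host can dereference; pointers to device memory must not be indexed from host code.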
Import
#include "src/turbomind/core/buffer.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| size | ssize_t | Yes | Number of elements in the buffer |
| dtype | DataType | Yes | Element data type |
| device | Device | Conditional | Device location (CPU, pinned, CUDA) |
| alloc | Allocator& | Conditional | Allocator to use for memory; alternative to device |
| data | void* or T* | Conditional | Pre-existing data pointer for reference-mode construction |
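The difference between reference-mode (non-owning) and owning construction can be expressed with std::shared_ptr<void> alone. The sketch below shows one standard-library way to do it; non_owning and owning_floats are hypothetical helpers, and whether TurboMind builds its non-owning pointers this way is an assumption.

```cpp
#include <cstddef>
#include <memory>

// Wrap a caller-owned pointer without taking ownership (hypothetical helper).
// The aliasing constructor with an empty owner stores `p` but manages no
// object, so no deleter ever runs on `p`.
inline std::shared_ptr<void> non_owning(void* p) {
    return std::shared_ptr<void>(std::shared_ptr<void>{}, p);
}

// Take ownership of a zero-initialized heap array (hypothetical helper).
// The custom deleter frees the array when the last holder is destroyed.
inline std::shared_ptr<void> owning_floats(std::size_t n) {
    return std::shared_ptr<void>(new float[n](),
                                 [](void* q) { delete[] static_cast<float*>(q); });
}
```

With the non-owning form, the caller must keep the underlying memory alive for as long as any buffer refers to it.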
Outputs
| Name | Type | Description |
|---|---|---|
| data<T>() / raw_data() | T* / void* | Typed or untyped pointer to the buffer's data |
| size() | ssize_t | Number of elements |
| byte_size() | ssize_t | Total bytes occupied |
| view() | Buffer | A reinterpreted-type view of the same memory |
| slice() | Buffer | A sub-range of the same memory |
Usage Examples
#include "src/turbomind/core/buffer.h"
using namespace turbomind::core;
// Allocate a device buffer of 1024 float32 elements
Buffer buf(1024, kFloat32, kDEVICE);
// Access raw data pointer
void* ptr = buf.raw_data();
// Create a typed buffer
Buffer_<float> typed_buf(1024, kDEVICE);
float* fptr = typed_buf.data();
// Slice the first 256 elements
Buffer sub = buf.slice(0, 256);
// Copy between buffers
Buffer dst(1024, kFloat32, kDEVICE);
Copy(buf, dst);
// Reinterpret the same bytes as float16; the element count doubles to 2048
Buffer half_view = buf.view(kFloat16);