Implementation: Tencent ncnn Vulkan Option and Allocator
| Knowledge Sources | |
|---|---|
| Domains | GPU_Computing, Memory_Management |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool for configuring Vulkan GPU inference options and memory allocators provided by the ncnn library.
Description
The ncnn::Option class contains Vulkan-specific configuration flags, accessed via net.opt. The key flag is use_vulkan_compute, which enables GPU inference. Additional flags control precision (use_fp16_packed, use_fp16_storage, use_fp16_arithmetic) and compute features (use_shader_local_memory, use_cooperative_matrix), while pointer members (blob_vkallocator, staging_vkallocator) select custom memory allocators.
VkBlobAllocator pools GPU memory in large blocks (default 16MB) for intermediate tensor storage. VkStagingAllocator manages host-visible memory for CPU↔GPU data transfer. VkWeightAllocator pools memory for model weights with smaller blocks (default 8MB).
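The block-pooling idea behind VkBlobAllocator can be sketched in isolation. The following is a simplified, hypothetical standalone model (it is not ncnn's actual implementation; PooledAllocator and its members are invented names): freed buffers go to a free list and are recycled for later requests instead of hitting the driver again.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical, simplified sketch of a block-pooling allocator in the
// spirit of VkBlobAllocator: freed buffers are kept in a free list and
// reused when a later request fits, avoiding repeated driver allocations.
class PooledAllocator {
public:
    explicit PooledAllocator(size_t preferred_block_size = 16 * 1024 * 1024)
        : preferred_block_size_(preferred_block_size), driver_allocs_(0) {}

    // Returns a buffer of at least `size` bytes, preferring a recycled one.
    std::vector<unsigned char>* fastMalloc(size_t size) {
        for (auto it = free_list_.begin(); it != free_list_.end(); ++it) {
            if ((*it)->size() >= size) {   // first fit from the pool
                std::vector<unsigned char>* buf = *it;
                free_list_.erase(it);
                return buf;
            }
        }
        // No reusable block: "allocate from the driver" (here: plain new),
        // rounding small requests up to the preferred block size.
        ++driver_allocs_;
        size_t block = size > preferred_block_size_ ? size : preferred_block_size_;
        return new std::vector<unsigned char>(block);
    }

    // Returns the buffer to the pool instead of freeing it.
    void fastFree(std::vector<unsigned char>* buf) { free_list_.push_back(buf); }

    // Releases all pooled buffers (analogous to VkBlobAllocator::clear()).
    void clear() {
        for (auto* buf : free_list_) delete buf;
        free_list_.clear();
    }

    int driver_allocs() const { return driver_allocs_; }

private:
    size_t preferred_block_size_;
    int driver_allocs_;
    std::list<std::vector<unsigned char>*> free_list_;
};
```

With this scheme, a second allocation of similar size after a free costs no new driver allocation, which is exactly why per-inference intermediate tensors benefit from pooling.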
Usage
Set net.opt.use_vulkan_compute = true on the Net object before loading the model; option changes made after load_param()/load_model() have no effect on the constructed pipeline. Optionally create and configure custom allocators for fine-grained memory control.
Code Reference
Source Location
- Repository: ncnn
- File: src/option.h (Option class), src/allocator.h (VkAllocator classes)
- Lines: option.h:L17-155 (Option class, use_vulkan_compute at L84), allocator.h:L263-295 (VkAllocator base), allocator.h:L298-319 (VkBlobAllocator), allocator.h:L372-397 (VkStagingAllocator)
Signature
namespace ncnn {

class Option
{
public:
    Option();

    // Enable Vulkan GPU compute
    bool use_vulkan_compute;

    // Precision options for GPU
    bool use_fp16_packed;     // fp16 packed storage
    bool use_fp16_storage;    // fp16 weight storage
    bool use_fp16_arithmetic; // fp16 compute operations

    // GPU compute optimizations
    bool use_shader_local_memory; // local memory optimization
    bool use_cooperative_matrix;  // tensor core usage

    // Vulkan memory allocators
    VkAllocator* blob_vkallocator;      // GPU blob memory
    VkAllocator* workspace_vkallocator; // GPU workspace memory
    VkAllocator* staging_vkallocator;   // CPU-GPU staging memory

    // Pipeline cache
    PipelineCache* pipeline_cache;
};

class VkBlobAllocator : public VkAllocator
{
public:
    explicit VkBlobAllocator(const VulkanDevice* vkdev,
                             size_t preferred_block_size = 16 * 1024 * 1024);

    virtual void clear();
    virtual VkBufferMemory* fastMalloc(size_t size);
    virtual void fastFree(VkBufferMemory* ptr);
};

class VkStagingAllocator : public VkAllocator
{
public:
    explicit VkStagingAllocator(const VulkanDevice* vkdev);

    void set_size_compare_ratio(float scr);

    virtual void clear();
    virtual VkBufferMemory* fastMalloc(size_t size);
    virtual void fastFree(VkBufferMemory* ptr);
};

} // namespace ncnn
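The set_size_compare_ratio knob in the signature above governs when a recycled staging buffer may be reused. A simplified standalone sketch of the rule (hypothetical names, not ncnn's code): a free buffer of capacity C serves a request of size S only when S <= C and S >= C * ratio, so a tiny request does not pin a huge recycled buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical sketch of a staging-style pool with a size-compare ratio:
// reuse a free buffer only when the request is close enough to its capacity.
struct Buffer { size_t capacity; };

class StagingPool {
public:
    explicit StagingPool(float size_compare_ratio = 0.75f)
        : ratio_(size_compare_ratio) {}

    void set_size_compare_ratio(float scr) { ratio_ = scr; }

    Buffer* fastMalloc(size_t size) {
        for (auto it = free_list_.begin(); it != free_list_.end(); ++it) {
            size_t cap = (*it)->capacity;
            // Reuse only when the request fits AND is not far smaller
            // than the buffer (size >= cap * ratio).
            if (size <= cap && size >= (size_t)(cap * ratio_)) {
                Buffer* b = *it;
                free_list_.erase(it);
                return b;
            }
        }
        return new Buffer{size}; // no suitable recycled buffer
    }

    void fastFree(Buffer* b) { free_list_.push_back(b); }

    void clear() {
        for (auto* b : free_list_) delete b;
        free_list_.clear();
    }

private:
    float ratio_;
    std::list<Buffer*> free_list_;
};
```

A higher ratio keeps memory usage tight at the cost of more fresh allocations; a lower ratio recycles more aggressively.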
Import
#include "net.h" // Option is included via net.h
#include "gpu.h" // VulkanDevice
#include "allocator.h" // VkBlobAllocator, VkStagingAllocator
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| use_vulkan_compute | bool | Yes | Enable GPU inference (set to true) |
| use_fp16_packed | bool | No | Enable fp16 packed element storage |
| use_fp16_storage | bool | No | Enable fp16 weight storage |
| use_fp16_arithmetic | bool | No | Enable fp16 compute |
| blob_vkallocator | VkAllocator* | No | Custom GPU memory allocator |
| staging_vkallocator | VkAllocator* | No | Custom staging allocator |
Outputs
| Name | Type | Description |
|---|---|---|
| net.opt | ncnn::Option | Configured option object controlling GPU inference behavior |
Usage Examples
Basic Vulkan Configuration
#include "net.h"
#include "gpu.h"

// Initialize the Vulkan instance once per process
ncnn::create_gpu_instance();

ncnn::Net net;
net.opt.use_vulkan_compute = true; // must be set before load_param()
net.set_vulkan_device(0);          // select GPU by index

// Load model (options must be set before this)
net.load_param("model.param");
net.load_model("model.bin");
Advanced Configuration with Custom Allocators
#include "net.h"
#include "gpu.h"
#include "allocator.h"

ncnn::create_gpu_instance();
const ncnn::VulkanDevice* vkdev = ncnn::get_gpu_device(0);

// Create custom allocators; they must outlive the Net that uses them
ncnn::VkBlobAllocator blob_alloc(vkdev, 32 * 1024 * 1024); // 32MB blocks
ncnn::VkStagingAllocator staging_alloc(vkdev);

ncnn::Net net;
net.opt.use_vulkan_compute = true;
net.opt.use_fp16_storage = true;    // fp16 weight storage
net.opt.use_fp16_arithmetic = true; // fp16 compute where supported
net.opt.blob_vkallocator = &blob_alloc;
net.opt.staging_vkallocator = &staging_alloc;
net.set_vulkan_device(vkdev);

net.load_param("model.param");
net.load_model("model.bin");

// Cleanup: release the net before destroying the GPU instance
net.clear();
ncnn::destroy_gpu_instance();