Implementation:Mlc ai Mlc llm Dynamic Bitset
Overview
The DynamicBitset header at cpp/support/dynamic_bitset.h provides a runtime-sized bitset implementation used primarily in grammar-guided generation within MLC LLM. Unlike std::bitset, whose size must be known at compile time, DynamicBitset accepts its size at construction, making it suitable for scenarios where the number of bits depends on runtime parameters such as vocabulary size.
Purpose
The DynamicBitset class is designed for grammar-guided generation utilities, where it is used to track which tokens are valid at each step of constrained decoding. The class supports two memory management modes:
- Internal buffer mode: When constructed without an external data pointer, it allocates and manages its own
std::vector<uint32_t>buffer. - External buffer mode: When constructed with a pre-allocated
uint32_t*pointer, it uses that external buffer directly without owning it.
Class Interface
Static Utility
static int CalculateBufferSize(int element_size) { return (element_size + 31) / 32; }
Computes the minimum number of uint32_t words required to store element_size bits. This rounds up using integer arithmetic: (element_size + 31) / 32.
Constructors
DynamicBitset(); // Empty bitset, must be assigned before use
DynamicBitset(int size, uint32_t* data = nullptr); // Sized bitset with optional external buffer
The default constructor creates an empty bitset. The parameterized constructor accepts a bit count and an optional external buffer pointer. If data is nullptr, an internal std::vector<uint32_t> is allocated and zero-initialized.
Assignment Operators
The class provides both copy and move assignment operators:
- Copy assignment: Copies the data from the source bitset. If the destination uses an external buffer, it asserts that the destination is large enough to hold the source data (expansion is not permitted for externally-managed memory).
- Move assignment: Transfers ownership of the internal buffer or the external pointer, depending on the source's mode.
DynamicBitset& operator=(const DynamicBitset& other);
DynamicBitset& operator=(DynamicBitset&& other);
Bit Access and Manipulation
| Method | Description |
|---|---|
operator[](int index) |
Returns the boolean value of the bit at position index. Uses bit shifting: (data_[index / 32] >> (index % 32)) & 1.
|
Size() |
Returns the total number of bits in the bitset. |
Set() |
Sets all bits to 1 by filling the buffer with 0xFF bytes.
|
Set(int index, bool value) |
Sets the bit at index to the given value. Uses bitwise OR to set and bitwise AND with complement to clear.
|
Reset() |
Clears all bits to 0 by zeroing the buffer. |
Reset(int index) |
Clears the bit at index (delegates to Set(index, false)).
|
Bitwise Operations
DynamicBitset& operator|=(const DynamicBitset& other);
Performs an element-wise bitwise OR between this and other, storing the result in this. This is useful for combining allowable token sets in grammar-guided generation.
Internal Data Layout
The bitset stores bits in a uint32_t array where bit i occupies position i % 32 in word i / 32. The private members are:
| Member | Type | Description |
|---|---|---|
size_ |
int |
Number of bits in the bitset |
buffer_size_ |
int |
Number of uint32_t words in the buffer
|
data_ |
uint32_t* |
Pointer to the active buffer (internal or external) |
internal_buffer_ |
std::vector<uint32_t> |
Internally-managed storage (empty when external buffer is used) |
is_internal_ |
bool |
Flag indicating whether the buffer is internally managed |
Dependencies
tvm/runtime/logging.h-- ProvidesDCHECKmacros for debug-mode assertions.<cstdint>,<cstring>,<vector>-- Standard library headers for integer types,memset/memcpy, and dynamic array storage.
File Location
- Source file:
cpp/support/dynamic_bitset.h - Namespace:
mlc::llm - Header guard:
MLC_LLM_SUPPORT_DYNAMIC_BITSET_H_