Implementation:Vllm project Vllm Scalar Type
| Knowledge Sources | |
|---|---|
| Domains | Quantization, Type_System, Core |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Defines the ScalarType class for representing a wide range of floating point and integer types, including sub-byte data types not supported by torch.dtype.
Description
The vllm::ScalarType class models numeric types by their exponent bits, mantissa bits, signedness, bias, and NaN representation. It provides static factory methods (int_, uint, float_IEEE754, float_) for creating various numeric formats and supports compile-time unique ID generation for C++17 template specialization. The header also defines a comprehensive set of predefined scalar type constants covering standard types (Half, BFloat16), quantized integer types (int4, uint4, uint4b8, uint8b128), and exotic floating point formats (FP4 e2m1, FP6 e3m2, FP8 e4m3fn, FP8 e5m2).
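The idea of describing a numeric type purely by its bit fields can be illustrated with a small self-contained sketch. `MiniScalarType`, `make_int`, `make_uint`, and `make_float` are hypothetical names used only for illustration; they are not part of the vLLM header. The sketch assumes the storage width is the sum of exponent bits, mantissa bits, and an optional sign bit.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for vllm::ScalarType (illustration only).
struct MiniScalarType {
  std::uint8_t exponent;  // exponent bits (0 for integer types)
  std::uint8_t mantissa;  // mantissa bits (integer bits excluding the sign)
  bool is_signed;         // whether a sign bit is present
  std::int32_t bias;      // stored value = real value + bias

  // Assumed width rule: exponent + mantissa + optional sign bit.
  constexpr std::int64_t size_bits() const {
    return exponent + mantissa + (is_signed ? 1 : 0);
  }
  constexpr bool is_integer() const { return exponent == 0; }
  constexpr bool is_floating_point() const { return exponent > 0; }
};

// Hypothetical factories loosely mirroring int_, uint, and float_IEEE754.
constexpr MiniScalarType make_int(std::uint8_t size_bits, std::int32_t bias = 0) {
  return {0, static_cast<std::uint8_t>(size_bits - 1), true, bias};
}
constexpr MiniScalarType make_uint(std::uint8_t size_bits, std::int32_t bias = 0) {
  return {0, size_bits, false, bias};
}
constexpr MiniScalarType make_float(std::uint8_t exponent, std::uint8_t mantissa) {
  return {exponent, mantissa, true, 0};
}

// Sub-byte and 8-bit formats are described by the same three fields.
static_assert(make_int(4).size_bits() == 4, "int4 is 4 bits wide");
static_assert(make_uint(4, 8).size_bits() == 4, "uint4b8 is 4 bits wide");
static_assert(make_float(4, 3).size_bits() == 8, "FP8 e4m3 is 8 bits wide");
static_assert(make_float(2, 1).size_bits() == 4, "FP4 e2m1 is 4 bits wide");
```

Because everything is `constexpr`, the width checks are evaluated at compile time, which is the property that makes such a descriptor usable in kernel dispatch code.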
Usage
This header is included throughout the vLLM C++ codebase wherever quantized or custom numeric types are needed. It is the C++ counterpart of vllm/scalar_type.py and is used by quantization kernels, type dispatch logic, and model weight loading code.
Code Reference
Source Location
- Repository: vllm
- File: csrc/core/scalar_type.hpp
- Lines: 1-352
Signature
namespace vllm {

class ScalarType {
 public:
  enum NanRepr : uint8_t { NAN_NONE, NAN_IEEE_754, NAN_EXTD_RANGE_MAX_MIN };

  // Integer ID type used to pass a ScalarType as a template parameter
  using Id = int64_t;

  constexpr ScalarType(uint8_t exponent, uint8_t mantissa, bool signed_,
                       int32_t bias, bool finite_values_only = false,
                       NanRepr nan_repr = NAN_IEEE_754);

  // Factory methods
  static constexpr ScalarType int_(uint8_t size_bits, int32_t bias = 0);
  static constexpr ScalarType uint(uint8_t size_bits, int32_t bias = 0);
  static constexpr ScalarType float_IEEE754(uint8_t exponent, uint8_t mantissa);
  static constexpr ScalarType float_(uint8_t exponent, uint8_t mantissa,
                                     bool finite_values_only, NanRepr nan_repr);

  // Conversion to and from the compile-time ID
  constexpr Id id() const;
  static constexpr ScalarType from_id(Id id);

  // Queries
  constexpr int64_t size_bits() const;
  constexpr bool is_signed() const;
  constexpr bool is_integer() const;
  constexpr bool is_floating_point() const;
  constexpr bool is_ieee_754() const;
  constexpr std::variant<int64_t, double> max() const;
  constexpr std::variant<int64_t, double> min() const;
  std::string str() const;
};

}  // namespace vllm
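As a rough illustration of what a `max()` query has to compute for floating point formats, the sketch below derives the largest finite value from the exponent and mantissa widths. It is not the header's implementation: `NanKind`, `pow2`, and `float_max` are hypothetical names, and the sketch assumes the conventional exponent bias of 2^(e-1) - 1, at least one mantissa bit, and that only the extended-range NaN encoding removes a mantissa code from the top exponent.

```cpp
// Hypothetical stand-ins; names are illustrative, not the header's API.
enum class NanKind { None, Ieee754, ExtdRangeMaxMin };

// 2^n as a double, usable in constant expressions.
constexpr double pow2(int n) {
  return n == 0 ? 1.0 : (n > 0 ? 2.0 * pow2(n - 1) : 0.5 * pow2(n + 1));
}

constexpr double float_max(int exponent, int mantissa, bool finite_values_only,
                           NanKind nan) {
  // Assumed conventional exponent bias: 2^(e-1) - 1.
  const int exp_bias = (1 << (exponent - 1)) - 1;
  // IEEE-style formats reserve the all-ones exponent code for inf/NaN;
  // finite-only formats use it for ordinary values.
  const int max_exp_code = (1 << exponent) - (finite_values_only ? 1 : 2);
  // If NaN occupies the all-ones mantissa at the top exponent, the largest
  // usable mantissa code is one lower.
  const bool nan_takes_top =
      finite_values_only && nan == NanKind::ExtdRangeMaxMin;
  const int max_mant_code = (1 << mantissa) - (nan_takes_top ? 2 : 1);
  return (1.0 + max_mant_code * pow2(-mantissa)) *
         pow2(max_exp_code - exp_bias);
}

static_assert(float_max(5, 2, false, NanKind::Ieee754) == 57344.0,
              "FP8 e5m2 max");
static_assert(float_max(4, 3, true, NanKind::ExtdRangeMaxMin) == 448.0,
              "FP8 e4m3fn max");
static_assert(float_max(2, 1, true, NanKind::None) == 6.0, "FP4 e2m1 max");
```

The three checks match the commonly cited maxima for these formats; the sign-symmetric minimum would simply be the negated value under the same assumptions.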
Import
#include "core/scalar_type.hpp"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| exponent | uint8_t | Yes | Number of exponent bits (0 for integer types) |
| mantissa | uint8_t | Yes | Number of mantissa bits (for integer types, the number of bits excluding the sign bit) |
| signed_ | bool | Yes | Whether the type supports negative numbers |
| bias | int32_t | Constructor: Yes; int_/uint factories: No (default 0) | Stored value offset (value = stored - bias); used for quantized types |
| finite_values_only | bool | No | If true, no +/-inf values exist (default: false) |
| nan_repr | NanRepr | No | How NaNs are encoded: NAN_NONE, NAN_IEEE_754, or NAN_EXTD_RANGE_MAX_MIN (default: NAN_IEEE_754) |
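The bias relation in the table (value = stored - bias) is easiest to see with concrete numbers. The helpers below are hypothetical and exist only to illustrate the arithmetic; they are not functions from the vLLM header.

```cpp
#include <cstdint>

// Hypothetical helpers illustrating value = stored - bias.
constexpr std::int32_t decode_biased(std::uint32_t stored, std::int32_t bias) {
  return static_cast<std::int32_t>(stored) - bias;
}
constexpr std::uint32_t encode_biased(std::int32_t value, std::int32_t bias) {
  return static_cast<std::uint32_t>(value + bias);
}

// A 4-bit unsigned field with bias 8 (the uint4b8 layout) covers [-8, 7].
static_assert(decode_biased(0, 8) == -8, "smallest uint4b8 value");
static_assert(decode_biased(15, 8) == 7, "largest uint4b8 value");
static_assert(encode_biased(-3, 8) == 5, "-3 is stored as 5");

// An 8-bit unsigned field with bias 128 (uint8b128) covers [-128, 127].
static_assert(decode_biased(0, 128) == -128, "smallest uint8b128 value");
static_assert(decode_biased(255, 128) == 127, "largest uint8b128 value");
```

Storing a signed range in an unsigned field this way lets a kernel unpack values with unsigned bit operations and apply a single subtraction afterwards.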
Outputs
| Name | Type | Description |
|---|---|---|
| ScalarType instance | ScalarType | Immutable object representing the numeric type with query methods for size, range, and properties |
| id | int64_t | Unique compile-time identifier for template specialization |
| str | std::string | Human-readable string representation following ml_dtypes naming conventions |
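An integer ID matters because C++17 does not accept class-type values as non-type template parameters, whereas a 64-bit integer works. The sketch below shows the general pattern of packing type fields into an integer and specializing a template on it. The bit layout, `pack_id`, `unpack_exponent`, `unpack_mantissa`, and `KernelTraits` are all assumptions made for illustration; the real header defines its own packing.

```cpp
#include <cstdint>

using TypeId = std::int64_t;  // hypothetical alias for illustration

// Assumed layout: bits 0-7 exponent, 8-15 mantissa, 16 signed, 17-48 bias.
constexpr TypeId pack_id(std::uint8_t exponent, std::uint8_t mantissa,
                         bool is_signed, std::int32_t bias) {
  return static_cast<TypeId>(exponent) |
         (static_cast<TypeId>(mantissa) << 8) |
         (static_cast<TypeId>(is_signed ? 1 : 0) << 16) |
         (static_cast<TypeId>(static_cast<std::uint32_t>(bias)) << 17);
}

// Inverse of the packing for two of the fields.
constexpr std::uint8_t unpack_exponent(TypeId id) {
  return static_cast<std::uint8_t>(id & 0xFF);
}
constexpr std::uint8_t unpack_mantissa(TypeId id) {
  return static_cast<std::uint8_t>((id >> 8) & 0xFF);
}

constexpr TypeId kFp8E4M3Id = pack_id(4, 3, true, 0);
constexpr TypeId kUint4b8Id = pack_id(0, 4, false, 8);

// Primary template for unknown IDs, then one specialization per known type.
template <TypeId Id>
struct KernelTraits {
  static constexpr int bits_per_element = 0;
};
template <>
struct KernelTraits<kFp8E4M3Id> {
  static constexpr int bits_per_element = 8;
};
template <>
struct KernelTraits<kUint4b8Id> {
  static constexpr int bits_per_element = 4;
};

static_assert(kFp8E4M3Id != kUint4b8Id, "distinct types get distinct IDs");
static_assert(KernelTraits<kFp8E4M3Id>::bits_per_element == 8, "FP8 traits");
static_assert(KernelTraits<kUint4b8Id>::bits_per_element == 4, "uint4b8 traits");
```

The unpack helpers play the role that `from_id` plays in the signature: recovering the type description from the integer inside the template.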
Usage Examples
#include <string>
#include "core/scalar_type.hpp"
// Use predefined types
auto fp8_type = vllm::kFloat8_e4m3fn;
auto int4_type = vllm::kInt4;
// Query type properties
auto bits = fp8_type.size_bits(); // 8 (size_bits() returns int64_t)
bool is_fp = fp8_type.is_floating_point(); // true
std::string name = fp8_type.str(); // "float8_e4m3fn"
// Create custom quantized integer type
auto custom_uint = vllm::ScalarType::uint(4, 8); // uint4 with bias=8
// Use id for template specialization
constexpr auto type_id = vllm::kFloat8_e4m3fn.id();