Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Principle:ClickHouse ClickHouse Fast Integer Formatting

From Leeroopedia


Knowledge Sources
Domains Formatting, Performance
Last Updated 2026-02-08 00:00 GMT

Overview

Replacing expensive division operations with fixed-point arithmetic and lookup tables to achieve multi-fold speedups in integer-to-string conversion.

Description

Fast integer formatting exploits the mathematical insight that dividing by powers of 10 can be approximated using fixed-point multiplication by pre-computed constants, followed by bit shifts. The jeaiii algorithm (by James Edward Anhalt III) implements this by processing two digits at a time, using multiplication by constants like `(10 × 2^24 / 1000 + 1)` to extract digit pairs without division. The algorithm also uses a pre-computed table that maps 0-99 to their two-character ASCII representations ("00", "01", ... "99"), enabling direct memory writes instead of character-by-character construction.

The key insight is that division is one of the slowest CPU operations (20-40 cycles on modern CPUs), while multiplication and shifting are fast (1-3 cycles). By carefully choosing fixed-point multipliers, we can replace division with these faster operations while maintaining exact results for decimal conversion.

Usage

Use fast integer formatting when converting many integers to strings, especially in data export, logging, JSON serialization, or display formatting. Choose this over `std::to_string` or `sprintf` for performance-critical paths. Use standard library functions when code simplicity matters more than performance or when formatting is not a bottleneck.

Theoretical Basis

Performance Comparison:

Standard library (`std::to_string`):

  • Uses division and modulo per digit: ~20-40 cycles each
  • For 5-digit number: ~100-200 cycles
  • Includes memory allocation overhead

Jeaiii algorithm:

  • Uses multiplication and shift: ~2-3 cycles per operation
  • Processes two digits at once
  • For 5-digit number: ~20-30 cycles
  • No allocation (writes to provided buffer)

Speedup: 3-10x depending on integer size and CPU architecture

Mathematical Foundation:

To extract digits without division, we use:

`digits = (n × multiplier) >> shift`

Example for dividing by 1000:

  • Exact division: `n / 1000`
  • Fixed-point: `(n × 10 × 2^24 / 1000) >> 24`
  • Since `2^24 / 100 ≈ 167772.16`, we use `1677722` as multiplier
  • Result is within rounding error for all valid inputs

The algorithm cascades these operations to extract digit pairs at different magnitudes (units, hundreds, ten-thousands, etc.).

Lookup Table Strategy:

Pre-computed "digits" table: ``` 00, 01, 02, ..., 09, 10, 11, 12, ..., 19, ... 90, 91, 92, ..., 99 ```

Cost: 200 bytes (100 pairs × 2 chars) Benefit: Direct two-byte memory write vs two branches + two additions

Algorithm Complexity:

  • Traditional: O(log₁₀(n)) divisions
  • Jeaiii: O(log₁₀₀(n)) multiply-shift operations = O(log₁₀(n) / 2)
  • Plus O(1) lookups for each digit pair

Optimization for Small Numbers:

The implementation adds a special case for 1-2 digit numbers:

  • Branch prediction is reliable (many database values are small)
  • Direct handling avoids table lookup overhead
  • Critical for performance on typical database workloads

Wide Integer Handling:

For 128-bit and 256-bit integers:

  • Break into 64-bit chunks
  • Process high-order chunks first
  • Use chunk processing by powers of 10^8 (largest power that fits in 64 bits)
  • Accumulate digit pairs into result buffer

Cache Efficiency:

The digit lookup table (200 bytes) fits in L1 cache. The sequential memory writes for output have good cache behavior. The algorithm has excellent instruction-level parallelism as operations are independent.

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment