Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Duckdb Duckdb SkipList

From Leeroopedia
Revision as of 14:51, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Duckdb_Duckdb_SkipList.md)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)


Knowledge Sources
Domains Data_Structures, Third_Party
Last Updated 2026-02-07 12:00 GMT

Overview

SkipList is a generic ordered data structure by Paul Ross that maintains elements in sorted order with O(log n) insert, remove, and indexed access, used by DuckDB to implement efficient rolling median and other order-statistic operations.

Description

The implementation resides in the OrderedStructs::SkipList namespace and consists of several template classes:

  • HeadNode<T, _Compare> -- The main skip list container and entry point. It manages a probabilistic multi-level linked list where level 0 contains all nodes, and each higher level ideally spans every 2^n nodes. The HeadNode provides:
    • insert(const T &value) -- Insert a value, maintaining sorted order
    • remove(const T &value) -- Remove and return a value
    • has(const T &value) -- Check for existence
    • at(size_t index) -- Indexed access in O(log n) time via width-based traversal
    • index(const T &value) -- Find the index of a value
    • size() -- Return the number of elements
  • Node<T, _Compare> -- Represents an individual element in the skip list. Each node has a value and a SwappableNodeRefStack of forward references to other nodes at various levels.
  • SwappableNodeRefStack<T, _Compare> / NodeRef<T, _Compare> -- Manages the per-node stack of forward references. Each NodeRef holds a pointer to the next node at that level plus a width (the number of level-0 nodes spanned), enabling efficient O(log n) indexed access by accumulating widths during traversal.
  • RollingMedian -- A utility built on top of the skip list that computes rolling medians over a sliding window. It provides dest_size() and dest_count() helpers for sizing the output buffer and error-checked computation via RollingMedianResult codes.

Node heights are determined probabilistically at insertion time by a virtual coin toss, creating the multi-level structure without requiring explicit rebalancing.

Usage

DuckDB uses the SkipList primarily for window function computations that require order-statistic queries, particularly the MEDIAN window function and other quantile-based operations over sliding windows. The O(log n) indexed access makes it efficient for repeatedly finding the middle element as the window slides.

Code Reference

Source Location

Signature

namespace OrderedStructs {
namespace SkipList {

template <typename T, typename _Compare = std::less<T>>
class HeadNode {
public:
    HeadNode(_Compare cmp = _Compare());
    virtual ~HeadNode();

    // Lookup
    bool has(const T &value) const;
    const T &at(size_t index) const;
    void at(size_t index, size_t count, std::vector<T> &dest) const;
    size_t index(const T &value) const;
    size_t size() const;

    // Modification
    void insert(const T &value);
    T remove(const T &value);

    // Inspection
    size_t height() const;
    size_t height(size_t idx) const;
    size_t width(size_t idx, size_t level) const;
    IntegrityCheck lacksIntegrity() const;
    size_t size_of() const;
};

} // namespace SkipList

namespace RollingMedian {

enum RollingMedianResult {
    ROLLING_MEDIAN_SUCCESS = 0,
    ROLLING_MEDIAN_SOURCE_STRIDE,
    ROLLING_MEDIAN_DESTINATION_STRIDE,
    ROLLING_MEDIAN_WIN_LENGTH,
};

// Helpers
size_t dest_size(size_t count, size_t win_length, size_t dest_stride);
size_t dest_count(size_t count, size_t win_length);

} // namespace RollingMedian
} // namespace OrderedStructs

Import

#include "skiplist/SkipList.h"

I/O Contract

Inputs

Name Type Required Description
value const T& Yes (insert/remove/has) The value to insert, remove, or search for
index size_t Yes (at) Zero-based index for positional access into the sorted sequence
count size_t Yes (at range) Number of consecutive values to retrieve starting from index
cmp _Compare No Comparison function for ordering; defaults to std::less<T>

Outputs

Name Type Description
return (has) bool True if the value exists in the skip list
return (at) const T& The value at the given sorted index; throws IndexError if out of range
return (remove) T The removed value; throws ValueError if not present
return (index) size_t Index of the first occurrence; throws ValueError if not found
return (size) size_t Number of elements currently in the skip list

Usage Examples

#include "skiplist/SkipList.h"

// Create a skip list of doubles
OrderedStructs::SkipList::HeadNode<double> sl;

// Insert values
sl.insert(42.0);
sl.insert(21.0);
sl.insert(84.0);
sl.insert(63.0);

// Size and indexed access
size_t n = sl.size();            // 4
double median = sl.at(n / 2);   // O(log n) access to the median element

// Check existence and find index
bool found = sl.has(42.0);      // true
size_t idx = sl.index(42.0);    // 1 (sorted position)

// Remove a value
double removed = sl.remove(21.0);

// Rolling median example: maintain a sliding window
OrderedStructs::SkipList::HeadNode<double> window;
// Add elements to the window
window.insert(5.0);
window.insert(3.0);
window.insert(8.0);
// Get median
double med = window.at(window.size() / 2);
// Slide: remove oldest, add newest
window.remove(5.0);
window.insert(1.0);

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment