Principle:Helicone Helicone Central Registry Indexing

From Leeroopedia
Revision as of 17:22, 16 February 2026 by Admin (Auto-imported from principles/Helicone_Helicone_Central_Registry_Indexing.md)
Knowledge Sources
Domains: Model Registry, Indexing, Performance Optimization
Last Updated: 2026-02-14 00:00 GMT

Overview

Central registry indexing is the technique of pre-computing multiple lookup maps from a flat collection of model-provider configuration records, enabling O(1) access by model name, provider, endpoint ID, or composite key at runtime.

Description

An LLM gateway handles many models across many providers, each with multiple regional deployments. At request time, the gateway needs to answer questions like "which providers serve model X?", "what are the cheapest endpoints for model Y?", "what is the configuration for this specific model:provider:region triple?", or "which models does provider Z offer?". If each question were answered by scanning the full configuration set, every request would pay a cost that grows linearly with the number of configurations, which adds unacceptable latency for a high-throughput proxy.

Central Registry Indexing addresses this by building a set of pre-computed Map objects at startup time. Each Map provides average-case O(1) access for a specific query pattern. The build process iterates over all model-provider configurations once, populating every index in a single pass, and then sorts the endpoint lists by cost. The resulting index set is treated as immutable after construction and can be safely shared across all request handlers.
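One way to keep a shared index set read-only is sketched below. This is a minimal illustration, not the project's actual code: the `EndpointConfig` and `RegistryIndexes` shapes, the `buildReadOnlyIndexes` function, and the sample model and provider names are all assumptions. Mutable maps exist only inside the builder, and callers receive `ReadonlyMap` views.

```typescript
// Hypothetical record shape; field names are illustrative assumptions.
interface EndpointConfig {
  readonly model: string;
  readonly provider: string;
  readonly region: string;
}

// Read-only view of the index set handed to request handlers.
interface RegistryIndexes {
  readonly endpointById: ReadonlyMap<string, EndpointConfig>;
  readonly modelToProviders: ReadonlyMap<string, ReadonlySet<string>>;
}

function buildReadOnlyIndexes(records: readonly EndpointConfig[]): RegistryIndexes {
  // Mutable maps are visible only inside the builder.
  const endpointById = new Map<string, EndpointConfig>();
  const modelToProviders = new Map<string, Set<string>>();

  for (const record of records) {
    const key = `${record.model}:${record.provider}:${record.region}`;
    // Store a frozen copy so the record cannot be changed at runtime.
    endpointById.set(key, Object.freeze({ ...record }));

    let providers = modelToProviders.get(record.model);
    if (providers === undefined) {
      providers = new Set<string>();
      modelToProviders.set(record.model, providers);
    }
    providers.add(record.provider);
  }

  // The ReadonlyMap and ReadonlySet types hide set/add/delete at compile time.
  // Object.freeze protects the wrapper object itself at runtime; it does not
  // block Map mutation through an untyped reference.
  return Object.freeze({ endpointById, modelToProviders });
}

const sharedIndexes = buildReadOnlyIndexes([
  { model: "model-x", provider: "provider-a", region: "us-east" },
  { model: "model-x", provider: "provider-b", region: "eu-west" },
]);
```

With these types, a handler that tries to call `sharedIndexes.endpointById.set(...)` fails type checking, which is usually enough to enforce the read-only contract within a TypeScript codebase.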

A key design choice is that the endpoint lists within each index are sorted by cost (ascending sum of input and output token rates). The first endpoint in any list is therefore always the cheapest option, so cost-optimized routing needs no sorting at request time.
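The ordering described above can be sketched as a comparator over the summed rates. The `PricedEndpoint` shape and its field names are assumptions for illustration; the real configuration records may carry pricing in a different form.

```typescript
// Hypothetical pricing fields; the actual record shape may differ.
interface PricedEndpoint {
  id: string;
  inputRate: number; // cost per input token
  outputRate: number; // cost per output token
}

// Combined rate used as the sort key: input rate plus output rate.
function totalRate(e: PricedEndpoint): number {
  return e.inputRate + e.outputRate;
}

function sortByCost(endpoints: readonly PricedEndpoint[]): PricedEndpoint[] {
  // Copy first so the caller's array is left untouched.
  return endpoints.slice().sort((a, b) => totalRate(a) - totalRate(b));
}

const sortedByCost = sortByCost([
  { id: "b", inputRate: 3, outputRate: 15 },
  { id: "a", inputRate: 1, outputRate: 2 },
  { id: "c", inputRate: 2, outputRate: 8 },
]);
// Resulting id order: a (3), c (10), b (18)
```

Summing the two rates weights input and output tokens equally. A workload with a strongly skewed input-to-output ratio could rank endpoints differently, so this key is best read as a general-purpose approximation of cost.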

Usage

Use central registry indexing when:

  • The gateway starts up and needs to prepare lookup tables for request routing
  • You need to find all providers offering a particular model
  • You need to resolve a specific model:provider:region endpoint configuration
  • You need to identify all PTB-enabled (Pass-Through Billing) endpoints for a model
  • You need to build a cost-sorted list of available endpoints for routing decisions

Theoretical Basis

This principle applies Materialized View optimization from database systems. Rather than computing query results on demand (the equivalent of a table scan), the system pre-computes and stores the results of common query patterns (the equivalent of materialized views). The trade-off is increased memory usage and startup time in exchange for average-case O(1) runtime lookups.
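The difference between the two approaches can be shown with a small sketch. The `ModelProviderRecord` shape, the sample data, and both function names are hypothetical; the point is only that the scan does work proportional to the collection size on every call, while the pre-built map does that work once.

```typescript
// Hypothetical flat record; names are illustrative assumptions.
interface ModelProviderRecord {
  model: string;
  provider: string;
}

const flatRecords: ModelProviderRecord[] = [
  { model: "model-x", provider: "provider-a" },
  { model: "model-y", provider: "provider-a" },
  { model: "model-x", provider: "provider-b" },
];

// On-demand query: walks every record on each call (the "table scan").
function providersByScan(model: string): string[] {
  return flatRecords.filter((r) => r.model === model).map((r) => r.provider);
}

// Materialized result: computed once up front, then reused for every query.
const providersByModel = new Map<string, string[]>();
for (const r of flatRecords) {
  const list = providersByModel.get(r.model) ?? [];
  list.push(r.provider);
  providersByModel.set(r.model, list);
}

// Lookup is a single hash-map access, independent of collection size.
function providersByIndex(model: string): readonly string[] {
  return providersByModel.get(model) ?? [];
}
```

Both functions return the same providers for a given model; the index version pays for that with the extra memory held by `providersByModel`.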

The build process implements a Single-Pass Multi-Index Construction algorithm:

function buildIndexes(configs):
    indexes = initializeEmptyMaps()

    for each (compositeKey, config) in configs:
        (modelName, provider) = split(compositeKey, ":")

        indexes.configById[compositeKey] = config
        indexes.providerToModels[provider].add(modelName)
        indexes.modelToProviders[modelName].add(provider)
        indexes.modelToConfigs[modelName].append(config)

        for each (deploymentId, deploymentConfig) in config.endpoints:
            endpoint = mergeConfigs(config, deploymentConfig, deploymentId)
            endpointKey = compositeKey + ":" + deploymentId
            indexes.endpointById[endpointKey] = endpoint
            indexes.modelToEndpoints[modelName].append(endpoint)

            if endpoint.ptbEnabled:
                indexes.modelToPtbEndpoints[modelName].append(endpoint)

    sortAllEndpointListsByCost(indexes)
    return indexes
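
The pseudocode above could be rendered in TypeScript roughly as follows. This is a sketch under stated assumptions rather than the actual implementation: every type and field name (`DeploymentConfig`, `ModelProviderConfig`, `Endpoint`, `inputRate`, `outputRate`), the `getOrCreate` helper, the merge rule for `ptbEnabled`, and the sample data are hypothetical.

```typescript
// All type and field names are illustrative assumptions, not the real schema.
interface DeploymentConfig {
  inputRate: number; // cost per input token
  outputRate: number; // cost per output token
  ptbEnabled?: boolean; // optional per-deployment override (assumed)
}

interface ModelProviderConfig {
  ptbEnabled: boolean;
  endpoints: Record<string, DeploymentConfig>; // keyed by deployment ID
}

interface Endpoint {
  key: string; // "model:provider:deploymentId"
  modelName: string;
  provider: string;
  inputRate: number;
  outputRate: number;
  ptbEnabled: boolean;
}

interface Indexes {
  configById: Map<string, ModelProviderConfig>;
  providerToModels: Map<string, Set<string>>;
  modelToProviders: Map<string, Set<string>>;
  modelToConfigs: Map<string, ModelProviderConfig[]>;
  endpointById: Map<string, Endpoint>;
  modelToEndpoints: Map<string, Endpoint[]>;
  modelToPtbEndpoints: Map<string, Endpoint[]>;
}

// Return the value stored under `key`, creating it on first use.
function getOrCreate<V>(map: Map<string, V>, key: string, create: () => V): V {
  let value = map.get(key);
  if (value === undefined) {
    value = create();
    map.set(key, value);
  }
  return value;
}

// Ascending by the sum of input and output rates.
const byCost = (a: Endpoint, b: Endpoint): number =>
  a.inputRate + a.outputRate - (b.inputRate + b.outputRate);

function buildIndexes(configs: Record<string, ModelProviderConfig>): Indexes {
  const indexes: Indexes = {
    configById: new Map(),
    providerToModels: new Map(),
    modelToProviders: new Map(),
    modelToConfigs: new Map(),
    endpointById: new Map(),
    modelToEndpoints: new Map(),
    modelToPtbEndpoints: new Map(),
  };

  for (const [compositeKey, config] of Object.entries(configs)) {
    // Assumes the model name contains no ":"; split at the first separator.
    const sep = compositeKey.indexOf(":");
    const modelName = compositeKey.slice(0, sep);
    const provider = compositeKey.slice(sep + 1);

    indexes.configById.set(compositeKey, config);
    getOrCreate(indexes.providerToModels, provider, () => new Set<string>()).add(modelName);
    getOrCreate(indexes.modelToProviders, modelName, () => new Set<string>()).add(provider);
    getOrCreate(indexes.modelToConfigs, modelName, () => [] as ModelProviderConfig[]).push(config);

    for (const [deploymentId, deployment] of Object.entries(config.endpoints)) {
      // Merge parent and deployment settings; the deployment value wins if set.
      const endpoint: Endpoint = {
        key: `${compositeKey}:${deploymentId}`,
        modelName,
        provider,
        inputRate: deployment.inputRate,
        outputRate: deployment.outputRate,
        ptbEnabled: deployment.ptbEnabled ?? config.ptbEnabled,
      };
      indexes.endpointById.set(endpoint.key, endpoint);
      getOrCreate(indexes.modelToEndpoints, modelName, () => [] as Endpoint[]).push(endpoint);
      if (endpoint.ptbEnabled) {
        getOrCreate(indexes.modelToPtbEndpoints, modelName, () => [] as Endpoint[]).push(endpoint);
      }
    }
  }

  // Sort once at build time so index 0 of every list is the cheapest endpoint.
  indexes.modelToEndpoints.forEach((list) => list.sort(byCost));
  indexes.modelToPtbEndpoints.forEach((list) => list.sort(byCost));
  return indexes;
}

const registry = buildIndexes({
  "model-x:provider-a": {
    ptbEnabled: true,
    endpoints: {
      "us-east": { inputRate: 3, outputRate: 15 },
      "eu-west": { inputRate: 2, outputRate: 10, ptbEnabled: false },
    },
  },
  "model-x:provider-b": {
    ptbEnabled: false,
    endpoints: { global: { inputRate: 1, outputRate: 5 } },
  },
});
```

With this sample data, `registry.modelToEndpoints.get("model-x")` lists the provider-b endpoint first because its summed rate is lowest, and `registry.modelToPtbEndpoints.get("model-x")` contains only the `us-east` deployment of provider-a.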

The cost-based sorting applies a Greedy Optimization heuristic: because every endpoint list is kept sorted by cost, a consumer can walk the list from the front and take the first available endpoint, knowing it is the cheapest available option, without any additional computation.
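
A consumer of a cost-sorted list might apply the greedy rule as in the sketch below. The `RoutableEndpoint` shape, the `pickCheapestAvailable` function, and the `isAvailable` health check are hypothetical names introduced for illustration.

```typescript
// Hypothetical endpoint shape; field names are illustrative assumptions.
interface RoutableEndpoint {
  id: string;
  inputRate: number;
  outputRate: number;
}

// `isAvailable` stands in for whatever health or rate-limit check the caller uses.
function pickCheapestAvailable(
  sortedEndpoints: readonly RoutableEndpoint[],
  isAvailable: (endpoint: RoutableEndpoint) => boolean
): RoutableEndpoint | undefined {
  // The list is already in ascending cost order, so the first endpoint that
  // passes the check is the cheapest available one.
  return sortedEndpoints.find((endpoint) => isAvailable(endpoint));
}

// Already sorted by summed rate: a (3), c (10), b (18).
const costSorted: RoutableEndpoint[] = [
  { id: "a", inputRate: 1, outputRate: 2 },
  { id: "c", inputRate: 2, outputRate: 8 },
  { id: "b", inputRate: 3, outputRate: 15 },
];

const unavailable = new Set<string>(["a"]);
const chosen = pickCheapestAvailable(costSorted, (e) => !unavailable.has(e.id));
// chosen is endpoint "c": "a" is cheaper but marked unavailable.
```

The selection is linear in the number of unavailable endpoints skipped, and it returns `undefined` when no endpoint passes the check, which the caller would need to handle.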

Related Pages

Implemented By
