Principle:Tensorflow Serving Request Logging

Knowledge Sources	Tensorflow_Serving
Domains	Model Serving, Request Logging
Last Updated	2026-02-13 00:00 GMT

Overview

The Request Logging principle defines a layered, pluggable architecture for sampling, formatting, and persisting request/response logs for both unary and streaming serving APIs.

Description

TensorFlow Serving's request logging system is organized in four layers:

LogCollector (Storage Layer): An abstract interface for log persistence backends. Implementations are registered via a factory pattern keyed by string type identifiers, and the REGISTER_LOG_COLLECTOR macro performs that registration statically at program startup. A LogCollector accepts protobuf messages through CollectMessage() and exposes Flush() so that buffered messages can be forced out to storage, which limits log loss if the process crashes.
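The registration pattern described above can be sketched in a few lines of standalone C++. This is a minimal illustration, not the TensorFlow Serving code: the names Registry, RegisterCollector, CreateCollector, SKETCH_REGISTER_COLLECTOR, and InMemoryCollector are hypothetical, messages are plain strings rather than protobufs, and methods return bool instead of a status type.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a protobuf message.
using LogMessage = std::string;

// Storage-layer interface: accept a message, flush on demand.
class LogCollector {
 public:
  virtual ~LogCollector() = default;
  virtual bool CollectMessage(const LogMessage& message) = 0;
  virtual bool Flush() = 0;
};

using CollectorFactory = std::function<std::unique_ptr<LogCollector>()>;

// Type-string registry. A function-local static sidesteps static
// initialization order problems across translation units.
inline std::map<std::string, CollectorFactory>& Registry() {
  static auto* registry = new std::map<std::string, CollectorFactory>();
  return *registry;
}

// Returns false if the type string is already registered.
inline bool RegisterCollector(const std::string& type,
                              CollectorFactory factory) {
  return Registry().emplace(type, std::move(factory)).second;
}

// Returns nullptr for an unknown type string.
inline std::unique_ptr<LogCollector> CreateCollector(const std::string& type) {
  auto it = Registry().find(type);
  if (it == Registry().end()) return nullptr;
  return it->second();
}

// Illustrative macro (not the real REGISTER_LOG_COLLECTOR): registers a
// factory during static initialization, before main() runs.
#define SKETCH_REGISTER_COLLECTOR(type, cls)                \
  static const bool registered_##cls = RegisterCollector(   \
      type, [] { return std::unique_ptr<LogCollector>(new cls); })

// Example backend that buffers messages in memory until flushed.
class InMemoryCollector : public LogCollector {
 public:
  bool CollectMessage(const LogMessage& message) override {
    pending_.push_back(message);
    return true;
  }
  bool Flush() override {
    flushed_count_ += pending_.size();
    pending_.clear();
    return true;
  }
  std::size_t flushed_count() const { return flushed_count_; }

 private:
  std::vector<LogMessage> pending_;
  std::size_t flushed_count_ = 0;
};

SKETCH_REGISTER_COLLECTOR("in_memory", InMemoryCollector);
```

A caller would then write CreateCollector("in_memory") and never name the concrete class, which is what lets a new backend be added without touching existing code.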

RequestLogger (Formatting + Sampling Layer): An abstract class that combines sampling decisions with log message formatting. It uses a UniformSampler to decide, per request, whether to log, based on the configured sampling rate. Subclasses implement CreateLogMessage() to produce protocol-specific log protobufs. The class supports shared_from_this(), so streaming callbacks can hold a weak reference to the logger rather than a raw pointer that might dangle.
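The sample-then-format ordering can be illustrated with a small standalone sketch. Everything here is an assumption made for illustration: UniformSamplerSketch, RequestLoggerSketch, and PredictLoggerSketch are hypothetical names, requests and responses are strings, and the collector is reduced to a vector of strings.

```cpp
#include <cassert>
#include <random>
#include <string>
#include <vector>

// Hypothetical sampler in the spirit of UniformSampler: returns true with
// probability `rate` (never for rate <= 0).
class UniformSamplerSketch {
 public:
  bool Sample(double rate) { return rate > 0.0 && dist_(rng_) < rate; }

 private:
  std::mt19937 rng_{std::random_device{}()};
  std::uniform_real_distribution<double> dist_{0.0, 1.0};
};

// Base class: owns the sampling decision, delegates formatting.
class RequestLoggerSketch {
 public:
  RequestLoggerSketch(double sampling_rate, std::vector<std::string>* sink)
      : sampling_rate_(sampling_rate), sink_(sink) {}
  virtual ~RequestLoggerSketch() = default;

  // Samples first; formatting runs only for sampled requests.
  bool Log(const std::string& request, const std::string& response) {
    if (!sampler_.Sample(sampling_rate_)) return false;
    sink_->push_back(CreateLogMessage(request, response));
    return true;
  }

 private:
  // Subclasses produce the API-specific log entry.
  virtual std::string CreateLogMessage(const std::string& request,
                                       const std::string& response) = 0;

  double sampling_rate_;
  std::vector<std::string>* sink_;  // Stand-in for a LogCollector.
  UniformSamplerSketch sampler_;
};

// Example subclass that counts how often formatting actually runs.
class PredictLoggerSketch : public RequestLoggerSketch {
 public:
  using RequestLoggerSketch::RequestLoggerSketch;
  int format_calls() const { return format_calls_; }

 private:
  std::string CreateLogMessage(const std::string& request,
                               const std::string& response) override {
    ++format_calls_;
    return "predict{" + request + " -> " + response + "}";
  }
  int format_calls_ = 0;
};
```

With a sampling rate of 0.0 the formatting method is never invoked, which is the point of sampling before building the log message.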

StreamLogger (Streaming Layer): A template class for logging multi-turn streaming interactions. It accumulates request/response messages and supports multiple log callback sinks. The callbacks use weak_ptr to the RequestLogger, enabling graceful degradation if the logger is reconfigured during an active stream.
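The weak-reference behavior can be sketched as follows. This is a simplified model under stated assumptions: the real StreamLogger is a template over request and response types, whereas this sketch uses strings, and the class and method names (RequestLoggerSketch, StreamLoggerSketch, MakeStreamCallback, and so on) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class RequestLoggerSketch
    : public std::enable_shared_from_this<RequestLoggerSketch> {
 public:
  using Callback = std::function<void(const std::vector<std::string>&)>;

  // Joins the accumulated turns into one log entry (illustrative format).
  std::string CreateLogMessage(const std::vector<std::string>& turns) const {
    std::string out;
    for (std::size_t i = 0; i < turns.size(); ++i) {
      if (i > 0) out += '|';
      out += turns[i];
    }
    return out;
  }

  // Builds a callback that holds only a weak reference to this logger.
  // The logger must already be owned by a shared_ptr.
  Callback MakeStreamCallback(std::vector<std::string>* sink) {
    std::weak_ptr<RequestLoggerSketch> weak = shared_from_this();
    return [weak, sink](const std::vector<std::string>& turns) {
      if (auto logger = weak.lock()) {
        sink->push_back(logger->CreateLogMessage(turns));
      }
      // Otherwise the logger was dropped mid-stream; skip logging.
    };
  }
};

// Accumulates the turns of one stream and fans out to every sink at the end.
class StreamLoggerSketch {
 public:
  using Callback = RequestLoggerSketch::Callback;

  void AddLogCallback(Callback cb) { callbacks_.push_back(std::move(cb)); }
  void LogStreamRequest(const std::string& request) {
    turns_.push_back("req:" + request);
  }
  void LogStreamResponse(const std::string& response) {
    turns_.push_back("resp:" + response);
  }
  // Called once when the stream ends.
  void LogMessage() {
    for (auto& cb : callbacks_) cb(turns_);
  }

 private:
  std::vector<std::string> turns_;
  std::vector<Callback> callbacks_;
};
```

If the last shared_ptr to the logger is released while a stream is still open, the callback's lock() fails and the stream simply ends without a log entry, rather than keeping the old logger alive or dereferencing a dangling pointer.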

ServerRequestLogger (Orchestration Layer): The server-wide coordinator that manages per-model logging configurations. It maps model names to RequestLogger instances, supports dynamic configuration updates, deduplicates loggers for models with identical configs, and uses FastReadDynamicPtr for lock-free reads on the hot path.
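A rough sketch of config deduplication plus an atomically published lookup map is shown below. It makes several assumptions: LoggingConfigSketch, RequestLoggerSketch, and ServerRequestLoggerSketch are hypothetical types, and std::atomic_load / std::atomic_store on a shared_ptr stand in for FastReadDynamicPtr. That substitution gives snapshot semantics but is typically not lock-free in standard library implementations, so it models the update flow, not the performance.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical simplified config; the real one is a protobuf.
struct LoggingConfigSketch {
  std::string collector_type;
  double sampling_rate;
  bool operator<(const LoggingConfigSketch& o) const {
    return std::tie(collector_type, sampling_rate) <
           std::tie(o.collector_type, o.sampling_rate);
  }
};

struct RequestLoggerSketch {
  LoggingConfigSketch config;
};

class ServerRequestLoggerSketch {
 public:
  using LoggerPtr = std::shared_ptr<RequestLoggerSketch>;
  using LoggerList = std::vector<LoggerPtr>;
  using ModelMap = std::map<std::string, LoggerList>;
  using ConfigMap = std::map<std::string, std::vector<LoggingConfigSketch>>;

  // Rebuilds the model -> loggers map, then publishes it in one swap.
  void Update(const ConfigMap& configs) {
    std::lock_guard<std::mutex> lock(update_mu_);
    std::map<LoggingConfigSketch, LoggerPtr> new_by_config;
    auto new_map = std::make_shared<ModelMap>();
    for (const auto& entry : configs) {
      for (const auto& config : entry.second) {
        LoggerPtr& logger = new_by_config[config];
        if (!logger) {
          // Reuse a logger from the previous generation when configs match.
          auto it = by_config_.find(config);
          if (it != by_config_.end()) {
            logger = it->second;
          } else {
            logger = std::make_shared<RequestLoggerSketch>(
                RequestLoggerSketch{config});
          }
        }
        (*new_map)[entry.first].push_back(logger);
      }
    }
    by_config_ = std::move(new_by_config);
    std::atomic_store(&model_to_loggers_,
                      std::shared_ptr<const ModelMap>(std::move(new_map)));
  }

  // Hot path: reads a snapshot without taking update_mu_.
  LoggerList Lookup(const std::string& model) const {
    auto snapshot = std::atomic_load(&model_to_loggers_);
    if (!snapshot) return {};
    auto it = snapshot->find(model);
    if (it == snapshot->end()) return {};
    return it->second;
  }

 private:
  std::mutex update_mu_;  // Serializes configuration updates only.
  std::map<LoggingConfigSketch, LoggerPtr> by_config_;
  std::shared_ptr<const ModelMap> model_to_loggers_;
};
```

Two models configured with the same LoggingConfigSketch receive the same logger pointer, and a reader that took a snapshot before an update keeps using the old map until it finishes.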

Usage

Apply this principle when implementing or extending the request logging pipeline. Implement LogCollector for new storage backends. Subclass RequestLogger for new API-specific log formats. Use ServerRequestLogger at the server level to manage per-model configuration.

Theoretical Basis

The logging architecture follows a layered pipeline pattern with pluggable components:

Request arrives
  |
  v
ServerRequestLogger
  |-- Lookup model -> [RequestLogger_1, RequestLogger_2, ...]
  |
  v (for each logger)
RequestLogger
  |-- Sample(rate) -> skip if not sampled
  |-- CreateLogMessage(request, response, metadata) -> log_proto
  |-- LogCollector.CollectMessage(log_proto)
  |
  v (for streaming)
StreamLogger
  |-- Accumulate requests/responses
  |-- On stream end: CreateLogMessage -> callback(log_proto)
  |   (callback holds weak_ptr<RequestLogger>)

Configuration update flow:
  New config map -> ServerRequestLogger.Update()
    |-- For each (model, config):
    |     FindOrCreateLogger(config)  // dedup by config
    |-- Atomically swap model_to_loggers_map (FastReadDynamicPtr)

Key design properties:

Sampling at the boundary: Sampling decisions are made early (before log formatting), minimizing overhead for unsampled requests.
Factory registration: LogCollectors use a type-string registry, enabling new backends without modifying existing code.
Config deduplication: Models sharing the same logging config share a single RequestLogger instance, reducing memory and connection overhead.
Weak reference for streaming: StreamLogger callbacks hold weak_ptr<RequestLogger> instead of shared_ptr, allowing logger reconfiguration without waiting for active streams to complete.
Lock-free hot path: FastReadDynamicPtr enables the per-request logger lookup to be lock-free, with locking only during configuration updates.
