Principle:Tensorflow Serving Request Logging
| Knowledge Sources | |
|---|---|
| Domains | Model Serving, Request Logging |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
The Request Logging principle defines a layered, pluggable architecture for sampling, formatting, and persisting request/response logs for both unary and streaming serving APIs.
Description
TensorFlow Serving's request logging system is organized in four layers:
LogCollector (Storage Layer): An abstract interface for log persistence backends. Implementations are registered via a factory pattern with string type identifiers. The REGISTER_LOG_COLLECTOR macro enables static registration at program startup. LogCollectors accept protobuf messages and provide flush semantics for crash resilience.
RequestLogger (Formatting + Sampling Layer): An abstract class that combines sampling decisions with log message formatting. It uses a UniformSampler to decide whether to log each request based on the configured sampling rate. Subclasses implement CreateLogMessage() to produce protocol-specific log protobufs. The class uses shared_from_this() to enable safe weak references from streaming callbacks.
StreamLogger (Streaming Layer): A template class for logging multi-turn streaming interactions. It accumulates request/response messages and supports multiple log callback sinks. The callbacks use weak_ptr to the RequestLogger, enabling graceful degradation if the logger is reconfigured during an active stream.
ServerRequestLogger (Orchestration Layer): The server-wide coordinator that manages per-model logging configurations. It maps model names to RequestLogger instances, supports dynamic configuration updates, deduplicates loggers for models with identical configs, and uses FastReadDynamicPtr for lock-free reads on the hot path.
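The type-string factory registration described for the LogCollector layer can be sketched roughly as follows. This is a minimal illustration, not TensorFlow Serving's code: `LogMessage`, `Registry`, `RegisterCollectorFactory`, `CreateCollector`, `InMemoryCollector`, and the `SKETCH_REGISTER_LOG_COLLECTOR` macro are hypothetical names, and a plain struct stands in for the protobuf message.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the protobuf message a real collector accepts.
struct LogMessage {
  std::string payload;
};

// Abstract storage backend: collect messages, flush on demand.
class LogCollector {
 public:
  virtual ~LogCollector() = default;
  virtual bool CollectMessage(const LogMessage& message) = 0;
  virtual bool Flush() = 0;
};

using CollectorFactory = std::function<std::unique_ptr<LogCollector>()>;

// A function-local static sidesteps static initialization order issues.
inline std::map<std::string, CollectorFactory>& Registry() {
  static std::map<std::string, CollectorFactory> registry;
  return registry;
}

// Returns false if a factory is already registered under `type`.
inline bool RegisterCollectorFactory(const std::string& type,
                                     CollectorFactory factory) {
  return Registry().emplace(type, std::move(factory)).second;
}

// Returns nullptr for an unknown type string.
inline std::unique_ptr<LogCollector> CreateCollector(const std::string& type) {
  auto it = Registry().find(type);
  if (it == Registry().end()) return nullptr;
  return it->second();
}

// Illustrative macro: registers `Class` under `type` before main() runs.
#define SKETCH_REGISTER_LOG_COLLECTOR(type, Class)                 \
  static const bool registered_##Class = RegisterCollectorFactory( \
      type, [] { return std::unique_ptr<LogCollector>(new Class()); })

// Example backend that buffers messages in memory.
class InMemoryCollector : public LogCollector {
 public:
  bool CollectMessage(const LogMessage& message) override {
    buffer_.push_back(message.payload);
    return true;
  }
  bool Flush() override {
    buffer_.clear();
    return true;
  }

 private:
  std::vector<std::string> buffer_;
};

SKETCH_REGISTER_LOG_COLLECTOR("in_memory", InMemoryCollector);
```

With this shape, a new backend only needs its own class and one registration line; callers obtain it with `CreateCollector("in_memory")` and never name the concrete type.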
Usage
Apply this principle when implementing or extending the request logging pipeline. Implement LogCollector for new storage backends. Subclass RequestLogger for new API-specific log formats. Use ServerRequestLogger at the server level to manage per-model configuration.
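As a rough sketch of what subclassing for a new log format might look like, the example below pairs a sampling check with a virtual formatting hook. All names (`UniformSamplerSketch`, `RequestLoggerSketch`, `PredictLoggerSketch`, `Log`) are assumptions made for illustration, strings stand in for request, response, and log protobufs, and an in-memory vector stands in for the LogCollector.

```cpp
#include <cassert>
#include <random>
#include <string>
#include <vector>

// Hypothetical sampler: returns true with probability `rate`.
class UniformSamplerSketch {
 public:
  explicit UniformSamplerSketch(unsigned seed)
      : engine_(seed), dist_(0.0, 1.0) {}

  bool Sample(double rate) {
    if (rate <= 0.0) return false;  // never sample
    if (rate >= 1.0) return true;   // always sample
    return dist_(engine_) < rate;
  }

 private:
  std::mt19937 engine_;
  std::uniform_real_distribution<double> dist_;
};

// Base class: decides whether to log, then delegates formatting.
class RequestLoggerSketch {
 public:
  RequestLoggerSketch(double sampling_rate, unsigned seed)
      : sampling_rate_(sampling_rate), sampler_(seed) {}
  virtual ~RequestLoggerSketch() = default;

  // Returns true if the request was sampled and logged.
  bool Log(const std::string& request, const std::string& response) {
    // Sample first, so unsampled requests never pay the formatting cost.
    if (!sampler_.Sample(sampling_rate_)) return false;
    collected_.push_back(CreateLogMessage(request, response));
    return true;
  }

  const std::vector<std::string>& collected() const { return collected_; }

 protected:
  // Subclasses produce the API-specific log record.
  virtual std::string CreateLogMessage(const std::string& request,
                                       const std::string& response) = 0;

 private:
  double sampling_rate_;
  UniformSamplerSketch sampler_;
  std::vector<std::string> collected_;  // stand-in for a LogCollector
};

// Example subclass for a made-up "predict" log format.
class PredictLoggerSketch : public RequestLoggerSketch {
 public:
  using RequestLoggerSketch::RequestLoggerSketch;
  int format_calls = 0;  // counts how often formatting actually ran

 protected:
  std::string CreateLogMessage(const std::string& request,
                               const std::string& response) override {
    ++format_calls;
    return "predict{" + request + " -> " + response + "}";
  }
};
```

With a sampling rate of 0.0 the formatting hook is never invoked, which is the point of placing the sampling decision ahead of `CreateLogMessage`.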
Theoretical Basis
The logging architecture follows a layered pipeline pattern with pluggable components:
Request arrives
|
v
ServerRequestLogger
|-- Lookup model -> [RequestLogger_1, RequestLogger_2, ...]
|
v (for each logger)
RequestLogger
|-- Sample(rate) -> skip if not sampled
|-- CreateLogMessage(request, response, metadata) -> log_proto
|-- LogCollector.CollectMessage(log_proto)
|
v (for streaming)
StreamLogger
|-- Accumulate requests/responses
|-- On stream end: CreateLogMessage -> callback(log_proto)
| (callback holds weak_ptr<RequestLogger>)
Configuration update flow:
New config map -> ServerRequestLogger.Update()
|-- For each (model, config):
| FindOrCreateLogger(config) // dedup by config
|-- Atomically swap model_to_loggers_map (FastReadDynamicPtr)
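The update flow above might be approximated as in the following sketch. It makes several simplifying assumptions: each model maps to a single logger rather than a list, deduplication happens only within one `Update()` call, the config is a hypothetical two-field struct (`LoggingConfigSketch`), and `std::atomic_load`/`std::atomic_store` on a `std::shared_ptr` stand in for `FastReadDynamicPtr`. Those standard functions are not guaranteed to be lock-free and are deprecated as of C++20, so they only approximate the snapshot-swap idea.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical logging config; ordering makes it usable as a map key.
struct LoggingConfigSketch {
  std::string collector_type;
  double sampling_rate;
  bool operator<(const LoggingConfigSketch& other) const {
    return std::tie(collector_type, sampling_rate) <
           std::tie(other.collector_type, other.sampling_rate);
  }
};

// Hypothetical logger that just remembers its config.
struct LoggerSketch {
  explicit LoggerSketch(LoggingConfigSketch c) : config(std::move(c)) {}
  LoggingConfigSketch config;
};

class ServerRequestLoggerSketch {
 public:
  using ModelMap = std::map<std::string, std::shared_ptr<LoggerSketch>>;

  ServerRequestLoggerSketch()
      : model_to_logger_(std::make_shared<const ModelMap>()) {}

  // Builds a fresh map off to the side, then publishes it in one swap.
  void Update(const std::map<std::string, LoggingConfigSketch>& configs) {
    std::lock_guard<std::mutex> lock(update_mu_);  // serializes writers only
    std::map<LoggingConfigSketch, std::shared_ptr<LoggerSketch>> by_config;
    auto new_map = std::make_shared<ModelMap>();
    for (const auto& entry : configs) {
      // Dedup: models with an identical config share one logger.
      std::shared_ptr<LoggerSketch>& logger = by_config[entry.second];
      if (!logger) logger = std::make_shared<LoggerSketch>(entry.second);
      (*new_map)[entry.first] = logger;
    }
    std::atomic_store(&model_to_logger_,
                      std::shared_ptr<const ModelMap>(std::move(new_map)));
  }

  // Hot path: grab the current snapshot without taking update_mu_.
  std::shared_ptr<LoggerSketch> Find(const std::string& model) const {
    std::shared_ptr<const ModelMap> snapshot =
        std::atomic_load(&model_to_logger_);
    auto it = snapshot->find(model);
    if (it == snapshot->end()) return nullptr;
    return it->second;
  }

 private:
  std::mutex update_mu_;
  std::shared_ptr<const ModelMap> model_to_logger_;
};
```

Because readers hold a `shared_ptr` to the snapshot they looked up, a logger obtained before an update stays valid even after the map that contained it has been replaced.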
Key design properties:
- Sampling at the boundary: Sampling decisions are made early (before log formatting), minimizing overhead for unsampled requests.
- Factory registration: LogCollectors use a type-string registry, enabling new backends without modifying existing code.
- Config deduplication: Models sharing the same logging config share a single RequestLogger instance, reducing memory and connection overhead.
- Weak reference for streaming: StreamLogger callbacks hold weak_ptr<RequestLogger> instead of shared_ptr, allowing logger reconfiguration without waiting for active streams to complete.
- Lock-free hot path: FastReadDynamicPtr enables the per-request logger lookup to be lock-free, with locking only during configuration updates.
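The weak-reference property for streaming can be illustrated with a small sketch. The class and method names here (`SinkLogger`, `MakeStreamCallback`, `StreamLoggerSketch`, `AddLogCallback`, `LogStreamRequest`, `LogStreamResponse`, `LogMessage`) are assumptions chosen for the example rather than the library's confirmed signatures, and a summary string stands in for the log protobuf.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical logger that hands out callbacks referring to itself weakly.
class SinkLogger : public std::enable_shared_from_this<SinkLogger> {
 public:
  void Collect(const std::string& log) { logs_.push_back(log); }
  const std::vector<std::string>& logs() const { return logs_; }

  // Must be called on an object owned by a shared_ptr.
  std::function<void(const std::string&)> MakeStreamCallback() {
    std::weak_ptr<SinkLogger> weak = shared_from_this();
    return [weak](const std::string& log) {
      // If the logger was replaced and destroyed, drop the log quietly.
      if (std::shared_ptr<SinkLogger> strong = weak.lock()) {
        strong->Collect(log);
      }
    };
  }

 private:
  std::vector<std::string> logs_;
};

// Accumulates a stream's messages and fans the result out to callbacks.
template <typename Request, typename Response>
class StreamLoggerSketch {
 public:
  using Callback = std::function<void(const std::string&)>;

  void AddLogCallback(Callback callback) {
    callbacks_.push_back(std::move(callback));
  }
  void LogStreamRequest(const Request& request) {
    requests_.push_back(request);
  }
  void LogStreamResponse(const Response& response) {
    responses_.push_back(response);
  }

  // Called at stream end: build one record and pass it to every sink.
  void LogMessage() {
    const std::string summary = std::to_string(requests_.size()) +
                                " requests, " +
                                std::to_string(responses_.size()) +
                                " responses";
    for (const Callback& callback : callbacks_) callback(summary);
  }

 private:
  std::vector<Request> requests_;
  std::vector<Response> responses_;
  std::vector<Callback> callbacks_;
};
```

Since the callback captures only a `weak_ptr`, the stream does not keep the logger alive: once the last `shared_ptr` is released, a later `LogMessage()` call becomes a no-op for that sink instead of touching a destroyed object.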