
Implementation:Tensorflow Serving RunSavedModelWarmup

From Leeroopedia
Knowledge Sources
Domains Performance, Deployment
Last Updated 2026-02-13 17:00 GMT

Overview

Concrete tool that executes warmup inference requests during model loading to reduce first-request latency, provided by the saved_model_warmup module.

Description

RunSavedModelWarmup() reads the warmup TFRecord file from the SavedModel's assets.extra/ directory and executes each request against the loaded session:

  1. Reads up to WarmupConsts::kMaxNumRecords (1000) records from the TFRecord file
  2. Parses each record as a PredictionLog proto
  3. Dispatches each log to RunWarmupRequest(), which handles the Predict, Classify, Regress, and MultiInference log types
  4. Optionally runs with multiple threads (num_model_warmup_threads)
  5. Optionally warms up at all allowed batch sizes (enable_all_batch_sizes_warmup)

Returns OK status even if no warmup file is found (warmup is optional).
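The flow above can be sketched in pure Python. This is a simplified illustration, not the actual implementation (which is C++ in saved_model_warmup.cc and operates on PredictionLog protos); all names except MAX_NUM_RECORDS' value are stand-ins:

```python
# Simplified sketch of the RunSavedModelWarmup control flow.
# Names are illustrative; the real code dispatches parsed
# PredictionLog protos against a TensorFlow session.

MAX_NUM_RECORDS = 1000  # mirrors WarmupConsts::kMaxNumRecords

def run_warmup_request(log_type, session):
    """Dispatch one warmup record by its log type (step 3)."""
    supported = {"predict", "classify", "regress", "multi_inference"}
    if log_type not in supported:
        raise ValueError(f"unsupported log type: {log_type}")
    session.append(log_type)  # stand-in for executing the request

def run_saved_model_warmup(records, session):
    """Replay up to MAX_NUM_RECORDS warmup records; OK if none exist."""
    if records is None:  # no warmup file found: still OK, warmup is optional
        return "OK"
    for log_type in records[:MAX_NUM_RECORDS]:  # step 1: cap record count
        run_warmup_request(log_type, session)   # steps 2-3: parse + dispatch
    return "OK"
```

Note that a missing warmup file and an empty one both yield OK; only a malformed record or an unsupported log type produces an error.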

Usage

Called automatically during model loading when --enable_model_warmup=true (the default). Users need only provide the warmup file in their SavedModel export.
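A typical server launch looks like the following; the model name and path are placeholders, and --enable_model_warmup is shown explicitly only for clarity, since it already defaults to true:

```shell
# Warmup runs automatically at model load if assets.extra/ contains
# a tf_serving_warmup_requests file in the exported SavedModel.
tensorflow_model_server \
    --model_name=my_model \
    --model_base_path=/models/my_model \
    --enable_model_warmup=true
```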

Code Reference

Source Location

  • Repository: tensorflow/serving
  • File: tensorflow_serving/servables/tensorflow/saved_model_warmup.cc L82-91
  • Internal: tensorflow_serving/servables/tensorflow/saved_model_warmup_util.cc L71-255
  • Dispatcher: tensorflow_serving/servables/tensorflow/saved_model_warmup.cc L39-78
  • Header: tensorflow_serving/servables/tensorflow/saved_model_warmup.h L34-36

Signature

Status RunSavedModelWarmup(
    const ModelWarmupOptions& model_warmup_options,
    const RunOptions& run_options,
    const string& export_dir,
    SavedModelBundle* bundle
);

Import

#include "tensorflow_serving/servables/tensorflow/saved_model_warmup.h"

I/O Contract

Inputs

  • model_warmup_options (ModelWarmupOptions, required): warmup configuration (iterations, threads, batch sizes)
  • run_options (RunOptions, required): TensorFlow session run options
  • export_dir (string, required): path to the SavedModel directory
  • bundle (SavedModelBundle*, required): loaded model with session and metagraph

Outputs

  • Status (Status): OK on success; also OK if no warmup file is found
  • Side effect: warmed-up session with pre-allocated resources and compiled kernels
  • Metric: warmup latency recorded to /tensorflow/serving/model_warmup_latency
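The options argument bundles the knobs referenced on this page. A rough Python stand-in for the fields mentioned above (the real type is the ModelWarmupOptions protobuf message; the defaults shown here are assumptions, not documented values):

```python
from dataclasses import dataclass

# Illustrative stand-in for the ModelWarmupOptions proto message.
# Field names follow this page's prose; defaults are assumptions.
@dataclass
class ModelWarmupOptions:
    num_request_iterations: int = 1         # times each warmup record is replayed
    num_model_warmup_threads: int = 1       # >1 runs warmup requests in parallel
    enable_all_batch_sizes_warmup: bool = False  # warm up every allowed batch size

# e.g. parallel warmup with four threads
opts = ModelWarmupOptions(num_model_warmup_threads=4)
```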

Usage Examples

Creating Warmup File (Python)

import os
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_log_pb2

# The warmup file must live in the SavedModel's assets.extra/ directory
os.makedirs("assets.extra", exist_ok=True)
with tf.io.TFRecordWriter("assets.extra/tf_serving_warmup_requests") as writer:
    # Create sample predict request
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'my_model'
    request.model_spec.signature_name = 'serving_default'
    request.inputs['input'].CopyFrom(
        tf.make_tensor_proto([1.0, 2.0, 3.0], shape=[1, 3])
    )

    # Wrap in PredictionLog
    log = prediction_log_pb2.PredictionLog(
        predict_log=prediction_log_pb2.PredictLog(request=request)
    )
    writer.write(log.SerializeToString())
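The resulting file can be sanity-checked without loading TensorFlow by parsing the TFRecord framing directly: each record is an 8-byte little-endian length, a 4-byte masked CRC-32C of the length, the payload, and a 4-byte masked CRC-32C of the payload. The sketch below skips CRC validation for simplicity (TensorFlow's own reader does verify checksums):

```python
import struct

def iter_tfrecord_payloads(data: bytes):
    """Yield raw record payloads from TFRecord-framed bytes.

    Per-record framing: uint64 length (little-endian), 4-byte length CRC,
    payload, 4-byte payload CRC. CRCs are skipped here; TensorFlow's
    reader validates them.
    """
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from("<Q", data, pos)
        pos += 8 + 4                   # length field + its CRC
        yield data[pos:pos + length]
        pos += length + 4              # payload + its CRC

def count_warmup_records(path):
    """Count the records in a warmup file (should be <= 1000)."""
    with open(path, "rb") as f:
        return sum(1 for _ in iter_tfrecord_payloads(f.read()))
```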

Using ResNet Warmup Generator

# Generate warmup file for ResNet model
python tensorflow_serving/example/resnet_warmup.py \
    --output_dir=/tmp/resnet/1/assets.extra
