
Implementation:Tensorflow Serving Server BuildAndStart

From Leeroopedia
Knowledge Sources
Domains Deployment, Infrastructure
Last Updated 2026-02-13 17:00 GMT

Overview

Concrete tool for initializing and starting the TensorFlow Serving server, exposing the gRPC and HTTP endpoints provided by the tensorflow_model_server binary.

Description

The Server class in server.h/server.cc provides the high-level orchestration for TensorFlow Serving. BuildAndStart() creates a ServerCore with the specified model configuration, initializes gRPC services (PredictionService, ModelService, Profiler), optionally creates an HTTP server for REST API, and returns when all endpoints are ready for requests.

The entry point main() in main.cc parses CLI flags, populates the Server::Options struct, calls BuildAndStart(), and then WaitForTermination() to block until shutdown.

Usage

Use this to serve any SavedModel or set of models. Invoke as a binary with CLI flags or as a Docker container with environment variables.

Code Reference

Source Location

  • Repository: tensorflow/serving
  • File: tensorflow_serving/model_servers/main.cc (L82-349), tensorflow_serving/model_servers/server.cc (L182-462)
  • Header: tensorflow_serving/model_servers/server.h (L37-141)

Signature

namespace tensorflow::serving::main {

class Server {
 public:
  struct Options {
    // gRPC Server options
    int32 grpc_port = 8500;
    string grpc_channel_arguments;
    string grpc_socket_path;
    int32 grpc_max_threads = 4 * NumSchedulableCPUs();

    // HTTP Server options
    int32 http_port = 0;
    int32 http_num_threads = 4 * NumSchedulableCPUs();
    int32 http_timeout_in_ms = 30000;

    // Model Server options
    bool enable_batching = false;
    string batching_parameters_file;
    string model_name;
    string model_base_path;
    int32 num_load_threads = 0;
    int32 max_num_load_retries = 5;
    int32 file_system_poll_wait_seconds = 1;
    bool enable_model_warmup = true;
    string model_config_file;
    string saved_model_tags;
    // ... additional options
  };

  Status BuildAndStart(const Options& server_options);
  void WaitForTermination();
};

}  // namespace tensorflow::serving::main

Import

#include "tensorflow_serving/model_servers/server.h"

I/O Contract

Inputs

Name Type Required Description
server_options Server::Options Yes Configuration struct with all server parameters
--port int32 No CLI flag, gRPC port (default 8500)
--rest_api_port int32 No CLI flag, HTTP port (default 0 = disabled)
--model_name string No CLI flag, model name (default "default")
--model_base_path string Yes* CLI flag, path to the SavedModel base directory
--enable_batching bool No CLI flag (default false)
--model_config_file string Yes* CLI flag, alternative to model_base_path for multi-model serving

* Exactly one of --model_base_path or --model_config_file must be supplied.

Outputs

Name Type Description
gRPC server grpc::Server Listening on --port for PredictionService and ModelService RPCs
HTTP server HTTPServerInterface Listening on --rest_api_port for REST API (if enabled)
ServerCore ServerCore Loaded models ready for inference

Usage Examples

Docker Deployment

# Serve a single model with Docker
docker run -p 8501:8501 \
    --mount type=bind,source=/path/to/my_model,target=/models/my_model \
    -e MODEL_NAME=my_model \
    -t tensorflow/serving

# With gRPC and REST
docker run -p 8500:8500 -p 8501:8501 \
    --mount type=bind,source=/path/to/my_model,target=/models/my_model \
    -e MODEL_NAME=my_model \
    -t tensorflow/serving
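Once the container is serving, the REST endpoint accepts Predict requests at /v1/models/&lt;name&gt;:predict. A minimal Python sketch that builds such a request; the model name and instance values are placeholders, and the actual input shape must match the served SavedModel's signature:

```python
import json
from urllib import request

MODEL_NAME = "my_model"  # placeholder; must match the served model's name
REST_PORT = 8501

# TensorFlow Serving REST Predict URL: /v1/models/<model_name>:predict
url = f"http://localhost:{REST_PORT}/v1/models/{MODEL_NAME}:predict"

# "instances" holds one input per row; the shape must match the model signature.
payload = json.dumps({"instances": [[1.0, 2.0, 3.0]]}).encode("utf-8")

req = request.Request(url, data=payload,
                      headers={"Content-Type": "application/json"})
# resp = request.urlopen(req)        # uncomment when a server is running
# print(json.loads(resp.read()))     # response body: {"predictions": [...]}
print(url)
```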

Direct Binary Invocation

# Serve a single model
tensorflow_model_server \
    --port=8500 \
    --rest_api_port=8501 \
    --model_name=mnist \
    --model_base_path=/tmp/mnist_model

# Serve with batching enabled
tensorflow_model_server \
    --port=8500 \
    --model_name=mnist \
    --model_base_path=/tmp/mnist_model \
    --enable_batching=true \
    --batching_parameters_file=/tmp/batching_params.txt
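The batching parameters file is a text-format BatchingParameters proto. A minimal illustrative sketch; the values below are placeholders for tuning, not recommended defaults:

```protobuf
max_batch_size { value: 128 }
batch_timeout_micros { value: 0 }
max_enqueued_batches { value: 1000000 }
num_batch_threads { value: 8 }
```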

# Serve multiple models via config file
tensorflow_model_server \
    --port=8500 \
    --model_config_file=/tmp/models.config
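The file passed via --model_config_file is a text-format ModelServerConfig proto. A sketch serving two models; the names and base paths are illustrative:

```protobuf
model_config_list {
  config {
    name: "mnist"
    base_path: "/models/mnist"
    model_platform: "tensorflow"
  }
  config {
    name: "resnet"
    base_path: "/models/resnet"
    model_platform: "tensorflow"
  }
}
```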

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
