Implementation: TensorFlow Serving Batching CLI Configuration
| Knowledge Sources | |
|---|---|
| Domains | Performance, Configuration |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Concrete configuration pattern for enabling and parameterizing request batching via tensorflow_model_server CLI flags.
Description
Batching is activated through CLI flags on the tensorflow_model_server binary. The --enable_batching flag wraps all model sessions with BatchingSession. The --batching_parameters_file points to a text-format protobuf file containing BatchingParameters (max_batch_size, batch_timeout_micros, etc.). Alternatively, --enable_per_model_batching_parameters reads a batching_params.pbtxt file from each model's SavedModel assets.extra/ directory.
These flags populate the Server::Options struct which is passed to ServerCore.
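As a reference for the BatchingParameters message mentioned above, a fuller text proto might look like the following sketch. Values are illustrative, not recommendations; note that the scalar fields use wrapped-value syntax, while allowed_batch_sizes is a plain repeated field whose last entry should equal max_batch_size.

```
# Illustrative BatchingParameters text proto; tune per workload.
num_batch_threads { value: 8 }         # threads processing batches in parallel
max_batch_size { value: 128 }          # hard cap on batch size
batch_timeout_micros { value: 10000 }  # max wait to fill a batch
max_enqueued_batches { value: 1000000 }
allowed_batch_sizes: 32                # optional; batches are padded up to these sizes
allowed_batch_sizes: 64
allowed_batch_sizes: 128               # last entry should equal max_batch_size
```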
Usage
Set these flags when launching tensorflow_model_server or, when using the official Docker image, pass them as extra arguments after the image name (the image's entrypoint appends them to the server command). The batching parameters file is optional; sensible defaults are used when only --enable_batching is specified.
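The per-model variant described above can be sketched as follows. The directory layout and parameter values are illustrative; per the description, the server looks for batching_params.pbtxt under each SavedModel's assets.extra/ directory when --enable_per_model_batching_parameters is set.

```shell
# Sketch: per-model batching parameters (assumes a SavedModel
# exported under version directory 1/; paths are illustrative).
mkdir -p /models/my_model/1/assets.extra

# File read per model when --enable_per_model_batching_parameters is set.
cat > /models/my_model/1/assets.extra/batching_params.pbtxt << 'EOF'
max_batch_size { value: 64 }
batch_timeout_micros { value: 5000 }
EOF

tensorflow_model_server \
  --port=8500 \
  --model_name=my_model \
  --model_base_path=/models/my_model \
  --enable_batching=true \
  --enable_per_model_batching_parameters=true
```

This lets each model ship its own batching tuning with its SavedModel, instead of one server-wide parameters file.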
Code Reference
Source Location
- Repository: tensorflow/serving
- File: tensorflow_serving/model_servers/main.cc (L107-130 flag definitions)
- Header: tensorflow_serving/model_servers/server.h (L39-111 Options struct)
Signature
```cpp
// CLI flags (defined in main.cc)
--enable_batching                       // bool, default: false
--batching_parameters_file              // string, default: ""
--enable_per_model_batching_parameters  // bool, default: false
--enable_model_warmup                   // bool, default: true

// Corresponding Server::Options fields
struct Server::Options {
  bool enable_batching = false;                  // L59
  bool enable_per_model_batching_params = false; // L60
  string batching_parameters_file;               // L64
  bool enable_model_warmup = true;               // L91
};
```
Import
```shell
# No code import — these are CLI flags on the binary
tensorflow_model_server --enable_batching=true ...
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| --enable_batching | bool | No | Activates batching (default false) |
| --batching_parameters_file | string | No | Path to BatchingParameters text proto file |
| --enable_per_model_batching_parameters | bool | No | Read per-model batching params from SavedModel |
Outputs
| Name | Type | Description |
|---|---|---|
| Server::Options | struct | Populated configuration passed to ServerCore |
Usage Examples
Enable Batching with Default Parameters
```shell
tensorflow_model_server \
  --port=8500 \
  --model_name=my_model \
  --model_base_path=/models/my_model \
  --enable_batching=true
```
With Custom Batching Parameters File
```shell
# Write the batching parameters text proto to /tmp/batching_params.txt
cat > /tmp/batching_params.txt << 'EOF'
max_batch_size { value: 128 }
batch_timeout_micros { value: 10000 }
num_batch_threads { value: 8 }
max_enqueued_batches { value: 1000000 }
EOF

tensorflow_model_server \
  --port=8500 \
  --model_name=my_model \
  --model_base_path=/models/my_model \
  --enable_batching=true \
  --batching_parameters_file=/tmp/batching_params.txt
```
Docker with Batching
```shell
# MODEL_NAME tells the image's entrypoint which model under /models to serve;
# flags after the image name are appended to the tensorflow_model_server command.
docker run -p 8500:8500 -p 8501:8501 \
  --mount type=bind,source=/models/my_model,target=/models/my_model \
  -e MODEL_NAME=my_model \
  -t tensorflow/serving \
  --enable_batching=true
```
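Batching is transparent to clients: a request against the server above looks the same whether or not --enable_batching is set. A hypothetical REST call (port 8501 is the REST endpoint; the instances payload must match my_model's actual input signature):

```shell
# Illustrative request; adjust the payload to the model's signature.
curl -X POST \
  -d '{"instances": [[1.0, 2.0]]}' \
  http://localhost:8501/v1/models/my_model:predict
```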