
Implementation:Pytorch Serve Model Server Start

From Leeroopedia

Overview

Model Server Start encompasses the two primary mechanisms for starting TorchServe: the CLI entry point model_server.start() that launches the Java frontend as a subprocess, and the programmatic launcher.start() function designed for integration testing and embedded usage. Both manage PID files, configure Java classpath and JVM arguments, and coordinate the Python-to-Java process handoff.

Field Value
Implementation Name Model Server Start
Type API Doc
Workflow Model_Deployment
Domains Infrastructure, Model_Serving
Knowledge Sources TorchServe
Last Updated 2026-02-13 00:00 GMT

Description

TorchServe provides two distinct start mechanisms:

CLI Start (model_server.start())

This is the entry point registered as the torchserve console script. It:

  1. Parses CLI arguments via ArgParser.ts_parser().
  2. Checks for an existing running instance via the PID file at {tempdir}/.model_server.pid.
  3. Handles --version and --stop flags.
  4. Constructs the Java command with:
    • -Dmodel_server_home pointing to the TorchServe installation directory.
    • Log4j configuration from --log-config.
    • Classpath including ts/frontend/* JARs and plugin directories.
    • JVM arguments from config.properties (via the vmargs property).
  5. Launches the Java process via subprocess.Popen.
  6. Writes the PID to the PID file.
  7. Optionally blocks in foreground mode via process.wait().
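The sequence above can be sketched in simplified form. This is an illustration, not the actual ts/model_server.py code: the Java main class name (`org.pytorch.serve.ModelServer`), the system property spellings, and the helper names are assumptions based on the description.

```python
import os
import subprocess
import sys
import tempfile


def build_java_command(ts_home, log_config=None, vmargs=None):
    """Assemble the Java frontend command (step 4 above).

    The main class name here is an assumption for illustration.
    """
    cmd = ["java", f"-Dmodel_server_home={ts_home}"]
    if log_config:
        # Log4j configuration from --log-config.
        cmd.append(f"-Dlog4j.configurationFile=file://{log_config}")
    # JVM arguments from the vmargs property in config.properties.
    cmd.extend(vmargs or [])
    # Classpath covering the ts/frontend/* JARs.
    cmd.extend(["-cp", os.path.join(ts_home, "ts", "frontend", "*"),
                "org.pytorch.serve.ModelServer"])
    return cmd


def cli_start_sketch(ts_home, foreground=False):
    """Steps 2 and 5-7: PID-file check, launch, record PID, optionally wait."""
    pid_file = os.path.join(tempfile.gettempdir(), ".model_server.pid")
    # Refuse to start if a PID file points at an existing instance.
    if os.path.isfile(pid_file):
        print("TorchServe appears to be running already; use --stop first.")
        sys.exit(1)
    process = subprocess.Popen(build_java_command(ts_home))
    # Record the Java PID so a later --stop can find the process.
    with open(pid_file, "w") as f:
        f.write(str(process.pid))
    if foreground:
        process.wait()
    return process
```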

Programmatic Start (launcher.start())

This function is designed for programmatic control and integration tests:

  1. Stops any existing TorchServe instance by calling stop().
  2. Constructs the torchserve --start command with the provided parameters.
  3. Launches via subprocess.Popen with stdout/stderr capture.
  4. Blocks until "Model server started" appears in the output.
  5. Spawns a Tee thread that splits the log stream into two queues: one for printing and one returned to the caller.
  6. Returns the log Queue for the caller to consume.

Stop Function (launcher.stop())

A simple wrapper that runs torchserve --stop, optionally with --foreground for synchronous shutdown.
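A minimal sketch of this wrapper, assuming only the flags described above (not the actual ts/launcher.py source):

```python
import subprocess


def build_stop_command(wait: bool = True) -> list:
    """Build the `torchserve --stop` command line."""
    cmd = ["torchserve", "--stop"]
    if wait:
        # --foreground makes the stop call block until shutdown completes.
        cmd.append("--foreground")
    return cmd


def stop(wait: bool = True) -> None:
    """Sketch of launcher.stop(): shell out to the torchserve CLI."""
    subprocess.run(build_stop_command(wait), check=False)
```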

Usage

from ts.launcher import start, stop

Code Reference

Source Location

File Lines Description Repository
ts/model_server.py L23-223 CLI start entry point pytorch/serve
ts/launcher.py L69-104 Programmatic start pytorch/serve
ts/launcher.py L10-14 Stop function pytorch/serve

Signature

model_server.start()

def start() -> None:
    """
    CLI entry point for TorchServe.

    Parses command-line arguments, checks for existing running instances,
    constructs the Java frontend command, and launches the server process.

    Handles --version, --stop, --start, --foreground flags.

    CLI flags:
        --model-store (str): Required. Path to model store directory.
        --ts-config (str): Path to TorchServe config.properties file.
        --log-config (str): Path to log4j configuration file.
        --models (list): Models to load at startup.
        --workflow-store (str): Path to workflow store directory.
        --plugins-path (str): Path to plugin JARs directory.
        --no-config-snapshots: Disable configuration snapshots.
        --disable-token-auth: Disable token-based authentication.
        --enable-model-api: Enable model management API.
        --foreground: Run server in foreground (blocking).
        --stop: Stop a running TorchServe instance.
        --version: Print version and exit.

    Returns:
        None

    Raises:
        SystemExit: On invalid arguments or missing dependencies.
    """
    ...

launcher.start()

def start(
    model_store: str = None,
    snapshot_file: str = None,
    no_config_snapshots: bool = False,
    plugin_folder: str = None,
    disable_token: bool = False,
    models: str = None,
    enable_model_api: bool = False,
) -> Queue:
    """
    Start TorchServe programmatically.

    Stops any existing instance, then launches TorchServe and blocks
    until the server is ready ("Model server started" in output).

    Args:
        model_store (str): Path to the model store directory.
        snapshot_file (str): Path to config/snapshot file (--ts-config).
        no_config_snapshots (bool): Disable config snapshots.
        plugin_folder (str): Path to plugins directory.
        disable_token (bool): Disable token authentication.
        models (str): Model to load at startup (--models flag).
        enable_model_api (bool): Enable model management API.

    Returns:
        Queue: A queue that receives server log lines as strings.
               Receives None when the server process ends.
    """
    ...

launcher.stop()

def stop(wait: bool = True) -> None:
    """
    Stop a running TorchServe instance.

    Args:
        wait (bool): If True, run with --foreground to wait for
                     the server to fully stop. Default True.
    """
    ...

Import

from ts.launcher import start, stop

I/O Contract

launcher.start()

Parameter Type Required Default Description
model_store str Yes None Path to the directory containing .mar files
snapshot_file str No None Path to config.properties or snapshot file
no_config_snapshots bool No False Disable saving configuration snapshots
plugin_folder str No None Path to directory with plugin JARs
disable_token bool No False Disable token-based authentication
models str No None Model name or URL to preload
enable_model_api bool No False Enable the model management API
Return Type Description
Log queue Queue Queue of log lines (str); None signals end of stream

launcher.stop()

Parameter Type Required Default Description
wait bool No True Block until the server fully stops

PID File

Attribute Value
Location {tempfile.gettempdir()}/.model_server.pid
Content Single line containing the Java process PID as an integer
Created After successful subprocess.Popen during start
Removed After successful stop or when orphaned PID detected
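The PID-file conventions in the table can be exercised with a small sketch. These helpers are hypothetical, not part of TorchServe, and the liveness probe is POSIX-specific (signal 0).

```python
import os
import tempfile

# Location matches the table above.
PID_FILE = os.path.join(tempfile.gettempdir(), ".model_server.pid")


def read_server_pid():
    """Return the recorded PID as an int, or None if no PID file exists."""
    if not os.path.isfile(PID_FILE):
        return None
    with open(PID_FILE) as f:
        return int(f.read().strip())


def pid_is_alive(pid):
    """POSIX liveness probe: signal 0 checks the process without touching it.

    Returns False for an orphaned PID file whose process has exited.
    """
    try:
        os.kill(pid, 0)
    except OSError:
        return False
    return True
```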

Usage Examples

Example 1: CLI start and stop

# Start TorchServe with model preloading
torchserve --start \
  --model-store /home/model_store \
  --models squeezenet=squeezenet1_1.mar \
  --disable-token-auth

# Check version
torchserve --version
# Output: TorchServe Version is 0.11.1

# Stop TorchServe
torchserve --stop

Example 2: Programmatic start for testing

from ts.launcher import start, stop
import requests


# Start TorchServe
log_queue = start(
    model_store="/tmp/model_store",
    no_config_snapshots=True,
    disable_token=True,
    models="squeezenet=squeezenet1_1.mar",
)

# Server is now running - make inference requests
response = requests.post(
    "http://localhost:8080/predictions/squeezenet",
    files={"data": open("kitten.jpg", "rb")},
)
print(response.json())

# Stop when done
stop(wait=True)

Example 3: Start with custom configuration

from ts.launcher import start, stop

# config.properties contents:
# inference_address=http://0.0.0.0:8080
# management_address=http://0.0.0.0:8081
# metrics_address=http://0.0.0.0:8082
# number_of_netty_threads=32
# job_queue_size=1000

log_queue = start(
    model_store="/opt/models",
    snapshot_file="/opt/config/config.properties",
    plugin_folder="/opt/plugins",
    enable_model_api=True,
)

Example 4: Handling the log queue

from ts.launcher import start
import threading


log_queue = start(model_store="/tmp/model_store")


def consume_logs(queue):
    """Consume server logs in a background thread."""
    while True:
        line = queue.get()
        if line is None:
            break
        if "ERROR" in str(line):
            print(f"SERVER ERROR: {line.strip()}")


log_thread = threading.Thread(target=consume_logs, args=(log_queue,))
log_thread.start()
