Implementation:Triton inference server Server Tracer
| Knowledge Sources | |
|---|---|
| Domains | Observability, Tracing |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
A concrete tool for tracing inference requests, with support for both Triton-native file-based tracing and OpenTelemetry distributed tracing backends.
Description
The TraceManager class provides the tracing infrastructure for Triton. It supports two trace modes: Triton-native, which writes JSON trace events to files, and OpenTelemetry, which exports spans through an OTLP HTTP exporter with a batch span processor. The manager handles global and per-model trace settings that can be updated at runtime, samples traces using rate and count controls, and propagates OpenTelemetry context across request spans through an AbstractCarrier interface. Thread safety relies on reader-writer mutex patterns.
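The reader-writer locking around global and per-model settings can be sketched with std::shared_mutex. This is a minimal illustration under stated assumptions, not Triton's implementation: SketchSetting, SettingRegistry, Lookup, and Update are hypothetical names, and treating an empty model name as "the global setting" is a choice made only for this example.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical, simplified stand-in for TraceManager::TraceSetting.
struct SketchSetting {
  uint32_t rate = 1000;
  int32_t count = -1;
};

// Hypothetical registry: one global setting plus optional per-model overrides.
class SettingRegistry {
 public:
  explicit SettingRegistry(const SketchSetting& global)
      : global_(std::make_shared<const SketchSetting>(global)) {}

  // Readers take a shared lock, so concurrent lookups do not block each other.
  // A model without an override falls back to the global setting.
  std::shared_ptr<const SketchSetting> Lookup(const std::string& model) const {
    std::shared_lock<std::shared_mutex> lock(mu_);
    auto it = per_model_.find(model);
    return (it != per_model_.end()) ? it->second : global_;
  }

  // Writers take an exclusive lock and swap in a new immutable snapshot.
  // Assumption for this sketch: an empty model name updates the global setting.
  void Update(const std::string& model, const SketchSetting& setting) {
    auto snapshot = std::make_shared<const SketchSetting>(setting);
    std::unique_lock<std::shared_mutex> lock(mu_);
    if (model.empty()) {
      global_ = snapshot;
    } else {
      per_model_[model] = snapshot;
    }
  }

 private:
  mutable std::shared_mutex mu_;
  std::shared_ptr<const SketchSetting> global_;
  std::map<std::string, std::shared_ptr<const SketchSetting>> per_model_;
};
```

Handing out shared_ptr snapshots means a request that already looked up its setting keeps a consistent view even if an update arrives while the request is in flight.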
Usage
Tracing is activated through Triton's command-line trace options (--trace-config, --trace-file, etc.) or through the runtime trace API. It is used when operators need performance analysis, latency profiling, or integration with a distributed tracing system.
Code Reference
Source Location
- Repository: Triton Inference Server
- File: src/tracer.h
- Lines: 1-527
- File: src/tracer.cc
- Lines: 1-1260
Signature
class TraceManager {
 public:
  // Trace modes
  enum TraceMode { TRITON, OPENTELEMETRY };
  struct TraceSetting {
    TraceMode mode_;
    std::string trace_file_;
    TRITONSERVER_InferenceTraceLevel level_;
    uint32_t rate_;
    int32_t count_;
    int32_t log_frequency_;
  };
  TraceManager(const TraceSetting& global_setting);
  ~TraceManager();
  // Trace lifecycle
  TRITONSERVER_InferenceTrace* SampleTrace(const std::string& model_name);
  void UpdateTraceSetting(
      const std::string& model_name, const TraceSetting& setting);
  // OpenTelemetry context propagation
  class Trace {
   public:
    void StartSpan(const std::string& name);
    void EndSpan();
  };
  class AbstractCarrier {
   public:
    virtual std::string Get(const std::string& key) = 0;
    virtual void Set(const std::string& key, const std::string& value) = 0;
  };
  static void TraceActivity(
      TRITONSERVER_InferenceTrace* trace,
      TRITONSERVER_InferenceTraceActivity activity,
      uint64_t timestamp_ns, void* userp);
};
Import
#include "tracer.h"
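To illustrate how a Get/Set carrier like the AbstractCarrier in the signature can move trace context between requests, here is a small sketch. The interface is re-declared locally so the example compiles on its own. MapCarrier and ForwardTraceContext are hypothetical names, and the header keys are the standard W3C Trace Context names (traceparent, tracestate), which the real tracer may handle differently.

```cpp
#include <initializer_list>
#include <map>
#include <string>

// Local, hypothetical copy of the carrier interface from the signature.
class AbstractCarrier {
 public:
  virtual ~AbstractCarrier() = default;
  virtual std::string Get(const std::string& key) = 0;
  virtual void Set(const std::string& key, const std::string& value) = 0;
};

// Hypothetical carrier backed by an in-memory header map.
class MapCarrier : public AbstractCarrier {
 public:
  // Returns the stored value, or an empty string when the key is absent.
  std::string Get(const std::string& key) override {
    auto it = headers_.find(key);
    return (it != headers_.end()) ? it->second : std::string();
  }

  void Set(const std::string& key, const std::string& value) override {
    headers_[key] = value;
  }

 private:
  std::map<std::string, std::string> headers_;
};

// Copies only the W3C Trace Context headers from one carrier to another,
// so a downstream span can attach to the caller's trace.
void ForwardTraceContext(AbstractCarrier& incoming, AbstractCarrier& outgoing) {
  for (const char* key : {"traceparent", "tracestate"}) {
    const std::string value = incoming.Get(key);
    if (!value.empty()) {
      outgoing.Set(key, value);
    }
  }
}
```

A frontend would typically wrap its own header container (HTTP headers or gRPC metadata) in such a carrier rather than copying values into a map. The map is used here only to keep the sketch self-contained.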
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| global_setting | TraceSetting | Yes | Global trace configuration |
| model_name | string | No | Name of the model whose trace setting overrides the global one |
| rate | uint32_t | No | Sample every N-th request (1 = every request) |
| count | int32_t | No | Max number of traces to collect (-1 = unlimited) |
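The interaction between rate and count can be sketched as a small counter-based sampler. This is an illustration under stated assumptions, not the server's code: RateCountSampler and ShouldSample are hypothetical names, every rate-th request is sampled, count is a budget that each sampled trace decrements, and -1 means no budget limit. A rate of 0 cannot select "every N-th" request, so this sketch never samples in that case.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical sampler illustrating rate/count semantics.
class RateCountSampler {
 public:
  RateCountSampler(uint32_t rate, int32_t count)
      : rate_(rate), remaining_(count) {}

  // Returns true when the current request should be traced.
  bool ShouldSample() {
    std::lock_guard<std::mutex> lock(mu_);
    // Assumption: rate 0 or an exhausted budget means no sampling.
    if (rate_ == 0 || remaining_ == 0) {
      return false;
    }
    // Only every rate_-th request is a sampling candidate.
    if ((++seen_ % rate_) != 0) {
      return false;
    }
    // A negative budget (-1) is unlimited and is never decremented.
    if (remaining_ > 0) {
      --remaining_;
    }
    return true;
  }

 private:
  std::mutex mu_;
  const uint32_t rate_;
  int32_t remaining_;
  uint64_t seen_ = 0;
};
```

With rate 3 and count 2, for example, the third and sixth requests are sampled and every later request is skipped because the budget is used up.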
Outputs
| Name | Type | Description |
|---|---|---|
| trace_file | JSON file | Triton-native trace events (timestamps, activities) |
| OTLP spans | HTTP | OpenTelemetry spans exported via OTLP |
Usage Examples
Enable Triton-Native Tracing
tritonserver \
--model-repository=/models \
--trace-config triton,file=/tmp/trace.json \
--trace-config rate=100 \
--trace-config level=TIMESTAMPS
Enable OpenTelemetry Tracing
tritonserver \
--model-repository=/models \
--trace-config mode=opentelemetry \
--trace-config opentelemetry,url=http://localhost:4318/v1/traces \
--trace-config opentelemetry,resource=service.name=triton-server \
--trace-config level=TIMESTAMPS