Heuristic:SeldonIO Seldon core Tracing Latency Tip
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Observability |
| Last Updated | 2026-02-13 14:00 GMT |
Overview
Latency reduction tip: disable OpenTelemetry tracing or lower its sampling ratio in production to limit the overhead tracing adds to every inference request.
Description
Seldon Core 2 integrates OpenTelemetry tracing for distributed observability. However, the tracing middleware appears to be applied to the request path unconditionally, even when tracing is disabled via configuration. A TODO comment in the pipeline HTTP server code flags this as a source of needless latency. Separately, the profiling settings (block rate, mutex rate) affect performance when set above 0.
Usage
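Use this heuristic when optimizing inference latency in production deployments. If tracing is not actively needed, set the trace ratio to 0 so that no requests are sampled. The middleware itself still appears to sit on the request path, so this reduces tracing overhead rather than eliminating it. Do not enable block or mutex profiling in production.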
The Insight (Rule of Thumb)
- Action 1: Set tracing `ratio` to `0` in Helm values if tracing is not needed.
- Action 2: Never set `blockRate > 0` or `mutexRate > 0` in production.
- Action 3: If tracing is needed, use a low ratio (e.g., `0.01` for 1% sampling) instead of `1.0` (100%).
- Value: Default trace ratio is `1` (100% sampling). Set to `0` for no tracing.
- Trade-off: Disabling tracing loses distributed request visibility but reduces per-request latency.
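To make the effect of the `ratio` value concrete, here is a minimal sketch of a ratio-based sampling decision. It is an illustration only, not the OpenTelemetry SDK's code or Seldon's: `shouldSample` and `countSampled` are hypothetical names, and the trace IDs come from a seeded pseudo-random source rather than real requests.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/rand"
)

// shouldSample is a hypothetical, simplified ratio sampler. It derives a
// 63-bit value from the trace ID and samples the request when that value
// falls below ratio * 2^63, so roughly `ratio` of all IDs are sampled.
func shouldSample(traceID [16]byte, ratio float64) bool {
	if ratio <= 0 {
		return false // ratio 0: nothing is traced
	}
	if ratio >= 1 {
		return true // ratio 1: everything is traced
	}
	upperBound := uint64(ratio * (1 << 63))
	x := binary.BigEndian.Uint64(traceID[8:16]) >> 1
	return x < upperBound
}

// countSampled generates n pseudo-random trace IDs (assumption: stand-ins
// for real request IDs) and counts how many would be sampled.
func countSampled(n int, ratio float64, seed int64) int {
	r := rand.New(rand.NewSource(seed))
	sampled := 0
	for i := 0; i < n; i++ {
		var id [16]byte
		binary.BigEndian.PutUint64(id[0:8], r.Uint64())
		binary.BigEndian.PutUint64(id[8:16], r.Uint64())
		if shouldSample(id, ratio) {
			sampled++
		}
	}
	return sampled
}

func main() {
	for _, ratio := range []float64{0, 0.01, 1} {
		fmt.Printf("ratio=%.2f sampled=%d of 100000\n", ratio, countSampled(100000, ratio, 1))
	}
}
```

With a ratio of `0.01`, only about one request in a hundred pays the cost of recording and exporting spans, which is why a low ratio is preferable to `1.0` when some tracing is still required.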
Reasoning
Known issue from `scheduler/pkg/kafka/pipeline/httpserver.go:113`:
// TODO we seem to always enforce tracing middleware even if tracing is
// not enabled via configmap?? needless latency
If the comment is accurate, the middleware stays on the request path and adds some overhead to every request, even when tracing is disabled.
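One way the TODO could be addressed is to wrap the handler only when tracing is enabled. The following is a sketch using only the standard library, not Seldon's actual code: `tracingMiddleware`, `buildHandler`, `traced`, and the `X-Traced` header are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// tracingMiddleware stands in for a real tracing wrapper (hypothetical).
// It only marks the response so that its presence is observable.
func tracingMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("X-Traced", "true")
		next.ServeHTTP(w, r)
	})
}

// buildHandler adds the middleware only when tracing is enabled, so a
// disabled configuration leaves the request path untouched.
func buildHandler(h http.Handler, tracingEnabled bool) http.Handler {
	if !tracingEnabled {
		return h
	}
	return tracingMiddleware(h)
}

// infer is a placeholder inference handler (assumption for the demo).
var infer http.Handler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
})

// traced sends one test request and reports whether the middleware ran.
func traced(h http.Handler) bool {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	h.ServeHTTP(rec, req)
	return rec.Header().Get("X-Traced") == "true"
}

func main() {
	fmt.Println("tracing enabled, middleware ran:", traced(buildHandler(infer, true)))
	fmt.Println("tracing disabled, middleware ran:", traced(buildHandler(infer, false)))
}
```

Deciding once at handler construction time keeps the check off the hot path, at the cost of requiring a restart or handler rebuild when the tracing configuration changes.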
Profiling performance warnings from `scheduler/cmd/scheduler/main.go:456-459`:
if blockRate > 0 {
	log.Warn("Block rate > 0 - performance will be affected")
}
if mutexRate > 0 {
	log.Warn("Mutex rate > 0 - performance will be affected")
}
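For context, block and mutex profiling in Go are typically controlled through `runtime.SetBlockProfileRate` and `runtime.SetMutexProfileFraction`. The sketch below assumes, without confirmation from the source, that the `blockRate` and `mutexRate` settings map onto those calls. `applyProfiling` and `currentMutexFraction` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"runtime"
)

// applyProfiling applies the two rates to the Go runtime and returns the
// warnings that a non-zero rate should trigger. Negative inputs are
// treated as 0 (off). Assumption: this mirrors how the settings are used.
func applyProfiling(blockRate, mutexRate int) []string {
	if blockRate < 0 {
		blockRate = 0
	}
	if mutexRate < 0 {
		mutexRate = 0
	}
	// A rate of 0 turns the corresponding profile off.
	runtime.SetBlockProfileRate(blockRate)
	runtime.SetMutexProfileFraction(mutexRate)

	var warnings []string
	if blockRate > 0 {
		warnings = append(warnings, "Block rate > 0 - performance will be affected")
	}
	if mutexRate > 0 {
		warnings = append(warnings, "Mutex rate > 0 - performance will be affected")
	}
	return warnings
}

// currentMutexFraction reads the active mutex profile fraction without
// changing it (a negative argument means "read only").
func currentMutexFraction() int {
	return runtime.SetMutexProfileFraction(-1)
}

func main() {
	fmt.Println("production settings:", applyProfiling(0, 0))
	fmt.Println("debug settings:", applyProfiling(1, 1))
	applyProfiling(0, 0) // switch profiling back off
}
```

Keeping both rates at `0` leaves the runtime's contention bookkeeping switched off, which is the state recommended for production.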
Configuration from `docs-gb/getting-started/configuration.md`:
- Tracing ratio: 0-1 inclusive, fraction of requests to trace
- Default OpenTelemetry endpoint: `seldon-collector.seldon-mesh:4317`
- Protocol: gRPC
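The documented constraints can be checked before deployment. The following is a minimal sketch in which the `tracingConfig` struct and its field names are hypothetical and do not reflect Seldon's actual configuration schema. Only the default ratio, endpoint, and protocol values are taken from the notes above.

```go
package main

import (
	"errors"
	"fmt"
)

// tracingConfig is a hypothetical holder for the documented settings.
type tracingConfig struct {
	Ratio    float64 // fraction of requests to trace, 0-1 inclusive
	Endpoint string  // OpenTelemetry collector address
	Protocol string  // transport protocol
}

// defaultTracingConfig returns the defaults listed in this document.
func defaultTracingConfig() tracingConfig {
	return tracingConfig{
		Ratio:    1,
		Endpoint: "seldon-collector.seldon-mesh:4317",
		Protocol: "grpc",
	}
}

// validate rejects ratios outside [0, 1]. The negated form also rejects
// NaN, because every comparison with NaN is false.
func (c tracingConfig) validate() error {
	if !(c.Ratio >= 0 && c.Ratio <= 1) {
		return errors.New("tracing ratio must be between 0 and 1 inclusive")
	}
	return nil
}

func main() {
	cfg := defaultTracingConfig()
	cfg.Ratio = 0.01 // 1% sampling for production
	fmt.Println("ratio 0.01 valid:", cfg.validate() == nil)

	cfg.Ratio = 1.5
	fmt.Println("ratio 1.5 valid:", cfg.validate() == nil)
}
```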