
Implementation:Triton inference server Server Classification

From Leeroopedia
Knowledge Sources
Domains Inference, Post_Processing
Last Updated 2026-02-13 17:00 GMT

Overview

Concrete tool for extracting top-k classification results from inference output tensors, converting raw model outputs to human-readable class labels with probabilities.

Description

The TopkClassifications function provides server-side classification post-processing. It takes a raw inference output tensor, orders its elements by value (typically class probabilities or scores) from highest to lowest, and returns the top-k results as formatted strings. Each string contains the value, the element index, and, when a label is available, the label name. The implementation uses a templated AddClassResults helper that handles all numeric data types supported by Triton.
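The selection step described above can be sketched in standalone C++. This is a minimal illustration under stated assumptions, not Triton's actual AddClassResults: the name TopkSketch, the element_count parameter, and the labels vector are hypothetical, and the label lookup that Triton performs through the inference response is replaced by a plain vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a templated top-k helper. Formats the k largest
// elements of `data` as "value:index", or "value:index:label" when a label
// exists for that index. `labels` is an assumption of this sketch.
template <typename T>
std::vector<std::string> TopkSketch(
    const T* data, std::size_t element_count, std::uint32_t class_count,
    const std::vector<std::string>& labels = {})
{
  assert(data != nullptr || element_count == 0);

  // Never return more entries than the tensor holds.
  const std::size_t k = std::min<std::size_t>(element_count, class_count);

  // Order indices instead of values so the original positions are kept.
  std::vector<std::size_t> idx(element_count);
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  std::partial_sort(
      idx.begin(), idx.begin() + k, idx.end(),
      [data](std::size_t a, std::size_t b) { return data[a] > data[b]; });

  std::vector<std::string> results;
  results.reserve(k);
  for (std::size_t i = 0; i < k; ++i) {
    std::string entry = std::to_string(data[idx[i]]);
    entry += ":" + std::to_string(idx[i]);
    if (idx[i] < labels.size()) {
      entry += ":" + labels[idx[i]];  // label is optional
    }
    results.push_back(std::move(entry));
  }
  return results;
}
```

Because the helper is a template, the same code path serves integer and floating-point tensors. The sketch uses a partial sort since only the first k positions need to be ordered; the order of tied values is unspecified.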

Usage

Used by Triton's HTTP and gRPC server endpoints when a client requests classification output for a tensor by setting a classification count in the inference request. The endpoints invoke it automatically for such requests; label names are included in the results when the output has an associated label file.

Code Reference

Source Location

Signature

namespace triton { namespace server {

TRITONSERVER_Error* TopkClassifications(
    TRITONSERVER_InferenceResponse* response,
    uint32_t output_idx,
    const void* output_base,
    TRITONSERVER_DataType datatype,
    uint32_t class_count,
    std::vector<std::string>* class_results);

}} // namespace triton::server

Import

#include "classification.h"

I/O Contract

Inputs

Name | Type | Required | Description
response | TRITONSERVER_InferenceResponse* | Yes | The inference response containing output metadata
output_idx | uint32_t | Yes | Index of the output tensor
output_base | const void* | Yes | Pointer to raw output tensor data
datatype | TRITONSERVER_DataType | Yes | Data type of the output tensor
class_count | uint32_t | Yes | Number of top-k classes to return
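Since output_base is an untyped pointer, the datatype argument is what tells the function how to interpret the buffer. A minimal sketch of that kind of dispatch follows, assuming a hypothetical SketchType enum in place of TRITONSERVER_DataType and a simple arg-max in place of the full top-k logic; none of these names come from Triton.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical, reduced set of element types (not the Triton enum).
enum class SketchType { kUint8, kInt32, kFp32, kFp64 };

// Typed worker: reinterprets the raw buffer as T and finds the largest value.
template <typename T>
std::size_t ArgMaxTyped(const void* base, std::size_t count)
{
  const T* values = static_cast<const T*>(base);
  std::size_t best = 0;
  for (std::size_t i = 1; i < count; ++i) {
    if (values[i] > values[best]) {
      best = i;
    }
  }
  return best;
}

// Type-erased entry point: selects the template instantiation that matches
// the runtime type tag.
std::size_t ArgMax(const void* base, SketchType type, std::size_t count)
{
  assert(base != nullptr && count > 0);
  switch (type) {
    case SketchType::kUint8:
      return ArgMaxTyped<std::uint8_t>(base, count);
    case SketchType::kInt32:
      return ArgMaxTyped<std::int32_t>(base, count);
    case SketchType::kFp32:
      return ArgMaxTyped<float>(base, count);
    case SketchType::kFp64:
      return ArgMaxTyped<double>(base, count);
  }
  throw std::invalid_argument("unsupported data type");
}
```

Passing a type tag that does not match the buffer's real element type would silently misread the data, which is why the datatype argument must agree with the output tensor's metadata.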

Outputs

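Name | Type | Description
class_results | std::vector<std::string>* | Populated with one formatted string per returned class: "probability:index:label", where the label part is present only when a label is available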
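A consumer of these strings usually needs the fields back as typed values. The following is a minimal parsing sketch, assuming the "probability:index[:label]" layout described above; ParsedClass and ParseClassResult are hypothetical names and are not part of Triton.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <string>

// Hypothetical container for one decoded classification entry.
struct ParsedClass {
  double value = 0.0;
  std::size_t index = 0;
  std::string label;  // empty when the entry carries no label
};

// Splits "value:index" or "value:index:label" into its fields.
// Returns false if the entry does not match that layout.
bool ParseClassResult(const std::string& entry, ParsedClass* out)
{
  assert(out != nullptr);

  const std::size_t first = entry.find(':');
  if (first == std::string::npos || first == 0) {
    return false;
  }
  // Only the first two colons are separators, so a label may itself
  // contain ':' characters.
  const std::size_t second = entry.find(':', first + 1);
  const std::string index_str =
      (second == std::string::npos)
          ? entry.substr(first + 1)
          : entry.substr(first + 1, second - first - 1);
  if (index_str.empty()) {
    return false;
  }

  try {
    out->value = std::stod(entry.substr(0, first));
    out->index = static_cast<std::size_t>(std::stoull(index_str));
  }
  catch (const std::exception&) {
    return false;  // non-numeric value or index
  }

  out->label =
      (second == std::string::npos) ? std::string() : entry.substr(second + 1);
  return true;
}
```

The sketch accepts any numeric prefix that std::stod and std::stoull can read; a stricter parser would also reject trailing characters after the number.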

Usage Examples

Server-Side Classification

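#include <iostream>
#include <string>
#include <vector>

#include "classification.h"

// After inference completes: `response` is the completed inference response
// and `output_data` points to the raw data of its first output tensor.
std::vector<std::string> results;
TRITONSERVER_Error* err = triton::server::TopkClassifications(
    response,
    0,            // first output
    output_data,  // raw tensor pointer
    TRITONSERVER_TYPE_FP32,
    5,            // top-5
    &results);
if (err != nullptr) {
  // Report the failure and release the error object.
  std::cerr << TRITONSERVER_ErrorMessage(err) << std::endl;
  TRITONSERVER_ErrorDelete(err);
}

// results[0] might be: "0.95:281:tabby cat"
// results[1] might be: "0.03:282:tiger cat"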
