Implementation:Microsoft Onnxruntime Summary Ops

From Leeroopedia


| Field | Value |
|---|---|
| Implementation Name | Summary_Ops |
| Overview | TensorBoard summary operators for logging scalar, histogram, text, and merged training metrics during distributed training. |
| Type | API Doc |
| Language | C++ |
| Domains | Distributed_Training, Training_Infrastructure |
| Source Repository | microsoft/onnxruntime |
| Last Updated | 2026-02-10 |

Overview

TensorBoard summary operators for logging scalar, histogram, text, and merged training metrics during distributed training. These operators (SummaryScalarOp, SummaryHistogramOp, SummaryTextOp, SummaryMergeOp) are registered in the Microsoft domain and produce TensorBoard-compatible protobuf summary events.

API

// Abridged declarations: all operators implement the OpKernel::Compute interface.
// Compute is a public override; constructors and private members are omitted here.
class SummaryScalarOp final : public OpKernel {
 public:
    Status Compute(OpKernelContext* context) const override;
};

class SummaryHistogramOp final : public OpKernel {
 public:
    Status Compute(OpKernelContext* context) const override;
};

class SummaryTextOp final : public OpKernel {
 public:
    Status Compute(OpKernelContext* context) const override;
};

class SummaryMergeOp final : public OpKernel {
 public:
    Status Compute(OpKernelContext* context) const override;
};

Source Code Reference

Key Parameters

| Operator | Input | Type Constraint | Description |
|---|---|---|---|
| SummaryScalarOp | tag (Tensor) | string | Tag name(s) identifying the scalar summary |
| SummaryScalarOp | input (Tensor) | float, double, bool | Scalar value(s) to log |
| SummaryHistogramOp | tag (Tensor) | string | Tag name identifying the histogram |
| SummaryHistogramOp | input (Tensor) | float, double | Tensor values for histogram distribution |
| SummaryTextOp | tag (Tensor) | string | Tag name identifying the text entry |
| SummaryTextOp | input (Tensor) | string | Text content to log |
| SummaryMergeOp | inputs (Tensor[]) | string | Multiple serialized summary protobufs to merge |

I/O Contract

| Direction | Name | Type | Description |
|---|---|---|---|
| Input | tag | Tensor (string) | Identifying name/tag for the summary entry |
| Input | input | Tensor (varies) | Data to be serialized into the summary |
| Output | summary | Tensor (string) | Serialized TensorBoard summary protobuf bytes |
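
As a rough illustration of the contract above, the following sketch pairs each tag with the input value at the same index, which is how a scalar summary with several tags could be assembled. It is a simplified model, not the ONNX Runtime kernel: `ScalarValue`, `Summary`, and `MakeScalarSummary` are hypothetical names, and the final protobuf serialization step that produces the string output is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one value entry of a TensorBoard summary.
struct ScalarValue {
  std::string tag;
  float simple_value;
};

// Hypothetical stand-in for the summary message (before serialization).
struct Summary {
  std::vector<ScalarValue> values;
};

// Pairs tags[i] with input[i]; both sequences must have the same length.
Summary MakeScalarSummary(const std::vector<std::string>& tags,
                          const std::vector<float>& input) {
  assert(tags.size() == input.size());
  Summary summary;
  summary.values.reserve(tags.size());
  for (std::size_t i = 0; i < tags.size(); ++i) {
    summary.values.push_back({tags[i], input[i]});
  }
  return summary;
}
```

In the real operator the resulting message would then be serialized to bytes and written to the `summary` output tensor.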

Usage Examples

Configuring TensorBoard in TrainingRunner

// Summary ops are configured via TrainingRunner::Parameters
TrainingRunner::Parameters params;
params.log_dir = ORT_TSTR("/logs/tensorboard/");
params.summary_name = "training_summary";
params.scalar_names = {"total_loss", "learning_rate"};
params.histogram_names = {"layer1_weights", "layer2_gradients"};
params.norm_names = {"gradient_norm"};

TensorBoard Configuration in TrainingSession

// During Initialize(), TensorBoard is configured on the training session
if (params_.EnableTensorboard()) {
    TrainingSession::TrainingConfiguration::TensorboardConfiguration tb{};
    tb.summary_name = params_.summary_name;
    tb.scalar_node_names = params_.scalar_names;
    tb.histogram_node_names = params_.histogram_names;
    tb.norm_node_names = params_.norm_names;
    tb.dump_convergence_metrics = params_.dump_convergence_metrics;
    config.tensorboard_config = tb;
}

Viewing TensorBoard Output

# Launch TensorBoard to view logged training metrics
tensorboard --logdir /logs/tensorboard/

Operator Registration

The summary operators are registered in the Microsoft domain (kMSDomain) using the ONNX Runtime kernel registration macros:

ONNX_OPERATOR_KERNEL_EX(
    SummaryScalar, kMSDomain, 1, kCpuExecutionProvider,
    KernelDefBuilder()
        .TypeConstraint("T", {DataTypeImpl::GetTensorType<float>(),
                              DataTypeImpl::GetTensorType<double>(),
                              DataTypeImpl::GetTensorType<bool>()})
        .TypeConstraint("S", DataTypeImpl::GetTensorType<std::string>()),
    SummaryScalarOp);

ONNX_OPERATOR_KERNEL_EX(
    SummaryHistogram, kMSDomain, 1, kCpuExecutionProvider,
    KernelDefBuilder()
        .TypeConstraint("T", {DataTypeImpl::GetTensorType<float>(),
                              DataTypeImpl::GetTensorType<double>()})
        .TypeConstraint("S", DataTypeImpl::GetTensorType<std::string>()),
    SummaryHistogramOp);
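
The histogram registration above accepts only float and double inputs. To show what a histogram summary typically carries, here is a minimal sketch that accumulates the statistics found in TensorBoard's histogram message (min, max, num, sum, sum_squares, bucket_limit, bucket). `HistogramStats` and `BuildHistogram` are hypothetical names, and the bucket limits are supplied by the caller purely for illustration; the limits the ONNX Runtime kernel actually uses are not stated on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical plain-struct mirror of the histogram fields.
struct HistogramStats {
  double min = std::numeric_limits<double>::max();
  double max = std::numeric_limits<double>::lowest();
  double num = 0.0;
  double sum = 0.0;
  double sum_squares = 0.0;
  std::vector<double> bucket_limit;  // upper edge of each bucket, ascending
  std::vector<double> bucket;        // number of values in each bucket
};

// Accumulates values into buckets defined by ascending upper limits.
HistogramStats BuildHistogram(const std::vector<double>& values,
                              std::vector<double> limits) {
  assert(!limits.empty());
  assert(std::is_sorted(limits.begin(), limits.end()));
  HistogramStats h;
  h.bucket_limit = std::move(limits);
  h.bucket.assign(h.bucket_limit.size(), 0.0);
  for (double v : values) {
    // First bucket whose limit is strictly greater than v;
    // values beyond the last limit are clamped into the last bucket.
    auto it = std::upper_bound(h.bucket_limit.begin(), h.bucket_limit.end(), v);
    std::size_t idx = (it == h.bucket_limit.end())
                          ? h.bucket_limit.size() - 1
                          : static_cast<std::size_t>(it - h.bucket_limit.begin());
    h.bucket[idx] += 1.0;
    h.min = std::min(h.min, v);
    h.max = std::max(h.max, v);
    h.num += 1.0;
    h.sum += v;
    h.sum_squares += v * v;
  }
  return h;
}
```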

Key Details

  • All summary operators run on the kCpuExecutionProvider regardless of the training device, since serialization is a CPU operation.
  • SummaryScalarOp maintains a tags_ vector for multiple scalar entries in a single op invocation.
  • SummaryHistogramOp and SummaryTextOp each have a single tag_ string.
  • The operators use TensorBoard's native protobuf format (tensorboard/compat/proto/summary.pb.h) for serialization.
  • SummaryMergeOp combines the values of multiple serialized summary protobufs into a single merged summary, so that they can be written as one event.
  • TensorBoard logging only occurs on rank 0. This is checked by EnableTensorboard(), which verifies that MPIContext::GetInstance().GetWorldRank() == 0.
  • The implementation is adapted from TensorFlow's summary ops (licensed under Apache 2.0) with modifications for ONNX Runtime.
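
The merge and rank-0 points above can be sketched with plain structs in place of protobuf messages. This is an illustrative model only: `Value`, `Summary`, `MergeSummaries`, and `ShouldWriteTensorboard` are hypothetical names, the duplicate-tag rejection follows TensorFlow's merge behaviour and is assumed rather than confirmed for ONNX Runtime, and the log-directory condition in the gate is likewise an assumption.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins for the summary value and summary messages.
struct Value {
  std::string tag;
  float simple_value;
};

struct Summary {
  std::vector<Value> values;
};

// Appends every value of every input to *merged.
// Returns false on the first duplicate tag (assumed behaviour);
// in that case *merged holds only the values appended so far.
bool MergeSummaries(const std::vector<Summary>& inputs, Summary* merged) {
  assert(merged != nullptr);
  std::unordered_set<std::string> seen;
  for (const Summary& summary : inputs) {
    for (const Value& value : summary.values) {
      if (!seen.insert(value.tag).second) {
        return false;
      }
      merged->values.push_back(value);
    }
  }
  return true;
}

// Only the process with world rank 0 writes TensorBoard events.
// The log_dir_set condition is an assumption made for this sketch.
bool ShouldWriteTensorboard(bool log_dir_set, int world_rank) {
  return log_dir_set && world_rank == 0;
}
```

Gating on rank 0 keeps multiple workers from writing the same metrics into one log directory.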
