Principle:LaurentMazare Tch rs Tensor Pretty Printing
| Knowledge Sources | |
|---|---|
| Domains | Tensor Computing, Display Formatting |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Tensor pretty printing is the process of converting multi-dimensional numerical arrays into human-readable string representations that convey structure, data type, and values at a glance.
Description
When working with tensors (multi-dimensional arrays), raw data dumps are hard to read because a flat stream of numbers hides the dimensional structure. Tensor pretty printing solves this by applying a set of formatting rules that mirror how a human would write down a matrix or higher-order array on paper.
The formatting pipeline involves several stages:
- Element classification - Determine whether the tensor contains integers, floating-point values, or other types. This dictates whether decimal points and trailing zeros appear.
- Precision selection - For floating-point tensors, inspect all values to decide how many decimal places are needed to faithfully represent the data without excess noise. If values span many orders of magnitude, scientific notation may be used instead.
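- Summarization - Large tensors cannot be printed in full. When a dimension is longer than a configured limit, only the first and last few elements are shown, separated by an ellipsis marker (e.g., "...") to indicate omitted data. NumPy and PyTorch, for example, default to 3 edge elements per side once the tensor holds more than 1000 elements in total, so any dimension longer than 6 elements is abbreviated.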
- Structural annotation - After the data, metadata such as shape (the size of each dimension) and dtype (the element data type, e.g., Float32, Int64) is appended so the reader knows the tensor's properties without inspecting it programmatically.
- Indentation and alignment - Nested brackets and column-aligned numbers make the dimensional structure visually apparent. Each nesting level corresponds to one tensor dimension.
This convention was popularized by NumPy and adopted by PyTorch, becoming the de facto standard for tensor display in scientific computing environments.
Usage
Apply tensor pretty printing whenever tensors need to be displayed in:
- Debugging output - Inspecting intermediate computation results during model development.
- Logging - Recording tensor summaries without overwhelming log files.
- REPL / interactive sessions - Providing immediate feedback when evaluating tensor expressions.
- Error messages - Including tensor state in diagnostic output when operations fail.
The formatting should be automatic (triggered by standard display/print mechanisms) and configurable (allowing users to adjust precision, threshold, and line width).
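As a minimal sketch of what "automatic and configurable" can look like in Rust, the block below hooks a formatter into the standard `std::fmt::Display` trait and reads the precision from the format string. `DemoTensor` is a hypothetical stand-in type invented for this illustration, not the tch-rs `Tensor`, and the default of 4 decimals is an assumption.

```rust
use std::fmt;

/// Hypothetical stand-in for a tensor: a flat 1-D list of values.
struct DemoTensor {
    data: Vec<f64>,
}

impl fmt::Display for DemoTensor {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Honor a caller-supplied precision such as `{:.2}`;
        // fall back to 4 decimals (an assumed default) otherwise.
        let precision = f.precision().unwrap_or(4);
        write!(f, "[")?;
        for (i, v) in self.data.iter().enumerate() {
            if i > 0 {
                write!(f, ", ")?;
            }
            write!(f, "{:.*}", precision, v)?;
        }
        write!(f, "]")
    }
}
```

With this in place, `format!("{}", t)` uses the default precision and `format!("{:.2}", t)` overrides it, so callers configure the output through the ordinary formatting syntax rather than a separate API.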
Theoretical Basis
The formatting algorithm can be decomposed into the following logical steps:
Element Type Classification
Given a tensor with elements $t_i$:
- If all elements satisfy $t_i = \lfloor t_i \rfloor$ and $|t_i| < \infty$ (every value is a finite whole number), classify as integer-like and display without decimal points.
- Otherwise, classify as floating-point and proceed to precision selection.
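The classification step can be sketched as follows, assuming "integer-like" means that every element is a finite whole number. The function names `is_integer_like` and `format_elements` are hypothetical and chosen only for this illustration.

```rust
/// Returns true when every element is a finite whole number.
/// An empty slice is treated as integer-like (vacuously true).
fn is_integer_like(values: &[f64]) -> bool {
    values.iter().all(|v| v.is_finite() && v.fract() == 0.0)
}

/// Formats values without a decimal point when the slice is integer-like,
/// and with `precision` decimals otherwise.
fn format_elements(values: &[f64], precision: usize) -> Vec<String> {
    if is_integer_like(values) {
        values.iter().map(|v| format!("{:.0}", v)).collect()
    } else {
        values
            .iter()
            .map(|v| format!("{:.*}", precision, v))
            .collect()
    }
}
```

Note that the decision is made for the tensor as a whole: a single fractional value switches every element to the floating-point format, which keeps the printed columns uniform.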
Precision Selection
For floating-point tensors, compute the range of magnitudes:
$$\text{max\_abs} = \max_i |t_i|, \quad \text{min\_abs} = \min_{t_i \neq 0} |t_i|$$
- If $\text{max\_abs} / \text{min\_abs} > 1000$ or $\text{max\_abs} > 10^8$ or $\text{min\_abs} < 10^{-4}$, use scientific notation with a fixed number of significant digits (typically 4).
- Otherwise, use fixed-point notation. The number of decimal places is chosen as the minimum needed such that no two distinct values in the tensor produce the same printed string.
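The two rules above can be sketched in Rust as shown below. This is an illustration under stated assumptions, not the tch-rs implementation: the names `use_scientific` and `min_decimals` are hypothetical, non-finite values are ignored, an all-zero tensor falls back to fixed-point, and the search for decimals is capped by a caller-supplied `max_decimals`.

```rust
use std::collections::HashSet;

/// Decides between fixed-point and scientific notation using the
/// thresholds quoted above (ratio > 1000, max > 1e8, min < 1e-4).
fn use_scientific(values: &[f64]) -> bool {
    // Only finite, nonzero magnitudes take part in the range computation.
    let mags: Vec<f64> = values
        .iter()
        .map(|v| v.abs())
        .filter(|m| m.is_finite() && *m > 0.0)
        .collect();
    if mags.is_empty() {
        return false; // assumption: nothing to scale, keep fixed-point
    }
    let max_abs = mags.iter().cloned().fold(0.0, f64::max);
    let min_abs = mags.iter().cloned().fold(f64::INFINITY, f64::min);
    max_abs / min_abs > 1000.0 || max_abs > 1e8 || min_abs < 1e-4
}

/// Smallest number of decimals (at most `max_decimals`) at which no two
/// distinct finite values print as the same string.
fn min_decimals(values: &[f64], max_decimals: usize) -> usize {
    let mut distinct: Vec<f64> =
        values.iter().copied().filter(|v| v.is_finite()).collect();
    distinct.sort_by(|a, b| a.total_cmp(b));
    distinct.dedup();
    for p in 0..max_decimals {
        let printed: HashSet<String> =
            distinct.iter().map(|v| format!("{:.*}", p, v)).collect();
        if printed.len() == distinct.len() {
            return p;
        }
    }
    max_decimals
}
```

For example, the values 0.1 and 0.12 collide at zero and one decimal place and first become distinguishable at two, so `min_decimals` returns 2 for that pair.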
Summarization Algorithm
For a tensor dimension of size $d$ with a threshold $\tau$ (the number of elements kept at each end):
    IF d > 2 * tau THEN
        display elements [0 .. tau-1]
        display ellipsis "..."
        display elements [d-tau .. d-1]
    ELSE
        display all elements [0 .. d-1]
    END IF
This rule applies recursively at each dimension level, producing a nested summarization for high-dimensional tensors.
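A runnable sketch of the summarization rule and its recursive application is given below. It assumes a row-major flat buffer plus a shape slice, with a data length equal to the product of the shape. The names `Slot`, `summarize`, and `render` are hypothetical, and elements are printed with Rust's default `f64` formatting rather than a chosen precision.

```rust
/// One slot in a summarized dimension: a real index or the "..." marker.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Slot {
    Index(usize),
    Ellipsis,
}

/// Applies the rule: keep the first and last `tau` indices when d > 2 * tau.
fn summarize(d: usize, tau: usize) -> Vec<Slot> {
    if d > 2 * tau {
        let mut slots: Vec<Slot> = (0..tau).map(Slot::Index).collect();
        slots.push(Slot::Ellipsis);
        slots.extend(((d - tau)..d).map(Slot::Index));
        slots
    } else {
        (0..d).map(Slot::Index).collect()
    }
}

/// Renders a row-major buffer with the given shape, summarizing every
/// dimension with the same `tau`.
fn render(data: &[f64], shape: &[usize], tau: usize) -> String {
    if shape.is_empty() {
        return format!("{}", data[0]); // scalar leaf
    }
    // Number of elements covered by one step along the leading dimension.
    let stride: usize = shape[1..].iter().product();
    let parts: Vec<String> = summarize(shape[0], tau)
        .into_iter()
        .map(|slot| match slot {
            Slot::Index(i) => {
                render(&data[i * stride..(i + 1) * stride], &shape[1..], tau)
            }
            Slot::Ellipsis => "...".to_string(),
        })
        .collect();
    format!("[{}]", parts.join(", "))
}
```

Because `render` calls `summarize` at each nesting level, a 2 x 8 tensor with `tau = 2` keeps both rows (2 is not greater than 4) but abbreviates each row to its first and last two elements.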
Output Structure
The final printed form follows the pattern:
    [[ value, value, ..., value ],
     [ value, value, ..., value ],
     ...
     [ value, value, ..., value ]]
    shape: [dim0, dim1, ...], dtype: <type>
Column widths are determined by the widest formatted element in each column to ensure vertical alignment.
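Per-column alignment can be sketched as follows, assuming the elements have already been formatted into strings and that right-alignment is wanted. The function name `align_columns` is hypothetical.

```rust
/// Right-aligns pre-formatted cells so that each column is as wide as its
/// widest entry, then wraps every row in brackets.
fn align_columns(rows: &[Vec<String>]) -> Vec<String> {
    let ncols = rows.iter().map(|r| r.len()).max().unwrap_or(0);
    let mut widths = vec![0usize; ncols];
    for row in rows {
        for (j, cell) in row.iter().enumerate() {
            widths[j] = widths[j].max(cell.chars().count());
        }
    }
    rows.iter()
        .map(|row| {
            let cells: Vec<String> = row
                .iter()
                .enumerate()
                .map(|(j, cell)| format!("{:>width$}", cell, width = widths[j]))
                .collect();
            format!("[{}]", cells.join(", "))
        })
        .collect()
}
```

Measuring widths in a first pass and padding in a second keeps the decimal points of equally precise numbers vertically aligned, at the cost of formatting every visible element before any line is written.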