Principle:LaurentMazare Tch rs Vendored Image IO

Knowledge Sources	LaurentMazare_Tch_rs; stb Libraries (public domain libraries by Sean Barrett)
Domains	Image Processing, Vendored Dependencies, C Libraries
Last Updated	2026-02-08 00:00 GMT

Overview

Vendored single-header C libraries for image input/output provide self-contained, dependency-free image loading, saving, and resizing capabilities that can be embedded directly into a project's source tree.

Description

Image I/O is a fundamental requirement for machine learning and computer vision systems -- training data must be loaded from image files, and results must be saved back. However, image format handling (JPEG, PNG, BMP, etc.) typically requires external system libraries (libjpeg, libpng, zlib), creating dependency management challenges:

Libraries may not be installed on the target system.
Version mismatches can cause subtle bugs or build failures.
Cross-compilation becomes harder with each external dependency.
Reproducible builds require pinning exact library versions.

Vendored single-header libraries solve this by embedding the entire image I/O implementation directly in the project. The "single-header" pattern means the library is distributed as a single C header file that contains both declarations and implementation. By defining an implementation macro before including the header in exactly one source file, the library is compiled as part of the project with zero external dependencies.

The approach typically provides three capabilities:

Image Loading

Reads image files in multiple formats (JPEG, PNG, BMP, GIF, TGA, PSD, HDR) and decodes them into a raw pixel buffer. The caller receives:

A pointer to pixel data in row-major order.
Image width and height.
Number of color channels (1 for grayscale, 2 for grayscale with alpha, 3 for RGB, 4 for RGBA).

The caller can also request a specific number of output channels, and the library will convert automatically (e.g., RGB to grayscale, or adding/removing alpha).
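The channel conversion described above can be pictured as a per-pixel mapping. The following is a minimal sketch, not the vendored library's code: `convert_channels` is a hypothetical helper, it handles only 1, 3, and 4 channels, and the integer luma weights (77, 150, 29, which sum to 256) are an assumed approximation of the usual luminance weighting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: converts interleaved 8-bit pixels from src_c to dst_c
 * channels. Only 1 (gray), 3 (RGB) and 4 (RGBA) are handled in this sketch.
 * `out` must hold pixel_count * dst_c bytes. */
void convert_channels(const unsigned char *in, unsigned char *out,
                      size_t pixel_count, int src_c, int dst_c)
{
    assert(src_c == 1 || src_c == 3 || src_c == 4);
    assert(dst_c == 1 || dst_c == 3 || dst_c == 4);

    for (size_t i = 0; i < pixel_count; i++) {
        const unsigned char *s = in + i * (size_t)src_c;
        unsigned char *d = out + i * (size_t)dst_c;

        /* Expand the source pixel to RGBA; gray is replicated, alpha defaults to opaque. */
        unsigned char r = s[0];
        unsigned char g = (src_c >= 3) ? s[1] : s[0];
        unsigned char b = (src_c >= 3) ? s[2] : s[0];
        unsigned char a = (src_c == 4) ? s[3] : 255;

        if (dst_c == 1) {
            /* Gray stays gray; color is reduced with integer luma weights. */
            d[0] = (src_c == 1) ? s[0]
                                : (unsigned char)((r * 77 + g * 150 + b * 29) >> 8);
        } else {
            d[0] = r;
            d[1] = g;
            d[2] = b;
            if (dst_c == 4)
                d[3] = a;
        }
    }
}
```

Requesting a fixed channel count at load time keeps this branching out of the calling code, which can then assume one layout for every file.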

Image Saving

Encodes a raw pixel buffer into a specific image format and writes it to a file. Supported output formats typically include PNG, BMP, TGA, and JPEG (with configurable quality for lossy formats).
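To make "encoding a raw pixel buffer" concrete, here is a minimal sketch of the simplest case: an uncompressed 24-bit BMP written into a memory buffer. The helper names (`bmp_row_stride`, `bmp_encoded_size`, `encode_bmp_rgb`) are hypothetical and are not the vendored library's API; the sketch assumes interleaved RGB input and does no file I/O.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Bytes per stored row: 3 bytes per pixel, padded to a multiple of 4. */
size_t bmp_row_stride(int w)
{
    return ((size_t)w * 3 + 3) & ~(size_t)3;
}

/* 14-byte file header + 40-byte info header + padded pixel rows. */
size_t bmp_encoded_size(int w, int h)
{
    return 54 + bmp_row_stride(w) * (size_t)h;
}

/* Hypothetical helper: `rgb` is interleaved RGB, top row first.
 * `out` must hold bmp_encoded_size(w, h) bytes. */
void encode_bmp_rgb(const unsigned char *rgb, int w, int h, unsigned char *out)
{
    assert(w > 0 && h > 0);
    size_t stride = bmp_row_stride(w);
    size_t total = bmp_encoded_size(w, h);

    for (size_t i = 0; i < total; i++)
        out[i] = 0;                       /* zero headers and row padding */

    out[0] = 'B';
    out[1] = 'M';
    put_u32(out + 2, (uint32_t)total);    /* file size */
    put_u32(out + 10, 54);                /* offset of pixel data */
    put_u32(out + 14, 40);                /* info header size */
    put_u32(out + 18, (uint32_t)w);
    put_u32(out + 22, (uint32_t)h);       /* positive height: rows stored bottom-up */
    out[26] = 1;                          /* color planes (16-bit field) */
    out[28] = 24;                         /* bits per pixel (16-bit field) */
    put_u32(out + 34, (uint32_t)(stride * (size_t)h));

    for (int y = 0; y < h; y++) {
        /* The first stored row is the bottom row of the image. */
        const unsigned char *src = rgb + (size_t)(h - 1 - y) * (size_t)w * 3;
        unsigned char *dst = out + 54 + (size_t)y * stride;
        for (int x = 0; x < w; x++) {
            dst[x * 3 + 0] = src[x * 3 + 2];  /* B */
            dst[x * 3 + 1] = src[x * 3 + 1];  /* G */
            dst[x * 3 + 2] = src[x * 3 + 0];  /* R */
        }
    }
}
```

Compressed formats such as PNG and JPEG follow the same outline (header, then encoded pixel data) but replace the row copy with DEFLATE or DCT-based coding.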

Image Resizing

Resamples an image to a different resolution using high-quality filtering. This is essential for machine learning pipelines that require fixed-size inputs (e.g., 224x224 for ImageNet models, 84x84 for Atari agents).

Usage

Vendored image I/O is appropriate when:

Minimizing external dependencies is a project goal, particularly for libraries that should be easy to build from source.
Cross-platform compatibility is needed without requiring users to install system image libraries.
Tensor-image interoperability is required -- loading images directly into tensor memory layouts for neural network input.
Build simplicity is valued -- the image library compiles as part of the project with no additional build configuration.
Reproducibility is critical -- the exact image decoding behavior is determined by the vendored source, not by whatever version of libjpeg happens to be installed.

The trade-off is that vendored libraries may not support all features of dedicated libraries (e.g., arithmetic-coded or 12-bit JPEG, ICC color profiles) and may have different performance characteristics.

Theoretical Basis

Single-Header Library Pattern

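The single-header pattern uses the C preprocessor to combine declaration and implementation in one file. The following is a simplified sketch using names from stb_image.h; the real header is far larger:

// In the header file (stb_image.h):
#ifndef STBI_INCLUDE_STB_IMAGE_H
#define STBI_INCLUDE_STB_IMAGE_H

// Declarations (always available)
unsigned char *stbi_load(const char *filename, int *x, int *y,
                         int *channels_in_file, int desired_channels);
void stbi_image_free(void *retval_from_stbi_load);

#ifdef STB_IMAGE_IMPLEMENTATION
// Implementation (compiled only where the macro is defined)
unsigned char *stbi_load(const char *filename, int *x, int *y,
                         int *channels_in_file, int desired_channels)
{
    // ... full implementation ...
}
#endif // STB_IMAGE_IMPLEMENTATION

#endif // STBI_INCLUDE_STB_IMAGE_H

Usage in a project:

// file_a.c - just uses declarations
#include "stb_image.h"

// file_b.c - compiles the implementation
#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"

This ensures the implementation is compiled exactly once (in file_b.c) while declarations are available everywhere. By default, defining the macro in more than one source file produces duplicate-symbol errors at link time, and defining it in none leaves the functions undefined.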

Image Data Layout

Loaded images follow a standard memory layout:

For an image of width $W$, height $H$, and $C$ channels, the pixel at position $(x, y)$ in channel $c$ is located at:

$offset = (y \times W + x) \times C + c$

The total buffer size is $W \times H \times C$ bytes (for 8-bit images) or $W \times H \times C \times sizeof(float)$ bytes for HDR images decoded to floats, in which case the offset above counts float elements rather than bytes.
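As a worked instance of the offset formula, the sketch below computes sample offsets and buffer sizes. The function names `pixel_offset` and `buffer_size_bytes` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Offset (in samples) of channel c of pixel (x, y) in an interleaved image
 * of width W with C channels. */
size_t pixel_offset(size_t x, size_t y, size_t c, size_t W, size_t C)
{
    assert(x < W && c < C);
    return (y * W + x) * C + c;
}

/* Total buffer size in bytes; bytes_per_sample is 1 for 8-bit data
 * and sizeof(float) for float data. */
size_t buffer_size_bytes(size_t W, size_t H, size_t C, size_t bytes_per_sample)
{
    return W * H * C * bytes_per_sample;
}
```

For a 4x2 RGB image, the green sample of the pixel at (2, 1) sits at (1 * 4 + 2) * 3 + 1 = 19, and the 8-bit buffer occupies 4 * 2 * 3 = 24 bytes.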

Channels are interleaved (RGBRGBRGB...) rather than planar (RRR...GGG...BBB...). Many machine learning frameworks, including PyTorch (and therefore tch-rs), expect a planar, channels-first layout ($C \times H \times W$), so a transpose operation is needed:

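#include <stddef.h>
// Copy interleaved (HWC) samples into a caller-allocated planar (CHW) buffer of W * H * C bytes.
void interleaved_to_planar(const unsigned char *data, unsigned char *planar,
                           size_t W, size_t H, size_t C)
{
    for (size_t c = 0; c < C; c++)
        for (size_t y = 0; y < H; y++)
            for (size_t x = 0; x < W; x++)
                planar[c * H * W + y * W + x] = data[(y * W + x) * C + c];
}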
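The reverse direction is needed when a channels-first result has to be written back to an image file. A minimal sketch of the inverse of the transpose above follows; `planar_to_interleaved` is a hypothetical helper name, and the output buffer is assumed to be caller-allocated and distinct from the input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies planar (CHW) samples into an interleaved (HWC)
 * buffer. Both buffers hold W * H * C bytes and must not alias. */
void planar_to_interleaved(const unsigned char *planar, unsigned char *data,
                           size_t W, size_t H, size_t C)
{
    assert(planar != data);  /* this sketch does not work in place */
    for (size_t y = 0; y < H; y++)
        for (size_t x = 0; x < W; x++)
            for (size_t c = 0; c < C; c++)
                data[(y * W + x) * C + c] = planar[(c * H + y) * W + x];
}
```

The loop order here walks the interleaved output sequentially; either order gives the same result, and only the memory access pattern differs.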

Image Resizing Theory

High-quality image resizing applies a reconstruction filter to resample the image at new coordinates. The process involves:

Reconstruction conceptually treats the source image as a continuous signal by interpolating between discrete pixel values.
Filtering applies a low-pass filter to prevent aliasing when downsampling.
Resampling evaluates the filtered signal at the target pixel coordinates.

Common filter kernels include:

Filter	Support (radius, in source pixels)	Quality	Speed
Box (nearest neighbor when upsampling)	0.5	Low (blocky)	Fastest
Bilinear (triangle)	1.0	Medium	Fast
Catmull-Rom (cubic)	2.0	Good	Moderate
Mitchell-Netravali	2.0	Good (less ringing)	Moderate
Lanczos (a = 3)	3.0	Excellent	Slowest
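The polynomial kernels in the table can be written down directly. The sketch below is illustrative only and is not taken from the vendored resizer; the function names are hypothetical. Catmull-Rom and Mitchell-Netravali are both members of the two-parameter (B, C) cubic family, with (0, 1/2) and (1/3, 1/3) respectively. Lanczos is omitted because it needs a sinc function.

```c
#include <assert.h>

static double abs_d(double x) { return x < 0.0 ? -x : x; }

/* Box: constant inside a radius of 0.5. */
double box_kernel(double x) { return abs_d(x) <= 0.5 ? 1.0 : 0.0; }

/* Triangle (bilinear): linear falloff to zero at radius 1. */
double triangle_kernel(double x)
{
    x = abs_d(x);
    return x < 1.0 ? 1.0 - x : 0.0;
}

/* Two-parameter cubic family with radius 2. */
double cubic_bc_kernel(double x, double B, double C)
{
    assert(B >= 0.0 && C >= 0.0);  /* the usual parameter range is [0, 1] */
    x = abs_d(x);
    if (x < 1.0)
        return ((12.0 - 9.0 * B - 6.0 * C) * x * x * x
                + (-18.0 + 12.0 * B + 6.0 * C) * x * x
                + (6.0 - 2.0 * B)) / 6.0;
    if (x < 2.0)
        return ((-B - 6.0 * C) * x * x * x
                + (6.0 * B + 30.0 * C) * x * x
                + (-12.0 * B - 48.0 * C) * x
                + (8.0 * B + 24.0 * C)) / 6.0;
    return 0.0;
}

double catmull_rom_kernel(double x) { return cubic_bc_kernel(x, 0.0, 0.5); }
double mitchell_kernel(double x) { return cubic_bc_kernel(x, 1.0 / 3.0, 1.0 / 3.0); }
```

Catmull-Rom is interpolating (it equals 1 at 0 and 0 at the other integers), while Mitchell-Netravali peaks at 8/9 and therefore blurs slightly in exchange for less ringing.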

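For a 1D resize from size $N$ to size $M$, the output pixel $j$ is computed as:

$out[j] = \sum_{i} w(i, j) \cdot in[i]$

where the weights $w(i, j)$ are determined by the filter kernel evaluated at the distance between source position $i$ and the position of output pixel $j$ mapped into source coordinates, and are normalized so that $\sum_{i} w(i, j) = 1$ for each $j$. When downsampling ($M < N$), the kernel is stretched by the factor $N / M$ so that it also acts as the low-pass filter. 2D resizing is separable: resize horizontally first, then vertically (or vice versa).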
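The weighted sum can be sketched for the triangle filter as follows. This is a deliberately simple O(N x M) illustration rather than the vendored resizer: `resize_1d` is a hypothetical name, pixel centers are assumed to sit at half-integer positions, the kernel is stretched when downsampling, and the weights are renormalized per output pixel, which also handles the image edges.

```c
#include <assert.h>
#include <stddef.h>

/* Triangle kernel with radius 1. */
static double tri(double x)
{
    if (x < 0.0)
        x = -x;
    return x < 1.0 ? 1.0 - x : 0.0;
}

/* Hypothetical helper: resamples in[0..N) to out[0..M) with a triangle filter. */
void resize_1d(const double *in, size_t N, double *out, size_t M)
{
    assert(N > 0 && M > 0);
    double scale = (double)N / (double)M;      /* source pixels per output pixel */
    double width = scale > 1.0 ? scale : 1.0;  /* stretch the kernel when downsampling */

    for (size_t j = 0; j < M; j++) {
        /* Center of output pixel j, expressed in source pixel coordinates. */
        double center = ((double)j + 0.5) * scale - 0.5;
        double sum = 0.0, wsum = 0.0;
        for (size_t i = 0; i < N; i++) {
            double w = tri(((double)i - center) / width);
            sum += w * in[i];
            wsum += w;
        }
        assert(wsum > 0.0);   /* at least one source pixel lies within half a pixel */
        out[j] = sum / wsum;  /* normalize so the weights sum to 1 */
    }
}
```

A 2D resize would apply this routine to every row and then to every column of the intermediate result. A practical implementation would visit only the source pixels inside the kernel's support instead of scanning all N.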

Format-Specific Considerations

Format	Compression	Fidelity	Alpha	Use Case
JPEG	Lossy (DCT)	Configurable (1-100)	No	Photographs
PNG	Lossless (DEFLATE)	Lossless	Yes	Screenshots, graphics
BMP	None	Lossless	Optional	Simple interchange
TGA	Optional RLE	Lossless	Yes	Legacy graphics
HDR	Radiance RGBE	Decoded to float	No	High dynamic range

Each format requires a different decoding algorithm, but the vendored library abstracts this behind a unified loading interface that auto-detects the format from file headers.
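Header-based detection can be sketched by comparing the first bytes of a file against well-known signatures. The enum and function below are hypothetical names and do not reflect the vendored library's internals. TGA is left out because it has no fixed signature and has to be recognized by checking that its header fields are plausible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { FMT_UNKNOWN, FMT_JPEG, FMT_PNG, FMT_BMP, FMT_GIF, FMT_HDR } image_format;

/* Hypothetical helper: guesses the format from the leading bytes of a file. */
image_format detect_format(const unsigned char *buf, size_t len)
{
    static const unsigned char png_sig[8] =
        {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    assert(buf != NULL || len == 0);

    if (len >= 8 && memcmp(buf, png_sig, 8) == 0)
        return FMT_PNG;
    if (len >= 3 && buf[0] == 0xFF && buf[1] == 0xD8 && buf[2] == 0xFF)
        return FMT_JPEG;  /* start-of-image marker followed by another marker */
    if (len >= 6 && (memcmp(buf, "GIF87a", 6) == 0 || memcmp(buf, "GIF89a", 6) == 0))
        return FMT_GIF;
    if (len >= 10 && memcmp(buf, "#?RADIANCE", 10) == 0)
        return FMT_HDR;
    if (len >= 2 && buf[0] == 'B' && buf[1] == 'M')
        return FMT_BMP;
    return FMT_UNKNOWN;
}
```

Checking the longer, more specific signatures first reduces the chance that a short signature such as "BM" matches unrelated data.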

Related Pages

Implementation:LaurentMazare_Tch_rs_Stb_Image
