Principle:Kornia Kornia Augmentation Pipeline
| Knowledge Sources | |
|---|---|
| Domains | Vision, Augmentation, Deep_Learning |
| Last Updated | 2026-02-09 15:00 GMT |
Overview
Technique of composing multiple augmentation transforms into a single sequential pipeline that jointly transforms images and associated annotations.
Description
An augmentation pipeline chains individual transforms into one ordered sequence. It applies them one after another and records the transformation matrix of each geometric step, so the combined spatial transform can later be inverted.
Crucially, the pipeline can jointly transform multiple data types (images, masks, bounding boxes, keypoints) using the same random parameters, maintaining spatial consistency. This ensures:
- Segmentation masks align with augmented images after geometric transforms
- Bounding boxes remain valid and correctly positioned after spatial augmentation
- Keypoints are transformed consistently with the image they annotate
Without a unified pipeline, applying transforms independently to images and annotations would produce misaligned data, corrupting the training signal.
Usage
Use when building training pipelines that require consistent augmentation across images and their annotations (masks, boxes, keypoints). Preferred over applying transforms individually, which risks spatial inconsistency between data types.
Theoretical Basis
Given transforms T_1, T_2, ..., T_n, the pipeline computes:
# Sequential transform application
output = T_n(... T_2(T_1(x)) ...)
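The order of application matters, because most transforms do not commute. The sketch below is a plain-Python illustration of the nesting above, using two hypothetical one-dimensional transforms (`flip_x` and `shift_x`) and an assumed image width; it is not how any particular library implements composition.

```python
from functools import reduce

WIDTH = 10  # hypothetical image width in pixels

def flip_x(x):
    """Horizontal flip of an x coordinate."""
    return WIDTH - 1 - x

def shift_x(x):
    """Translate an x coordinate 3 pixels to the right."""
    return x + 3

def compose(transforms):
    """Return a function that applies the transforms left to right (T_1 first)."""
    return lambda x: reduce(lambda value, t: t(value), transforms, x)

flip_then_shift = compose([flip_x, shift_x])
shift_then_flip = compose([shift_x, flip_x])

print(flip_then_shift(4))  # 8
print(shift_then_flip(4))  # 2
```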
For associated data d with data key k, the same parameters are applied:
# Consistent transform across data types
x_aug = T_i(x) # image transform
d_aug = T_i(d, key=k) # annotation transform with same params
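To make the shared-parameter idea concrete, here is a small plain-Python sketch. The helper names (`sample_params`, `apply_flip`) and the nested-list image are hypothetical and chosen only for illustration. The point is that the random decision is drawn once and then dispatched per data key, so a keypoint keeps marking the same pixel whether or not the flip fires.

```python
import random

def sample_params(rng):
    """Draw the random parameters once per pipeline call."""
    return {"flip": rng.random() < 0.5}

def apply_flip(data, key, params, width):
    """Apply the same sampled flip to whichever data type `key` names."""
    if not params["flip"]:
        return data
    if key in ("input", "mask"):
        # Images and masks: reverse every row of pixels.
        return [row[::-1] for row in data]
    if key == "keypoints":
        # Keypoints are (x, y) pairs: mirror the x coordinate.
        return [(width - 1 - x, y) for x, y in data]
    raise ValueError(f"unsupported data key: {key}")

image = [[0, 1, 2, 3],
         [4, 5, 6, 7]]
keypoints = [(3, 1)]  # marks the pixel with value 7

params = sample_params(random.Random(0))
image_aug = apply_flip(image, "input", params, width=4)
keypoints_aug = apply_flip(keypoints, "keypoints", params, width=4)

x, y = keypoints_aug[0]
print(image_aug[y][x])  # 7: the keypoint still marks the same pixel
```

Sampling separately for the image and for the keypoints would break this guarantee whenever the two draws disagree.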
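If each geometric transform T_i is represented by a matrix M_i (3x3 in homogeneous coordinates for 2D images), the composed matrix M = M_n * ... * M_2 * M_1 enables inverse mapping. Intensity-only transforms leave coordinates unchanged and contribute the identity.
# Inverse transform via composed matrix
p_original = M_inverse @ p_augmented # coordinates (keypoints, box corners)
x_original_approx = warp(x_augmented, M_inverse) # images: resampled, so only approximate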
This invertibility is essential for mapping predictions made on augmented images back to the original coordinate space (e.g., test-time augmentation with prediction averaging).
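The following plain-Python sketch works through the matrix composition and its inverse for a single keypoint. The two transforms (a horizontal flip and a translation), the image width, and the helper names are assumptions made for the example, and the inverses are written by hand rather than computed by a library.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, point):
    """Apply a 3x3 homogeneous matrix to an (x, y) point."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

WIDTH = 32  # hypothetical image width

# M_1: horizontal flip; M_2: translation by (-5, 2).
flip = [[-1, 0, WIDTH - 1], [0, 1, 0], [0, 0, 1]]
shift = [[1, 0, -5], [0, 1, 2], [0, 0, 1]]
# Hand-written inverses: a flip undoes itself, a shift is undone by the opposite shift.
flip_inv = flip
shift_inv = [[1, 0, 5], [0, 1, -2], [0, 0, 1]]

M = matmul(shift, flip)              # M = M_2 * M_1 (the flip is applied first)
M_inv = matmul(flip_inv, shift_inv)  # the inverse reverses the order

original = (4, 8)
augmented = apply(M, original)
recovered = apply(M_inv, augmented)
print(augmented)  # (22, 10)
print(recovered)  # (4, 8)
```

For point annotations the round trip is exact. For images, the inverse warp resamples pixels and cannot restore content that left the frame, which is why the recovered image is only an approximation.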