Principle:LaurentMazare Tch rs Dataset Iteration

Knowledge Sources	tch-rs
Domains	Deep_Learning, Data_Loading
Last Updated	2026-02-08 14:00 GMT

Overview

Pattern for iterating over datasets in mini-batches with optional shuffling and device transfer for training loops.

Description

Dataset iteration splits a large dataset into mini-batches for stochastic gradient descent training. Each step yields a pair of tensors (inputs, labels) holding batch_size samples along the first dimension. The iterator supports shuffling (randomizing the sample order; rebuilding the iterator every epoch gives a fresh order each epoch) and device transfer (moving batches to the GPU). By default a trailing batch smaller than batch_size is not yielded; call .return_smaller_last_batch() to include it. This is the usual data-feeding mechanism for training loops in tch-rs.
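The effect of the final partial batch on the number of steps per epoch can be checked with a little arithmetic. The sketch below is plain Rust and does not call tch-rs; the function name num_batches and the keep_partial flag are hypothetical, with keep_partial standing in for whether the smaller last batch is yielded.

```rust
/// Number of mini-batches produced from `total` samples.
/// `keep_partial` is a hypothetical flag: true means a trailing batch
/// smaller than `batch_size` is counted, false means it is skipped.
fn num_batches(total: usize, batch_size: usize, keep_partial: bool) -> usize {
    assert!(batch_size > 0, "batch_size must be positive");
    let full = total / batch_size; // batches with exactly batch_size samples
    let has_partial = total % batch_size != 0; // leftover samples at the end
    if keep_partial && has_partial {
        full + 1
    } else {
        full
    }
}

fn main() {
    // 1000 samples with batch size 64: 15 full batches, 40 samples left over.
    println!("without partial batch: {}", num_batches(1000, 64, false));
    println!("with partial batch:    {}", num_batches(1000, 64, true));
}
```

When the dataset size is an exact multiple of the batch size, both settings give the same count.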

Usage

Use it in a training loop to iterate over the training data in mini-batches. Call .shuffle() for randomized sample ordering and .to_device(device) to move batches to the GPU.

Theoretical Basis

Mini-batch Training:
  For each epoch:
    For each batch (xs, ys) in dataset.train_iter(batch_size).shuffle():
      logits = model.forward(xs)
      loss = loss_fn(logits, ys)
      optimizer.backward_step(loss)
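
The loop structure above can be sketched end to end without any tensor library. The example below is a minimal stand-in, not tch-rs code: it fits a one-variable linear model with hand-written gradients, and slice chunks play the role of the batch iterator. The toy data, the function names sgd_step and train, and the hyperparameters are all illustrative assumptions. Shuffling is left out because the Rust standard library has no random number generator.

```rust
/// One gradient step on mean squared error for y ≈ w * x + b,
/// computed over a single mini-batch.
fn sgd_step(w: &mut f64, b: &mut f64, xs: &[f64], ys: &[f64], lr: f64) {
    let n = xs.len() as f64;
    let (mut grad_w, mut grad_b) = (0.0, 0.0);
    for (x, y) in xs.iter().zip(ys) {
        let err = *w * x + *b - y; // prediction minus target
        grad_w += 2.0 * err * x / n;
        grad_b += 2.0 * err / n;
    }
    *w -= lr * grad_w;
    *b -= lr * grad_b;
}

/// Runs `epochs` passes over a toy dataset in mini-batches of `batch_size`.
fn train(epochs: usize, batch_size: usize, lr: f64) -> (f64, f64) {
    // Toy data generated from y = 2x + 1 (illustrative only).
    let xs: Vec<f64> = (0..8).map(|i| i as f64).collect();
    let ys: Vec<f64> = xs.iter().map(|x| 2.0 * x + 1.0).collect();
    let (mut w, mut b) = (0.0, 0.0);
    for _epoch in 0..epochs {
        // Each (bx, by) pair corresponds to one (xs, ys) batch in the pseudocode.
        for (bx, by) in xs.chunks(batch_size).zip(ys.chunks(batch_size)) {
            sgd_step(&mut w, &mut b, bx, by, lr);
        }
    }
    (w, b)
}

fn main() {
    let (w, b) = train(5000, 4, 0.01);
    println!("w = {w:.3}, b = {b:.3}");
}
```

In a tch-rs training loop the forward pass, loss, and parameter update would come from the model, a loss function, and the optimizer instead of the hand-written step.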

Iter2 API chain:
  dataset.train_iter(64)         → Iterator over batches of 64, in stored order
    .shuffle()                   → Random ordering
    .to_device(Device::Cuda(0))  → GPU transfer
    .return_smaller_last_batch() → Include partial final batch
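
The chaining style above, where each configuration call returns the iterator so that it can be used directly in a for loop, can be illustrated with a small plain-Rust sketch. BatchIter below is a hypothetical stand-in and not the tch-rs type: it yields index batches rather than tensor pairs, its shuffle takes an explicit seed and uses a simple linear congruential generator, and device transfer is not modeled.

```rust
/// Hypothetical stand-in for a batch iterator; yields batches of sample indices.
struct BatchIter {
    indices: Vec<usize>,
    batch_size: usize,
    pos: usize,
    keep_partial: bool,
}

impl BatchIter {
    fn new(total: usize, batch_size: usize) -> Self {
        assert!(batch_size > 0, "batch_size must be positive");
        BatchIter { indices: (0..total).collect(), batch_size, pos: 0, keep_partial: false }
    }

    /// Fisher-Yates shuffle driven by a small LCG; `seed` is an illustrative parameter.
    fn shuffle(&mut self, seed: u64) -> &mut Self {
        let mut state = seed;
        for i in (1..self.indices.len()).rev() {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            let j = ((state >> 33) as usize) % (i + 1);
            self.indices.swap(i, j);
        }
        self
    }

    /// Opt in to yielding a final batch that has fewer than `batch_size` items.
    fn return_smaller_last_batch(&mut self) -> &mut Self {
        self.keep_partial = true;
        self
    }
}

impl Iterator for BatchIter {
    type Item = Vec<usize>;

    fn next(&mut self) -> Option<Vec<usize>> {
        let remaining = self.indices.len() - self.pos;
        let size = remaining.min(self.batch_size);
        // Stop when nothing is left, or when only a partial batch is left
        // and the caller has not asked for it.
        if size == 0 || (size < self.batch_size && !self.keep_partial) {
            return None;
        }
        let batch = self.indices[self.pos..self.pos + size].to_vec();
        self.pos += size;
        Some(batch)
    }
}

fn main() {
    let mut it = BatchIter::new(10, 4);
    // The chain returns `&mut BatchIter`, which is itself an iterator.
    for batch in it.shuffle(42).return_smaller_last_batch() {
        println!("{:?}", batch);
    }
}
```

Returning &mut Self from each configuration method is what lets the whole chain sit in the head of a for loop.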

Related Pages

Implemented By

Implementation:LaurentMazare_Tch_rs_Dataset_Train_Iter
