Implementation:Tensorflow Serving MNIST Input Data
| Knowledge Sources | |
|---|---|
| Domains | Data Loading, Example |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
A Python utility module for downloading, extracting, and managing MNIST handwritten digit dataset files, providing a DataSet class with batched iteration for training, validation, and test splits.
Description
This module provides end-to-end MNIST data loading.

maybe_download() downloads a gzipped IDX file from a Google Cloud Storage mirror unless it already exists locally. extract_images() reads the IDX3 image file format, validates the magic number (2051), and returns the images as a 4D numpy array [index, y, x, depth]. extract_labels() reads the IDX1 label file format (magic number 2049) into a 1D array, with optional one-hot encoding via dense_to_one_hot().

The DataSet class wraps images and labels and preprocesses the images: it reshapes them from [N, rows, cols, 1] to [N, rows*cols] and rescales pixel values from [0, 255] to [0.0, 1.0]. Its next_batch() method provides mini-batch iteration with automatic epoch tracking and shuffling. A fake_data mode generates synthetic data for testing.

read_data_sets() orchestrates the full pipeline: it downloads all four files, extracts them, and splits the training data into 55000 training and 5000 validation examples. Module-level constants define the source URL, file names, and validation size.
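The magic-number check described for extract_images() can be illustrated with a small, standard-library-only sketch. It is not the module's code: the helper names make_idx3_gz and parse_idx3_gz are hypothetical, and the sketch returns raw pixel bytes rather than a numpy array. It relies on the IDX3 header being four big-endian unsigned 32-bit integers (magic number, image count, rows, columns).

```python
import gzip
import io
import struct

IDX3_MAGIC = 2051  # magic number for IDX3 image files


def make_idx3_gz(images):
    """Hypothetical helper: pack equally sized 2D lists of 0-255 ints as gzipped IDX3 bytes."""
    rows, cols = len(images[0]), len(images[0][0])
    # Header: magic, image count, rows, cols as big-endian unsigned 32-bit ints.
    payload = struct.pack(">IIII", IDX3_MAGIC, len(images), rows, cols)
    for image in images:
        for row in image:
            payload += bytes(row)
    return gzip.compress(payload)


def parse_idx3_gz(blob):
    """Hypothetical helper: return (count, rows, cols, pixels) from gzipped IDX3 bytes."""
    with gzip.GzipFile(fileobj=io.BytesIO(blob)) as stream:
        magic, count, rows, cols = struct.unpack(">IIII", stream.read(16))
        if magic != IDX3_MAGIC:
            raise ValueError("Invalid magic number %d in IDX3 data" % magic)
        pixels = stream.read(count * rows * cols)
    if len(pixels) != count * rows * cols:
        raise ValueError("IDX3 data is shorter than its header claims")
    return count, rows, cols, pixels


# Two tiny 2x2 "images" stand in for the real 28x28 digits.
blob = make_idx3_gz([[[0, 255], [128, 64]], [[1, 2], [3, 4]]])
count, rows, cols, pixels = parse_idx3_gz(blob)
print(count, rows, cols, list(pixels))
```

Feeding the pixel bytes into an array of shape [count, rows, cols, 1] would give the 4D layout that extract_images() is described as returning.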
Usage
Use this module in TensorFlow Serving example scripts to load the MNIST dataset for training or testing models that will be served via TensorFlow Serving.
Code Reference
Source Location
- Repository: Tensorflow_Serving
- File: tensorflow_serving/example/mnist_input_data.py
- Lines: 1-206
Signature
def maybe_download(filename, work_directory): ...
def extract_images(filename): ...
def dense_to_one_hot(labels_dense, num_classes=10): ...
def extract_labels(filename, one_hot=False): ...
class DataSet(object):
    def __init__(self, images, labels, fake_data=False, one_hot=False): ...
    def next_batch(self, batch_size, fake_data=False): ...
    @property
    def images(self): ...
    @property
    def labels(self): ...
    @property
    def num_examples(self): ...
def read_data_sets(train_dir, fake_data=False, one_hot=False): ...
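The epoch tracking and shuffling behind next_batch() can be sketched as follows. This is a simplified illustration rather than the module's implementation: the class name TinyDataSet and its seed parameter are hypothetical, and the sketch assumes that the data is reshuffled whenever a requested batch would run past the end of the current epoch.

```python
import numpy as np


class TinyDataSet:
    """Hypothetical, simplified stand-in for the batching part of DataSet."""

    def __init__(self, images, labels, seed=0):
        assert images.shape[0] == labels.shape[0]
        self._images = images
        self._labels = labels
        self._num_examples = images.shape[0]
        self._index_in_epoch = 0
        self.epochs_completed = 0
        self._rng = np.random.default_rng(seed)

    def next_batch(self, batch_size):
        assert batch_size <= self._num_examples
        start = self._index_in_epoch
        self._index_in_epoch += batch_size
        if self._index_in_epoch > self._num_examples:
            # The batch would overrun the data: count the epoch, reshuffle
            # images and labels with the same permutation, and start over.
            self.epochs_completed += 1
            perm = self._rng.permutation(self._num_examples)
            self._images = self._images[perm]
            self._labels = self._labels[perm]
            start = 0
            self._index_in_epoch = batch_size
        end = self._index_in_epoch
        return self._images[start:end], self._labels[start:end]


# Ten one-pixel "images" whose pixel value equals their label.
images = np.arange(10, dtype=np.float32).reshape(10, 1)
labels = np.arange(10)
data = TinyDataSet(images, labels)
for _ in range(4):
    batch_xs, batch_ys = data.next_batch(4)
print(data.epochs_completed)
```

In this sketch, examples left over at the end of an epoch that do not fill a whole batch are skipped before the reshuffle.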
Import
from tensorflow_serving.example import mnist_input_data
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| train_dir | str | Yes (read_data_sets) | Directory for downloading and caching MNIST files |
| fake_data | bool | No | If True, generates synthetic data (default False) |
| one_hot | bool | No | If True, labels are one-hot encoded (default False) |
| batch_size | int | Yes (next_batch) | Number of examples per batch |
Outputs
| Name | Type | Description |
|---|---|---|
| read_data_sets() | DataSets | Object with .train, .validation, .test DataSet attributes |
| next_batch() | tuple | (images_batch, labels_batch) as numpy arrays |
| extract_images() | numpy.ndarray | 4D uint8 array [index, y, x, depth] |
| extract_labels() | numpy.ndarray | 1D uint8 array [index] or 2D one-hot array |
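To make the shapes and value ranges in these tables concrete, the sketch below shows one way to one-hot encode a 1D label array and to flatten and rescale a 4D image array with numpy. The function names one_hot and flatten_and_scale are hypothetical, and the module's own code may differ in detail, for example in the dtype of the one-hot array.

```python
import numpy as np


def one_hot(labels_dense, num_classes=10):
    """Turn a 1D array of class ids into a 2D [index, class] array of 0s and 1s."""
    labels_dense = np.asarray(labels_dense)
    encoded = np.zeros((labels_dense.shape[0], num_classes))
    # Set one entry per row: row i gets a 1 in column labels_dense[i].
    encoded[np.arange(labels_dense.shape[0]), labels_dense] = 1
    return encoded


def flatten_and_scale(images):
    """Reshape [N, rows, cols, 1] uint8 images to [N, rows*cols] float32 values."""
    assert images.shape[3] == 1
    flat = images.reshape(images.shape[0], images.shape[1] * images.shape[2])
    # Rescale pixel values from the 0-255 range to the 0.0-1.0 range.
    return flat.astype(np.float32) * (1.0 / 255.0)


labels = np.array([3, 0, 9], dtype=np.uint8)
images = np.full((3, 28, 28, 1), 255, dtype=np.uint8)

print(one_hot(labels).shape)            # (3, 10)
print(flatten_and_scale(images).shape)  # (3, 784)
```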
Usage Examples
Loading MNIST Data
from tensorflow_serving.example import mnist_input_data
mnist = mnist_input_data.read_data_sets('/tmp/mnist_data', one_hot=True)
# Training loop
for step in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # Train the model with batch_xs and batch_ys
# Evaluate
test_images = mnist.test.images
test_labels = mnist.test.labels