Implementation:Pyro ppl Pyro SV DKL
| Property | Value |
|---|---|
| Implementation Type | Pattern Doc |
| Source File | examples/contrib/gp/sv-dkl.py |
| Module | pyro.contrib.gp |
| Pyro Features | pyro.contrib.gp.kernels.Warping, pyro.contrib.gp.models.VariationalSparseGP, pyro.contrib.gp.kernels.RBF, pyro.contrib.gp.likelihoods.Binary, pyro.contrib.gp.likelihoods.MultiClass, TraceMeanField_ELBO |
| Paper | Wilson et al. (2016), "Stochastic Variational Deep Kernel Learning" |
| Dataset | MNIST |
Overview
This file demonstrates Stochastic Variational Deep Kernel Learning, which combines a Convolutional Neural Network (CNN) with a Gaussian Process (GP) for image classification on MNIST. The key idea is to create a "deep kernel" by warping an RBF kernel with a CNN feature extractor.
The architecture:
- CNN: Two convolutional layers followed by two fully-connected layers, mapping 28x28 images to 10-dimensional feature vectors.
- Deep Kernel: `gp.kernels.Warping(rbf, iwarping_fn=cnn)` composes the RBF kernel with the CNN, so the kernel operates in the CNN's learned feature space.
- VariationalSparseGP: Enables mini-batch training with inducing points that lie in the original image space (not the feature space).
- Likelihoods: Binary likelihood for binary classification, MultiClass (Categorical) for 10-class MNIST.
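The composition described in the list above can be sketched in plain PyTorch, without Pyro. The pooling and ReLU steps in `forward` are an assumption (this page only lists the layer definitions, which fix the 320-unit input of the first linear layer), and `rbf` and `deep_kernel` are hypothetical helpers standing in for gp.kernels.RBF and gp.kernels.Warping, not Pyro's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    """Feature extractor with the layer sizes listed on this page."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        # ASSUMPTION: ReLU and 2x2 max-pooling after each convolution.
        # 28 -> conv5 -> 24 -> pool -> 12 -> conv5 -> 8 -> pool -> 4,
        # and 20 channels * 4 * 4 = 320, matching fc1.
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2(x), 2))
        x = x.view(-1, 320)
        return self.fc2(F.relu(self.fc1(x)))


def rbf(a, b, lengthscale):
    """Unit-variance RBF: exp(-0.5 * ||(a - b) / lengthscale||^2)."""
    d2 = torch.cdist(a / lengthscale, b / lengthscale).pow(2)
    return torch.exp(-0.5 * d2)


def deep_kernel(cnn, x, z, lengthscale):
    """k_deep(x, z) = k_rbf(cnn(x), cnn(z)); x and z are images, not features."""
    return rbf(cnn(x), cnn(z), lengthscale)


torch.manual_seed(0)
cnn = CNN()
images = torch.randn(6, 1, 28, 28)    # stand-in for a mini-batch of MNIST images
inducing = torch.randn(4, 1, 28, 28)  # inducing points live in image space

features = cnn(images)
K = deep_kernel(cnn, images, inducing, torch.ones(10))
print(features.shape)  # torch.Size([6, 10])
print(K.shape)         # torch.Size([6, 4])
```

Because the kernel takes images as input, gradients of any loss built on `K` flow into the CNN weights as well as the kernel hyperparameters, which is what makes the kernel "deep".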
The model achieves ~98.45% accuracy on 10-class MNIST and ~99.41% on binary MNIST after 16 epochs.
Code Reference
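import torch
import torch.nn as nn
import pyro.contrib.gp as gp
import pyro.infer as infer

# CNN feature extractor (layer definitions only; the forward pass is not shown in this excerpt)
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

# Create deep kernel: an RBF kernel over the 10-dimensional output of cnn (a CNN instance)
rbf = gp.kernels.RBF(input_dim=10, lengthscale=torch.ones(10))
deep_kernel = gp.kernels.Warping(rbf, iwarping_fn=cnn)

# Create sparse variational GP. Xu holds the inducing points (images);
# likelihood and latent_shape depend on the binary vs. 10-class setting.
gpmodule = gp.models.VariationalSparseGP(
    X=Xu, y=None, kernel=deep_kernel, Xu=Xu,
    likelihood=likelihood, latent_shape=latent_shape,
    num_data=60000, whiten=True, jitter=2e-6)

# Optimizer and loss used by the training loop
optimizer = torch.optim.Adam(gpmodule.parameters(), lr=args.lr)
elbo = infer.TraceMeanField_ELBO()
loss_fn = elbo.differentiable_loss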
I/O Contract
| Parameter | Type | Description |
|---|---|---|
| `--num-inducing` | int | Number of inducing points (default: 70) |
| `--binary` | flag | Binary classification (odd/even digits) |
| `--batch-size` | int | Training batch size (default: 64) |
| `--epochs` | int | Number of training epochs (default: 10) |
| `--lr` | float | Learning rate (default: 0.01) |
| `--jit` | flag | Use JIT-compiled ELBO |
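The options in the table map naturally onto an argparse parser. The sketch below is an illustrative mirror of the table only: the `build_parser` name and the help strings are hypothetical, and any flag not listed in the table is left out.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the options listed in the table."""
    parser = argparse.ArgumentParser(description="SV-DKL options (illustrative)")
    parser.add_argument("--num-inducing", type=int, default=70,
                        help="number of inducing points")
    parser.add_argument("--binary", action="store_true",
                        help="binary classification (odd/even digits)")
    parser.add_argument("--batch-size", type=int, default=64,
                        help="training batch size")
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of training epochs")
    parser.add_argument("--lr", type=float, default=0.01,
                        help="learning rate")
    parser.add_argument("--jit", action="store_true",
                        help="use JIT-compiled ELBO")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--binary", "--epochs", "16"])
print(args.binary, args.epochs, args.num_inducing)  # True 16 70
```

argparse converts dashes in option names to underscores, so `--num-inducing` is read back as `args.num_inducing`.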
Output:
- Test accuracy per epoch
- Training time per epoch
Usage Examples
# 10-class MNIST classification
# python sv-dkl.py --epochs 16 --lr 0.01
# Binary classification (odd vs even)
# python sv-dkl.py --binary --epochs 10
# With GPU and JIT
# python sv-dkl.py --cuda --jit
Related Pages
- Pyro_ppl_Pyro_GP_BayesOpt - GP-based Bayesian optimization
- Pyro_ppl_Pyro_GP_TimeSeries - GP-based time series models