Implementation:Scikit learn Scikit learn GaussianProcessRegressor
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Gaussian Processes, Regression |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete implementation of Gaussian Process Regression (GPR) provided by scikit-learn.
Description
The GaussianProcessRegressor class implements Gaussian process regression based on Algorithm 2.1 of Rasmussen and Williams, Gaussian Processes for Machine Learning (2006). It returns predictive means together with optional uncertainty estimates (standard deviations or a full covariance matrix), draws function samples at given inputs via sample_y, and exposes log_marginal_likelihood so that hyperparameters can be evaluated or selected externally. It can predict without prior fitting (in which case it uses the GP prior) and supports multi-output regression.
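The prior-versus-posterior behaviour described above can be illustrated with a minimal sketch. The grid, the three training points, and the choice of `optimizer=None` (which keeps the kernel fixed so the run is deterministic) are assumptions made for illustration, not part of the original page.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Evaluation grid (illustrative values)
X = np.linspace(0, 5, 20).reshape(-1, 1)

# optimizer=None is an illustrative choice: the kernel hyperparameters stay fixed
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), optimizer=None)

# Before fit(): predict() falls back to the GP prior
# (zero mean, standard deviation taken from the kernel diagonal)
prior_mean, prior_std = gpr.predict(X, return_std=True)

# After fit(): predictions and samples come from the posterior
X_train = np.array([[1.0], [2.5], [4.0]])
y_train = np.sin(X_train).ravel()
gpr.fit(X_train, y_train)

# One column per sampled function, one row per query point
samples = gpr.sample_y(X, n_samples=3, random_state=0)
print(prior_mean.shape, prior_std.shape, samples.shape)
```

Calling sample_y on the unfitted estimator would draw from the prior instead, which can be useful for checking whether a kernel encodes plausible functions before any data is seen.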
Usage
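Use Gaussian Process Regression when you need predictions with uncertainty estimates, particularly for small to medium datasets: exact GP fitting factorizes an n_samples by n_samples kernel matrix, so training cost grows cubically with the number of training points. GPR is well-suited for Bayesian optimization, surrogate modeling, and other scenarios where quantifying prediction uncertainty is important.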
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/gaussian_process/_gpr.py
Signature
class GaussianProcessRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
    def __init__(
        self,
        kernel=None,
        *,
        alpha=1e-10,
        optimizer="fmin_l_bfgs_b",
        n_restarts_optimizer=0,
        normalize_y=False,
        copy_X_train=True,
        n_targets=None,
        random_state=None,
    ):
        ...

    def fit(self, X, y):
        ...

    def predict(self, X, return_std=False, return_cov=False):
        ...

    def sample_y(self, X, n_samples=1, random_state=0):
        ...

    def log_marginal_likelihood(self, theta=None, eval_gradient=False, clone_kernel=True):
        ...
Import
from sklearn.gaussian_process import GaussianProcessRegressor
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X | array-like of shape (n_samples, n_features) | Yes | Training input samples |
| y | array-like of shape (n_samples,) or (n_samples, n_targets) | Yes | Target values |
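| kernel | Kernel instance | No | Covariance function. If None, ConstantKernel(1.0, constant_value_bounds="fixed") * RBF(1.0, length_scale_bounds="fixed") is used, so no hyperparameters are optimized |
| alpha | float or ndarray of shape (n_samples,) | No | Value added to the diagonal of the kernel matrix during fitting (default: 1e-10). Acts as a noise variance on the training targets and improves numerical stability |
| normalize_y | bool | No | Whether to normalize target values to zero mean and unit variance before fitting (default: False). Predictions are mapped back to the original scale |
| return_std | bool | No | predict option: also return the standard deviation of the predictive distribution (default: False) |
| return_cov | bool | No | predict option: also return the joint predictive covariance matrix (default: False). Cannot be combined with return_std=True |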
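To make the role of alpha concrete, the following sketch fits the same fixed kernel twice on noisy data and compares how closely each model reproduces its training targets. The data, the noise level, the two alpha values, and `optimizer=None` are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Noisy observations of sin(x) on an integer grid (illustrative data)
rng = np.random.RandomState(0)
X = np.arange(8.0).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=8)

kernel = RBF(length_scale=1.0)

# optimizer=None keeps the kernel fixed, so the two models differ only in alpha
near_interp = GaussianProcessRegressor(kernel=kernel, alpha=1e-10, optimizer=None).fit(X, y)
smoothed = GaussianProcessRegressor(kernel=kernel, alpha=0.1, optimizer=None).fit(X, y)

# Largest absolute deviation from the training targets
resid_interp = np.abs(near_interp.predict(X) - y).max()
resid_smooth = np.abs(smoothed.predict(X) - y).max()
print(f"alpha=1e-10: {resid_interp:.2e}, alpha=0.1: {resid_smooth:.2e}")
```

With the tiny default alpha the model nearly interpolates the noisy targets, while the larger value lets it treat part of each target as noise and smooth over it.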
Outputs
| Name | Type | Description |
|---|---|---|
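| y_mean | ndarray of shape (n_samples,) or (n_samples, n_targets) | Mean of the predictive distribution at the query points |
| y_std | ndarray of shape (n_samples,) or (n_samples, n_targets) | Standard deviation of the predictive distribution (only if return_std=True) |
| y_cov | ndarray of shape (n_samples, n_samples) or (n_samples, n_samples, n_targets) | Covariance of the joint predictive distribution (only if return_cov=True) |
| kernel_ | Kernel | Fitted attribute (not returned by predict): kernel with optimized hyperparameters |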
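The output shapes can be checked with a short multi-output sketch. The two-target data set below is an illustrative assumption, and the multi-output shapes of y_std and y_cov reflect recent scikit-learn releases; older versions may differ.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Two targets (sin and cos) observed at the same inputs (illustrative data)
X_train = np.array([[1.0], [3.0], [5.0], [6.0], [7.0]])
Y_train = np.column_stack([np.sin(X_train).ravel(), np.cos(X_train).ravel()])
X_test = np.linspace(0, 8, 10).reshape(-1, 1)

gpr = GaussianProcessRegressor(kernel=RBF(1.0), random_state=0).fit(X_train, Y_train)

# Request the standard deviation and the covariance in separate calls
y_mean, y_std = gpr.predict(X_test, return_std=True)
_, y_cov = gpr.predict(X_test, return_cov=True)
print(y_mean.shape, y_std.shape, y_cov.shape)

# Asking for both at once is rejected by predict()
both_rejected = False
try:
    gpr.predict(X_test, return_std=True, return_cov=True)
except RuntimeError:
    both_rejected = True

# kernel_ is read from the fitted estimator, not returned by predict()
print(gpr.kernel_)
```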
Usage Examples
Basic Usage
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

# Five 1-D training points sampled from sin(x)
X_train = np.array([[1], [3], [5], [6], [7]])
y_train = np.sin(X_train).ravel()

# Constant (amplitude) kernel times RBF; hyperparameters are tuned during fit
kernel = C(1.0) * RBF(1.0)
gpr = GaussianProcessRegressor(kernel=kernel, random_state=42)
gpr.fit(X_train, y_train)

# Predict the mean and standard deviation on a dense grid
X_test = np.linspace(0, 8, 50).reshape(-1, 1)
y_pred, y_std = gpr.predict(X_test, return_std=True)
print(f"Predictions shape: {y_pred.shape}, Std shape: {y_std.shape}")
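As a possible follow-up to the basic example, log_marginal_likelihood can be used to compare hyperparameter settings on the training data. The sketch below repeats the same data and kernel and compares the fitted hyperparameters with the initial ones; comparing against the initial values is an illustrative choice, not something the original page prescribes.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

X_train = np.array([[1], [3], [5], [6], [7]])
y_train = np.sin(X_train).ravel()

gpr = GaussianProcessRegressor(kernel=C(1.0) * RBF(1.0), random_state=42)
gpr.fit(X_train, y_train)

# With theta=None the value stored for the fitted kernel_ is returned
lml_fitted = gpr.log_marginal_likelihood()

# theta holds the log-transformed hyperparameters; gpr.kernel is the
# unfitted kernel passed to the constructor
theta_init = gpr.kernel.theta
lml_init = gpr.log_marginal_likelihood(theta_init)

print(f"Initial kernel: {gpr.kernel}, LML: {lml_init:.3f}")
print(f"Fitted kernel:  {gpr.kernel_}, LML: {lml_fitted:.3f}")
```

Because fit maximizes the log marginal likelihood starting from the initial hyperparameters, the fitted value should not be lower than the initial one.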