Principle:Avhz RustQuant Machine Learning
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Statistics |
| Last Updated | 2026-02-07 21:00 GMT |
Overview
Machine learning algorithms for regression, classification, and neural network components, built on the nalgebra linear algebra library.
Description
Machine Learning in RustQuant provides a suite of supervised learning algorithms and supporting activation functions, all implemented using nalgebra's DMatrix and DVector types for efficient linear algebra.
Linear Regression fits an ordinary least squares model using one of three selectable solution methods: naive matrix inversion (None), QR decomposition (generally the fastest), or singular value decomposition (SVD, generally the most numerically stable). The model automatically prepends a column of ones to the design matrix to fit an intercept term. The fitted output exposes the coefficients and a predict() method for out-of-sample evaluation.
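The intercept handling and the naive solution path can be illustrated with a minimal, std-only Rust sketch. It does not use RustQuant or nalgebra: the names `solve`, `fit_ols`, and `predict` are hypothetical, and the normal equations are solved by Gaussian elimination with partial pivoting rather than by QR or SVD.

```rust
/// Solve the square system `a * x = b` by Gaussian elimination with
/// partial pivoting. Returns `None` if the matrix is numerically singular.
fn solve(mut a: Vec<Vec<f64>>, mut b: Vec<f64>) -> Option<Vec<f64>> {
    let n = b.len();
    for col in 0..n {
        // Pick the row with the largest pivot to limit round-off error.
        let mut pivot = col;
        for row in (col + 1)..n {
            if a[row][col].abs() > a[pivot][col].abs() {
                pivot = row;
            }
        }
        if a[pivot][col].abs() < 1e-12 {
            return None;
        }
        a.swap(col, pivot);
        b.swap(col, pivot);
        // Eliminate the entries below the pivot.
        for row in (col + 1)..n {
            let factor = a[row][col] / a[col][col];
            for k in col..n {
                a[row][k] -= factor * a[col][k];
            }
            b[row] -= factor * b[col];
        }
    }
    // Back substitution on the upper-triangular system.
    let mut x = vec![0.0; n];
    for row in (0..n).rev() {
        let mut sum = b[row];
        for k in (row + 1)..n {
            sum -= a[row][k] * x[k];
        }
        x[row] = sum / a[row][row];
    }
    Some(x)
}

/// Fit OLS coefficients `[intercept, slope_1, ..., slope_p]` from feature
/// rows `x` and targets `y` (hypothetical helper, not the RustQuant API).
fn fit_ols(x: &[Vec<f64>], y: &[f64]) -> Option<Vec<f64>> {
    let p = x[0].len() + 1; // +1 for the intercept column
    let mut xtx = vec![vec![0.0; p]; p];
    let mut xty = vec![0.0; p];
    for (row, &target) in x.iter().zip(y) {
        // Design-matrix row with a leading 1.0 for the intercept.
        let mut d = Vec::with_capacity(p);
        d.push(1.0);
        d.extend_from_slice(row);
        for i in 0..p {
            xty[i] += d[i] * target;
            for j in 0..p {
                xtx[i][j] += d[i] * d[j];
            }
        }
    }
    solve(xtx, xty)
}

/// Evaluate the fitted model on one feature row.
fn predict(beta: &[f64], row: &[f64]) -> f64 {
    beta[0] + row.iter().zip(&beta[1..]).map(|(x, b)| x * b).sum::<f64>()
}
```

Forming X^T X squares the condition number of the problem, which is one reason orthogonal factorizations such as QR or SVD are usually preferred for ill-conditioned design matrices.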
Ridge Regression extends linear regression with L2 regularization, adding a penalty on the squared magnitude of the coefficients to the least squares loss. The regularization matrix is an identity matrix scaled by the lambda parameter, with the intercept excluded from penalization when fit_intercept is true.
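For a single feature, the effect of the penalty and of leaving the intercept unpenalized can be seen in closed form. The sketch below is an illustration under stated assumptions, not the RustQuant implementation: `ridge_fit_1d` is a hypothetical name, and the loss is assumed to be the unscaled residual sum of squares plus lambda times the squared slope.

```rust
/// One-feature ridge fit with an unpenalized intercept.
/// Minimizes sum((y - b0 - b1 * x)^2) + lambda * b1^2 and
/// returns `(intercept, slope)`.
fn ridge_fit_1d(x: &[f64], y: &[f64], lambda: f64) -> (f64, f64) {
    let n = x.len() as f64;
    let x_mean = x.iter().sum::<f64>() / n;
    let y_mean = y.iter().sum::<f64>() / n;

    // Centered sums of squares and cross-products.
    let mut sxx = 0.0_f64;
    let mut sxy = 0.0_f64;
    for (&xi, &yi) in x.iter().zip(y) {
        sxx += (xi - x_mean) * (xi - x_mean);
        sxy += (xi - x_mean) * (yi - y_mean);
    }

    // The penalty enters only the slope's denominator, shrinking it toward zero.
    let slope = sxy / (sxx + lambda);
    // The intercept is recovered from the means and is not shrunk directly.
    let intercept = y_mean - slope * x_mean;
    (intercept, slope)
}
```

With lambda set to 0.0 this reduces to ordinary least squares; increasing lambda shrinks the slope while the fitted line still passes through the point of means.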
Lasso Regression implements L1-regularized regression using coordinate descent. The algorithm iteratively applies the soft-thresholding operator to each coefficient, driving small coefficients to exactly zero for feature selection. Convergence is controlled by max_iter and tolerance parameters.
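A minimal std-only Rust sketch of cyclic coordinate descent with soft-thresholding follows. It makes several assumptions that may differ from RustQuant: the function names are hypothetical, no intercept is fitted (the data are assumed centered), and the objective is scaled as (1/(2n)) * RSS + lambda * L1 norm, which is one common convention.

```rust
/// Soft-thresholding operator: shrink `z` toward zero by `gamma`,
/// returning exactly 0.0 when |z| <= gamma.
fn soft_threshold(z: f64, gamma: f64) -> f64 {
    if z > gamma {
        z - gamma
    } else if z < -gamma {
        z + gamma
    } else {
        0.0
    }
}

/// Cyclic coordinate descent for the lasso without an intercept.
/// Assumed objective: (1 / (2n)) * ||y - X b||^2 + lambda * ||b||_1.
fn lasso_cd(
    x: &[Vec<f64>],
    y: &[f64],
    lambda: f64,
    max_iter: usize,
    tolerance: f64,
) -> Vec<f64> {
    let n = y.len();
    let p = x[0].len();
    let mut beta = vec![0.0; p];

    for _ in 0..max_iter {
        let mut max_change: f64 = 0.0;
        for j in 0..p {
            // rho: correlation of feature j with the partial residual
            // (the residual computed without feature j's contribution).
            let mut rho = 0.0_f64;
            let mut norm = 0.0_f64;
            for i in 0..n {
                let fitted_without_j: f64 = (0..p)
                    .filter(|&k| k != j)
                    .map(|k| x[i][k] * beta[k])
                    .sum();
                rho += x[i][j] * (y[i] - fitted_without_j);
                norm += x[i][j] * x[i][j];
            }
            rho /= n as f64;
            norm /= n as f64;

            // Closed-form minimizer of the one-dimensional subproblem.
            let updated = if norm > 0.0 {
                soft_threshold(rho, lambda) / norm
            } else {
                0.0
            };
            max_change = max_change.max((updated - beta[j]).abs());
            beta[j] = updated;
        }
        // Stop once a full sweep no longer moves any coefficient noticeably.
        if max_change < tolerance {
            break;
        }
    }
    beta
}
```

Recomputing the partial residual from scratch for each coordinate keeps the sketch short; practical implementations usually maintain a running residual vector and update it incrementally.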
Logistic Regression performs binary classification using the Iteratively Reweighted Least Squares (IRLS) algorithm. At each iteration it computes predicted probabilities via the logistic function, forms a diagonal weight matrix, and solves the weighted normal equations using LU decomposition. The output supports predict() for class labels, predict_proba() for probabilities, and scoring via misclassification rate or cross-entropy.
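The IRLS loop described above can be sketched for one feature plus an intercept, where the weighted system is only 2x2. This is an illustration under assumptions, not the RustQuant code: all function names are hypothetical, the update is written in its equivalent Newton-step form, and the 2x2 system is solved by Cramer's rule instead of LU decomposition.

```rust
/// Logistic function mapping a linear predictor to a probability.
fn sigmoid(z: f64) -> f64 {
    1.0 / (1.0 + (-z).exp())
}

/// Fit `P(y = 1 | x) = sigmoid(b0 + b1 * x)` by IRLS / Newton's method.
/// Returns `(b0, b1)`. Labels in `y` are expected to be 0.0 or 1.0.
fn fit_logistic_irls(x: &[f64], y: &[f64], max_iter: usize, tolerance: f64) -> (f64, f64) {
    let (mut b0, mut b1) = (0.0_f64, 0.0_f64);
    for _ in 0..max_iter {
        // Accumulate X^T W X (symmetric 2x2) and the score X^T (y - p).
        let (mut h00, mut h01, mut h11) = (0.0_f64, 0.0_f64, 0.0_f64);
        let (mut g0, mut g1) = (0.0_f64, 0.0_f64);
        for (&xi, &yi) in x.iter().zip(y) {
            let p = sigmoid(b0 + b1 * xi);
            let w = p * (1.0 - p); // diagonal entry of the weight matrix
            h00 += w;
            h01 += w * xi;
            h11 += w * xi * xi;
            g0 += yi - p;
            g1 += (yi - p) * xi;
        }
        // Solve the 2x2 weighted system with Cramer's rule.
        let det = h00 * h11 - h01 * h01;
        if det.abs() < 1e-12 {
            break; // weights have collapsed, e.g. on separable data
        }
        let d0 = (h11 * g0 - h01 * g1) / det;
        let d1 = (h00 * g1 - h01 * g0) / det;
        b0 += d0;
        b1 += d1;
        if d0.abs().max(d1.abs()) < tolerance {
            break;
        }
    }
    (b0, b1)
}

/// Predicted probability of the positive class.
fn predict_proba(b0: f64, b1: f64, x: f64) -> f64 {
    sigmoid(b0 + b1 * x)
}

/// Predicted class label using a 0.5 probability threshold.
fn predict(b0: f64, b1: f64, x: f64) -> u8 {
    if predict_proba(b0, b1, x) >= 0.5 { 1 } else { 0 }
}
```

On perfectly separable data the coefficients grow without bound and the weights shrink toward zero, so the determinant guard and the iteration cap matter in practice.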
K-Nearest Neighbors is a non-parametric classifier that predicts the class of a test point by majority vote among its k closest training points. Distance metrics include Euclidean (L2), Manhattan (L1), and general Minkowski (Lp).
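A brute-force version of this classifier fits in a few lines of std-only Rust. The function names, the `u32` label type, and the tie-breaking rule (prefer the smaller label) are assumptions made for illustration and are not taken from RustQuant.

```rust
use std::collections::HashMap;

/// Minkowski (Lp) distance; p = 1.0 gives Manhattan, p = 2.0 gives Euclidean.
fn minkowski(a: &[f64], b: &[f64], p: f64) -> f64 {
    a.iter()
        .zip(b)
        .map(|(u, v)| (u - v).abs().powf(p))
        .sum::<f64>()
        .powf(1.0 / p)
}

/// Classify `query` by majority vote among its `k` nearest training points.
/// Panics if the training set is empty or `k` is zero.
fn knn_predict(
    train_x: &[Vec<f64>],
    train_y: &[u32],
    query: &[f64],
    k: usize,
    p: f64,
) -> u32 {
    // Distance from the query to every training point, paired with its label.
    let mut by_distance: Vec<(f64, u32)> = train_x
        .iter()
        .zip(train_y)
        .map(|(row, &label)| (minkowski(row, query, p), label))
        .collect();
    by_distance.sort_by(|a, b| a.0.total_cmp(&b.0));

    // Count votes among the k closest points.
    let mut votes: HashMap<u32, usize> = HashMap::new();
    for &(_, label) in by_distance.iter().take(k) {
        *votes.entry(label).or_insert(0) += 1;
    }

    // Highest vote count wins; equal counts are resolved toward the smaller label.
    votes
        .into_iter()
        .max_by(|a, b| a.1.cmp(&b.1).then(b.0.cmp(&a.0)))
        .map(|(label, _)| label)
        .expect("training set must be non-empty and k must be positive")
}
```

Each prediction scans and sorts the whole training set, which is why this approach suits moderately sized datasets.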
Activation Functions are defined as a trait with implementations for f64, DVector<f64>, and autodiff Variable types. Available functions include sigmoid, logistic, ReLU, GELU, tanh, softplus, and Gaussian.
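The trait-based design can be sketched as follows. The trait name `Activation`, its method set, and the use of `Vec<f64>` as a stand-in for `DVector<f64>` are assumptions for illustration; the actual RustQuant trait and signatures may differ.

```rust
/// Hypothetical activation trait; each method returns a value of the
/// same type as its input.
trait Activation {
    fn sigmoid(&self) -> Self;
    fn relu(&self) -> Self;
    fn softplus(&self) -> Self;
}

impl Activation for f64 {
    fn sigmoid(&self) -> f64 {
        1.0 / (1.0 + (-*self).exp())
    }

    fn relu(&self) -> f64 {
        (*self).max(0.0)
    }

    fn softplus(&self) -> f64 {
        // Equivalent to ln(1 + e^x), rearranged to avoid overflow for large x.
        let x = *self;
        x.max(0.0) + (-x.abs()).exp().ln_1p()
    }
}

/// Element-wise application, with `Vec<f64>` standing in for a vector type.
impl Activation for Vec<f64> {
    fn sigmoid(&self) -> Vec<f64> {
        self.iter().map(|x| x.sigmoid()).collect()
    }

    fn relu(&self) -> Vec<f64> {
        self.iter().map(|x| x.relu()).collect()
    }

    fn softplus(&self) -> Vec<f64> {
        self.iter().map(|x| x.softplus()).collect()
    }
}
```

Implementing the trait once per input type lets the same call, for example `x.sigmoid()`, work on scalars and vectors alike, which is the pattern the paragraph above describes for f64, DVector<f64>, and Variable.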
Usage
Use linear regression for simple predictive modeling with continuous targets. Choose ridge or lasso regression when regularization is needed to prevent overfitting or to perform feature selection. Use logistic regression for binary classification tasks. Use KNN for classification when a non-parametric approach is preferred and the dataset is moderately sized. Activation functions are building blocks for neural network layers and are also used internally by the logistic regression implementation.
Theoretical Basis
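Linear regression minimizes the residual sum of squares:
$$\hat{\beta} = \arg\min_{\beta} \lVert y - X\beta \rVert_2^2$$
which, when $X^\top X$ is invertible, has the closed-form solution $\hat{\beta} = (X^\top X)^{-1} X^\top y$.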
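Ridge regression adds an L2 penalty:
$$\hat{\beta}_{\text{ridge}} = \arg\min_{\beta} \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2$$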
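Lasso regression minimizes with an L1 penalty (up to a scaling convention on the squared-error term), solved via coordinate descent with the soft-thresholding operator:
$$\hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1, \qquad S(z, \gamma) = \operatorname{sign}(z)\,\max(\lvert z \rvert - \gamma,\, 0)$$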
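Logistic regression via IRLS iteratively solves:
$$\beta^{(t+1)} = (X^\top W X)^{-1} X^\top W z$$
where $W = \operatorname{diag}\big(p_i (1 - p_i)\big)$ with $p = \sigma(X\beta^{(t)})$, and $z = X\beta^{(t)} + W^{-1}(y - p)$.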