Implementation:Avhz RustQuant KNearestNeighbors
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Classification, Quantitative_Finance |
| Last Updated | 2026-02-07 19:00 GMT |
Overview
Concrete tool for K-Nearest Neighbors (KNN) classification provided by the RustQuant library.
Description
The KNearestClassifier struct implements the K-Nearest Neighbors algorithm for classification tasks. Given a set of labeled training data points with multiple features, it classifies new data points by finding the k closest training samples (neighbors) and assigning the class label by majority vote among those neighbors.
The implementation supports three distance metrics via the Metric enum:
- Euclidean -- Standard L2 distance (default). Equivalent to Minkowski with p=2.
- Manhattan -- L1 / taxicab distance. Equivalent to Minkowski with p=1.
- Minkowski(p) -- Generalized Lp distance, with the exponent p supplied as an integer (i32).
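The relationship between the three metrics can be illustrated with one function of p. The following is a minimal sketch in plain Rust over slices, not RustQuant's own code; the name `minkowski` is hypothetical and chosen only for this example.

```rust
/// Hypothetical helper (not part of RustQuant): Minkowski distance of
/// order `p` between two feature vectors of equal length.
fn minkowski(a: &[f64], b: &[f64], p: i32) -> f64 {
    assert_eq!(a.len(), b.len(), "feature vectors must have equal length");
    assert!(p >= 1, "p must be at least 1");
    // Sum of |a_i - b_i|^p over all features.
    let sum: f64 = a
        .iter()
        .zip(b)
        .map(|(x, y)| (x - y).abs().powi(p))
        .sum();
    // Take the p-th root of the sum.
    sum.powf(1.0 / f64::from(p))
}

fn main() {
    let a = [0.0, 0.0];
    let b = [3.0, 4.0];
    // p = 1 gives the Manhattan distance, p = 2 the Euclidean distance.
    println!("p = 1: {}", minkowski(&a, &b, 1));
    println!("p = 2: {}", minkowski(&a, &b, 2));
    println!("p = 3: {}", minkowski(&a, &b, 3));
}
```

For the points (0, 0) and (3, 4), p = 1 yields 7 and p = 2 yields 5, so a larger p puts more weight on the largest per-feature difference.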
Distance computations are delegated to the nalgebra library's built-in metric_distance and apply_metric_distance methods. For each query point, the training points are sorted by ascending distance and the first k are kept as its neighbors. The point is then assigned the class label that occurs most often among those k neighbors.
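The sort-then-vote procedure can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: it uses plain vectors instead of nalgebra types, fixes the metric to Euclidean, and breaks ties toward the smaller label, a choice made here only to keep the output deterministic. The names `euclidean` and `predict_one` are hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

/// Hypothetical helper: Euclidean distance between two feature vectors.
fn euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// Hypothetical helper: classify one point by majority vote among its
/// k nearest training rows. Labels are integer codes stored as f64.
fn predict_one(x: &[Vec<f64>], y: &[f64], point: &[f64], k: usize) -> f64 {
    assert_eq!(x.len(), y.len(), "one label per training row");
    assert!(k >= 1 && k <= x.len(), "k must be in 1..=number of rows");

    // Pair every training label with its distance to the query point.
    let mut neighbors: Vec<(f64, f64)> = x
        .iter()
        .zip(y)
        .map(|(row, &label)| (euclidean(row, point), label))
        .collect();

    // Sort by ascending distance.
    neighbors.sort_by(|a, b| a.0.total_cmp(&b.0));

    // Count labels among the k closest. f64 is not hashable, so the
    // integer code is used as the key.
    let mut counts: HashMap<i64, usize> = HashMap::new();
    for &(_, label) in neighbors.iter().take(k) {
        *counts.entry(label as i64).or_insert(0) += 1;
    }

    // Highest count wins; ties go to the smaller label (sketch-only rule).
    counts
        .into_iter()
        .max_by_key(|&(label, n)| (n, Reverse(label)))
        .map(|(label, _)| label as f64)
        .expect("k >= 1 guarantees at least one vote")
}

fn main() {
    let x = vec![
        vec![5.1, 3.5, 1.4, 0.2],
        vec![4.9, 3.0, 1.4, 0.2],
        vec![7.0, 3.2, 4.7, 1.4],
        vec![6.4, 3.2, 4.5, 1.5],
    ];
    let y = vec![0.0, 0.0, 1.0, 1.0];
    println!("{}", predict_one(&x, &y, &[5.0, 3.4, 1.5, 0.2], 3));
    println!("{}", predict_one(&x, &y, &[6.7, 3.1, 4.4, 1.4], 3));
}
```

With two training rows per class and k = 3, each query's two same-class rows outvote the single nearest row of the other class. An odd k avoids ties in the two-class case.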
Usage
Use this implementation when you need a non-parametric classifier that predicts from proximity in feature space. KNN suits small-to-medium datasets, since every prediction measures and sorts the distance to each training point, and it can follow non-linear decision boundaries while remaining simple to interpret. It also works well as a baseline classifier or when the underlying data distribution is unknown.
Code Reference
Source Location
- Repository: RustQuant
- File: crates/RustQuant_ml/src/k_nearest_neighbors.rs
- Lines: 1-326
Signature
#[derive(Clone, Debug)]
pub struct KNearestClassifier<T> {
    pub x: DMatrix<T>,
    pub y: DVector<T>,
    pub metric: Metric,
}
#[derive(Clone, Debug)]
pub enum Metric {
    Euclidean,
    Manhattan,
    Minkowski(i32),
}
impl KNearestClassifier<f64> {
    pub fn new(x: DMatrix<f64>, y: DVector<f64>, metric: Metric) -> Self;
    pub fn predict(&self, xprime: &DMatrix<f64>, k: &usize) -> Vec<f64>;
}
Import
use RustQuant::ml::{KNearestClassifier, Metric};
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| x | DMatrix<f64> | Yes | Training data matrix. Rows are data points; columns are features. |
| y | DVector<f64> | Yes | Class labels for each row in x. Integer labels encoded as f64 values. |
| metric | Metric | Yes | Distance metric to use: Euclidean, Manhattan, or Minkowski(p). |
| xprime | &DMatrix<f64> | Yes (predict) | Test data matrix with the same number of columns as the training data. |
| k | &usize | Yes (predict) | Number of nearest neighbors to consider for classification. |
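Because y holds integer class codes stored as f64, named classes need to be mapped to codes before training and mapped back after prediction. One possible way to do this is sketched below in plain Rust; `encode_labels` is a hypothetical helper, not a RustQuant function, and assigning codes in sorted name order is an assumption made for this example.

```rust
/// Hypothetical helper: map class names to integer codes stored as f64.
/// Codes are assigned in sorted order of the distinct names.
/// Returns the codes plus the class list used to decode predictions.
fn encode_labels(names: &[&str]) -> (Vec<f64>, Vec<String>) {
    // Distinct class names, sorted so the coding is reproducible.
    let mut classes: Vec<String> = names.iter().map(|s| s.to_string()).collect();
    classes.sort();
    classes.dedup();

    // Replace each name with its position in the class list.
    let codes = names
        .iter()
        .map(|n| {
            let code = classes
                .iter()
                .position(|c| c.as_str() == *n)
                .expect("every name appears in the class list");
            code as f64
        })
        .collect();

    (codes, classes)
}

fn main() {
    let names = ["Setosa", "Setosa", "Versicolor", "Versicolor"];
    let (codes, classes) = encode_labels(&names);
    println!("codes: {:?}", codes);

    // Decoding a predicted label back to its class name.
    let predicted = 1.0_f64;
    println!("class: {}", classes[predicted as usize]);
}
```

The returned codes can be wrapped in a DVector for the y argument, and the class list kept alongside the model to translate predictions back to names.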
Outputs
| Name | Type | Description |
|---|---|---|
| predictions | Vec<f64> | Predicted class labels for each row of the test data matrix. |
Usage Examples
use nalgebra::{dmatrix, DVector};
use RustQuant::ml::{KNearestClassifier, Metric};
// Training data: 4 features per sample
let x_train = dmatrix![
    5.1, 3.5, 1.4, 0.2;
    4.9, 3.0, 1.4, 0.2;
    7.0, 3.2, 4.7, 1.4;
    6.4, 3.2, 4.5, 1.5
];
// Labels: 0.0 = Setosa, 1.0 = Versicolor
let y_train = DVector::from_vec(vec![0.0, 0.0, 1.0, 1.0]);
// Build KNN classifier with Euclidean distance
let knn = KNearestClassifier::new(x_train, y_train, Metric::Euclidean);
// Test data
let x_test = dmatrix![
    5.0, 3.4, 1.5, 0.2;
    6.7, 3.1, 4.4, 1.4
];
// Predict with k=3 neighbors
let predictions = knn.predict(&x_test, &3);
// predictions[0] == 0.0 (Setosa)
// predictions[1] == 1.0 (Versicolor)