Implementation:Haifengl Smile KNN RNN Search API
Overview
This API Doc covers the KNNSearch and RNNSearch interfaces, which define the query API for all spatial index structures in Smile's smile.neighbor package. These interfaces provide the search() and nearest() methods that execute nearest neighbor queries against pre-built indexes.
Import
import smile.neighbor.KNNSearch;
import smile.neighbor.RNNSearch;
import smile.neighbor.Neighbor;
import java.util.List;
import java.util.ArrayList;
KNNSearch Interface
Source: KNNSearch.java (Lines 30-54)
Signature
public interface KNNSearch<K, V> {
/**
* Returns the nearest neighbor. Excludes the query object itself
* (by reference equality) from the results.
*
* @param q the query key.
* @return the nearest neighbor.
*/
default Neighbor<K, V> nearest(K q) {
return search(q, 1)[0];
}
/**
* Retrieves the k nearest neighbors to the query key.
*
* @param q the query key.
* @param k the number of nearest neighbors to search for.
* @return the k nearest neighbors, sorted by distance (ascending).
* @throws IllegalArgumentException if k <= 0 or k > dataset size.
*/
Neighbor<K, V>[] search(K q, int k);
}
Parameters
| Parameter | Type | Description |
|---|---|---|
q |
K |
The query point. For Euclidean structures, this is double[]. For generic structures, it matches the key type.
|
k |
int |
The number of nearest neighbors to return. Must be > 0 and <= dataset size. |
Return Value
Neighbor<K, V>[] -- An array of Neighbor records sorted by distance in ascending order (nearest first). Each element contains:
key-- the neighbor's key (e.g., its coordinate vector)value-- the neighbor's associated data objectindex-- the neighbor's position in the original datasetdistance-- the distance from the query to this neighbor
Example: k-Nearest Neighbor Search
import smile.neighbor.KDTree;
import smile.neighbor.Neighbor;
import smile.math.MathEx;
// Build index
double[][] data = MathEx.randn(10000, 5);
KDTree<double[]> tree = KDTree.of(data);
// Query: find 10 nearest neighbors
double[] query = {1.0, 2.0, 3.0, 4.0, 5.0};
Neighbor<double[], double[]>[] neighbors = tree.search(query, 10);
// Results are sorted by distance (ascending)
for (Neighbor<double[], double[]> n : neighbors) {
System.out.printf("Index: %d, Distance: %.4f%n", n.index(), n.distance());
}
Example: Single Nearest Neighbor
// Find the single closest point
Neighbor<double[], double[]> nearest = tree.nearest(query);
System.out.printf("Nearest: index=%d, distance=%.4f%n",
nearest.index(), nearest.distance());
RNNSearch Interface
Source: RNNSearch.java (Lines 31-41)
Signature
public interface RNNSearch<K, V> {
/**
* Retrieves the neighbors in a fixed radius of query object,
* i.e. d(q, v) <= radius.
*
* @param q the query key.
* @param radius the radius of search range from target. Must be > 0.
* @param neighbors the list to store found neighbors on output.
* @throws IllegalArgumentException if radius <= 0.
*/
void search(K q, double radius, List<Neighbor<K, V>> neighbors);
}
Parameters
| Parameter | Type | Description |
|---|---|---|
q |
K |
The query point. |
radius |
double |
The search radius. All points with distance <= radius from q are returned. Must be > 0. For BKTree, must be a positive integer.
|
neighbors |
List<Neighbor<K, V>> |
An output list that the method appends results to. The caller must provide this list (typically new ArrayList<>()). The list is not cleared by the method -- results are appended.
|
Return Value
void -- Results are written to the neighbors list parameter. The list is not sorted; the caller should sort if order is needed.
Example: Range Search
import smile.neighbor.KDTree;
import smile.neighbor.Neighbor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
// Build index
double[][] data = loadData();
KDTree<double[]> tree = KDTree.of(data);
// Range search: find all points within distance 2.5
double[] query = {1.0, 2.0, 3.0};
List<Neighbor<double[], double[]>> results = new ArrayList<>();
tree.search(query, 2.5, results);
// Sort by distance if needed (results are unsorted by default)
Collections.sort(results);
System.out.printf("Found %d neighbors within radius 2.5%n", results.size());
for (Neighbor<double[], double[]> n : results) {
System.out.printf(" Index: %d, Distance: %.4f%n", n.index(), n.distance());
}
Example: Reusing the Result List
// Important: clear the list between queries since search() appends
List<Neighbor<double[], double[]>> neighbors = new ArrayList<>();
for (double[] q : queryPoints) {
tree.search(q, 1.0, neighbors);
processNeighbors(neighbors);
neighbors.clear(); // Must clear before next query
}
Combined KNN + RNN Example
Many structures implement both interfaces, allowing both query types on the same index:
import smile.neighbor.KDTree;
import smile.neighbor.Neighbor;
import java.util.ArrayList;
import java.util.List;
double[][] data = loadData();
KDTree<double[]> tree = KDTree.of(data);
double[] query = data[42]; // query with a point from the dataset
// KNN search -- the query point (data[42]) is automatically excluded
Neighbor<double[], double[]>[] knnResults = tree.search(query, 5);
// knnResults will NOT contain data[42] itself
// RNN search -- same exclusion
List<Neighbor<double[], double[]>> rnnResults = new ArrayList<>();
tree.search(query, 3.0, rnnResults);
// rnnResults will NOT contain data[42] itself
MPLSH Extended Search API
MPLSH provides additional search methods with recall and probe parameters:
import smile.neighbor.MPLSH;
import smile.neighbor.Neighbor;
import java.util.ArrayList;
import java.util.List;
// After building and fitting MPLSH...
MPLSH<double[]> mplsh = buildAndFitMPLSH(data);
// Search with recall target and max probes
double recall = 0.95; // target 95% recall
int maxProbes = 100; // maximum number of bucket probes
// KNN with recall/probe parameters
Neighbor<double[], double[]>[] results = mplsh.search(query, 10, recall, maxProbes);
// Nearest with recall/probe parameters
Neighbor<double[], double[]> nearest = mplsh.nearest(query, recall, maxProbes);
// RNN with recall/probe parameters
List<Neighbor<double[], double[]>> rangeResults = new ArrayList<>();
mplsh.search(query, 5.0, rangeResults, recall, maxProbes);
Interface Implementation by Structure
| Structure | nearest(K q) |
search(K q, int k) |
search(K q, double r, List)
|
|---|---|---|---|
| KDTree | Overridden (optimized single-neighbor) | Heap-based branch-and-bound | Recursive branch-and-bound |
| CoverTree | Default (calls search(q,1)) | Top-down traversal with bound pruning | Top-down traversal with radius bound |
| LSH | Overridden (candidate set scan) | Candidate set + heap selection | Candidate set + radius filter |
| MPLSH | Overridden (multi-probe + recall) | Multi-probe candidate set + heap | Multi-probe candidate set + radius filter |
| RPForest | Default (calls search(q,1)) | Union leaf candidates + heap selection | Not implemented |
| BKTree | Not implemented | Not implemented | Triangle inequality pruning on discrete metric |
| LinearSearch | Overridden (parallel distance + argmin) | Parallel distance + heap selection | Parallel distance + threshold filter |
Source
- KNNSearch.java (Lines 42-53)
- RNNSearch.java (Line 40)
- Source: base/src/main/java/smile/neighbor/