Implementation:Rapidsai Cuml Genetic API

From Leeroopedia


Knowledge Sources
Domains: Machine_Learning, Genetic_Programming
Last Updated: 2026-02-08 12:00 GMT

Overview

Provides the GPU-accelerated genetic programming API in cuML for symbolic regression, classification, and transformation, with functions for fitting, predicting, and transforming data using evolved program trees.

Description

The genetic.h header declares the high-level genetic programming API in the cuml::genetic namespace. It provides functions for the complete lifecycle of symbolic machine learning:

Utility:

  • stringify: Converts a program (AST) to a human-readable string representation for visualization and debugging.
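
To illustrate the kind of output stringify produces, here is a minimal host-side sketch that renders an expression stored in prefix order as a string. The Node struct, its fields, and toString are hypothetical stand-ins for illustration; they are not cuML's program or node types, and the exact text format of the real function may differ.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node: a binary operator, a feature reference, or a constant.
struct Node {
  enum class Kind { Add, Mul, Variable, Constant };
  Kind kind;
  int feature;  // used when kind == Variable
  float value;  // used when kind == Constant
};

// Render the subtree that starts at nodes[pos] (prefix order) and
// advance pos past it.
std::string toString(const std::vector<Node>& nodes, std::size_t& pos) {
  const Node& n = nodes.at(pos++);
  switch (n.kind) {
    case Node::Kind::Variable: return "X" + std::to_string(n.feature);
    case Node::Kind::Constant: return std::to_string(n.value);
    case Node::Kind::Add:
    case Node::Kind::Mul: {
      std::string name = (n.kind == Node::Kind::Add) ? "add" : "mul";
      std::string lhs  = toString(nodes, pos);  // left operand first
      std::string rhs  = toString(nodes, pos);  // then right operand
      return name + "(" + lhs + ", " + rhs + ")";
    }
  }
  return "";
}

// Convenience overload that renders the whole expression.
std::string toString(const std::vector<Node>& nodes) {
  std::size_t pos = 0;
  return toString(nodes, pos);
}
```

With this sketch, the node sequence add, X0, mul, X1, X2 is rendered as a nested call expression with the operator name in front of its two operands.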

Training:

  • symFit: Fits a symbolic model (regressor, classifier, or transformer) to a given dataset. Evolves a population of programs over multiple generations using tournament selection, crossover, and mutation. Outputs the final generation of programs (sorted by fitness) and optionally the full generational history. Note: device memory allocated for program nodes must be freed by the caller after prediction.
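
The tournament selection step mentioned above can be sketched on the host as follows. This is an illustration only: tournamentSelect and its parameters are hypothetical, lower fitness is assumed to be better (as with an error metric such as MSE), and cuML's GPU implementation is not reproduced here.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draw `tournament_size` random candidates (with replacement) and return the
// index of the candidate with the lowest fitness value.
// Precondition: `fitness` is non-empty.
std::size_t tournamentSelect(const std::vector<float>& fitness,
                             std::size_t tournament_size,
                             std::mt19937& rng) {
  std::uniform_int_distribution<std::size_t> pick(0, fitness.size() - 1);
  std::size_t best = pick(rng);
  for (std::size_t i = 1; i < tournament_size; ++i) {
    std::size_t challenger = pick(rng);
    if (fitness[challenger] < fitness[best]) best = challenger;
  }
  return best;
}
```

A larger tournament size increases selection pressure, because the fittest program in the population is more likely to be among the sampled candidates.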

Prediction:

  • symRegPredict: Makes continuous predictions using a trained symbolic regressor.
  • symClfPredictProbs: Computes class probabilities for a symbolic classifier, optionally applying a transformer (e.g., sigmoid).
  • symClfPredict: Returns binary class predictions from a symbolic classifier's decision boundary.
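
The relationship between a classifier program's raw output, the sigmoid transformer, and the binary prediction can be sketched on the host as below. The function names and the 0.5 threshold are assumptions made for illustration; the actual device code and decision rule in cuML may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Map a raw program output to a probability for the positive class.
inline float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// Turn raw program outputs into 0/1 labels by thresholding the probability.
std::vector<float> predictLabels(const std::vector<float>& raw,
                                 float threshold = 0.5f) {
  std::vector<float> labels(raw.size());
  for (std::size_t i = 0; i < raw.size(); ++i) {
    labels[i] = (sigmoid(raw[i]) >= threshold) ? 1.0f : 0.0f;
  }
  return labels;
}
```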

Transformation:

  • symTransform: Transforms input features using a set of evolved programs, generating new engineered features.
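
Conceptually, a transform evaluates each evolved program on every sample and stores the results as new feature columns. The sketch below does this on the host, with std::function standing in for a program. The Program alias, the column-major layout, and transformFeatures are assumptions for illustration, not cuML's device-side implementation.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an evolved program: computes one value for the
// sample at `row` of a column-major matrix X that has `n_rows` rows.
using Program =
  std::function<float(const float* X, std::size_t row, std::size_t n_rows)>;

// X is column-major [n_rows x n_cols]. The result is column-major
// [n_rows x programs.size()], one output column per program.
std::vector<float> transformFeatures(const std::vector<float>& X,
                                     std::size_t n_rows,
                                     const std::vector<Program>& programs) {
  std::vector<float> out(n_rows * programs.size());
  for (std::size_t p = 0; p < programs.size(); ++p) {
    for (std::size_t r = 0; r < n_rows; ++r) {
      out[p * n_rows + r] = programs[p](X.data(), r, n_rows);
    }
  }
  return out;
}
```

Each program contributes exactly one engineered feature, so the number of output columns equals the number of programs passed in.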

All functions except stringify operate on device memory and accept a RAFT handle for GPU resource management. stringify takes a single program that has been copied to the host.

Usage

Use the genetic programming API for automatic feature engineering (transformation), interpretable regression (symbolic regression), or interpretable classification (symbolic classification). The evolved programs are human-readable mathematical expressions, making them suitable when model interpretability is important. The GPU acceleration enables practical evolution of large populations.

Code Reference

Source Location

  • Repository: Rapidsai_Cuml
  • File: cpp/include/cuml/genetic/genetic.h

Signature

namespace cuml {
namespace genetic {

std::string stringify(const program& prog);

void symFit(const raft::handle_t& handle,
            const float* input,
            const float* labels,
            const float* sample_weights,
            const int n_rows,
            const int n_cols,
            param& params,
            program_t& final_progs,
            std::vector<std::vector<program>>& history);

void symRegPredict(const raft::handle_t& handle,
                   const float* input,
                   const int n_rows,
                   const program_t& best_prog,
                   float* output);

void symClfPredictProbs(const raft::handle_t& handle,
                        const float* input,
                        const int n_rows,
                        const param& params,
                        const program_t& best_prog,
                        float* output);

void symClfPredict(const raft::handle_t& handle,
                   const float* input,
                   const int n_rows,
                   const param& params,
                   const program_t& best_prog,
                   float* output);

void symTransform(const raft::handle_t& handle,
                  const float* input,
                  const param& params,
                  const program_t& final_progs,
                  const int n_rows,
                  const int n_cols,
                  float* output);

} // namespace genetic
} // namespace cuml

Import

#include <cuml/genetic/genetic.h>

I/O Contract

Inputs

symFit

Name | Type | Required | Description
handle | const raft::handle_t& | Yes | cuML handle for GPU resources
input | const float* | Yes | Device pointer to feature matrix [n_rows x n_cols]
labels | const float* | Yes | Device pointer to labels [n_rows]
sample_weights | const float* | No | Device pointer to sample weights [n_rows], or nullptr
n_rows | int | Yes | Number of training samples
n_cols | int | Yes | Number of features
params | param& | Yes | Hyperparameters for evolution (population size, generations, etc.)
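
The optional sample_weights argument lets individual rows count more or less toward a program's fitness. As a hedged illustration, a weighted mean squared error could be computed on the host as follows. weightedMse is a hypothetical helper, treating a null weights pointer as uniform weights is an assumption, and the exact normalization used inside cuML may differ.

```cpp
#include <cstddef>

// Weighted MSE: sum(w_i * (y_i - y_pred_i)^2) / sum(w_i).
// A null `weights` pointer is treated as uniform weights of 1.
float weightedMse(const float* y,
                  const float* y_pred,
                  const float* weights,
                  std::size_t n) {
  double num = 0.0;
  double den = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    double w    = weights ? weights[i] : 1.0;
    double diff = static_cast<double>(y[i]) - y_pred[i];
    num += w * diff * diff;
    den += w;
  }
  // Guard against an all-zero weight vector.
  return den > 0.0 ? static_cast<float>(num / den) : 0.0f;
}
```

Setting a row's weight to zero removes it from the score entirely, which is one way to mask out samples without rebuilding the input matrix.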

symRegPredict

Name | Type | Required | Description
handle | const raft::handle_t& | Yes | cuML handle
input | const float* | Yes | Device pointer to feature matrix [n_rows x n_cols]
n_rows | int | Yes | Number of samples
best_prog | const program_t& | Yes | Device pointer to the best trained program

symTransform

Name | Type | Required | Description
handle | const raft::handle_t& | Yes | cuML handle
input | const float* | Yes | Device pointer to feature matrix [n_rows x n_cols]
params | const param& | Yes | Training hyperparameters
final_progs | const program_t& | Yes | Device pointer to the evolved programs
n_rows | int | Yes | Number of samples
n_cols | int | Yes | Number of input features

Outputs

Name | Type | Description
final_progs (fit) | program_t& | Device pointer to the final generation of programs, sorted by decreasing fitness
history (fit) | std::vector<std::vector<program>>& | Host vector of all programs across all generations
output (predict) | float* | Device array of predictions [n_rows]
output (predict_probs) | float* | Device array of class probabilities [n_rows x 2], col-major (one column per class)
output (transform) | float* | Device array of transformed features
stringify | std::string | Human-readable string representation of a program AST
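
Because the probability output is described as col-major, reading one class column from a host copy of the buffer comes down to simple index arithmetic. The sketch below assumes one column per class with n_rows entries per column. colMajorIndex and column are hypothetical helpers, not part of cuML.

```cpp
#include <cstddef>
#include <vector>

// Offset of element (row, col) in a column-major buffer with n_rows rows.
inline std::size_t colMajorIndex(std::size_t row, std::size_t col, std::size_t n_rows) {
  return col * n_rows + row;
}

// Copy one column (e.g. the probabilities of one class) out of a host buffer.
std::vector<float> column(const std::vector<float>& buf, std::size_t col, std::size_t n_rows) {
  auto first = buf.begin() + static_cast<std::ptrdiff_t>(col * n_rows);
  auto last  = first + static_cast<std::ptrdiff_t>(n_rows);
  return std::vector<float>(first, last);
}
```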

Usage Examples

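#include <cuml/genetic/common.h>
#include <cuml/genetic/genetic.h>

#include <cuda_runtime.h>

#include <string>
#include <vector>

raft::handle_t handle;

int n_rows = 5000;
int n_cols = 10;

// Allocate device buffers; copy the training data into d_X and d_y
// (e.g. with cudaMemcpy) before calling symFit.
float* d_X = nullptr;  // device [n_rows x n_cols]
float* d_y = nullptr;  // device [n_rows]
cudaMalloc(&d_X, sizeof(float) * n_rows * n_cols);
cudaMalloc(&d_y, sizeof(float) * n_rows);

// Configure parameters
cuml::genetic::param params;
params.population_size = 500;
params.generations = 30;
params.metric = cuml::genetic::metric_t::mse;
params.num_features = n_cols;
params.random_state = 42;

// Fit symbolic regressor.
// Assumption: symFit writes population_size programs into caller-allocated
// device storage; verify this against genetic.h for your cuML version.
cuml::genetic::program_t final_progs = nullptr;
cudaMalloc(&final_progs, sizeof(cuml::genetic::program) * params.population_size);
std::vector<std::vector<cuml::genetic::program>> history;

cuml::genetic::symFit(handle, d_X, d_y, nullptr,  // nullptr: no sample weights
                      n_rows, n_cols, params,
                      final_progs, history);

// Predict with the best program: final_progs is sorted by fitness, so it
// points at the first program of the final generation.
float* d_output = nullptr;  // device [n_rows]
cudaMalloc(&d_output, sizeof(float) * n_rows);
cuml::genetic::symRegPredict(handle, d_X, n_rows, final_progs, d_output);

// Visualize the best program
// (after copying from device to host)
// std::string expr = cuml::genetic::stringify(host_prog);

// Release the buffers allocated above. The device memory that symFit
// allocates for program nodes must also be freed by the caller.
cudaFree(d_output);
cudaFree(final_progs);
cudaFree(d_y);
cudaFree(d_X);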
