Implementation:Rapidsai Cuml Genetic Program
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Genetic_Programming |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
Defines the program struct (AST representation) and provides low-level GPU-accelerated functions for executing, evaluating, building, and mutating genetic programs in cuML's symbolic regression framework.
Description
The program.h header declares, in the cuml::genetic namespace, the core program struct and the functions that operate on it:
program Struct:
The main data structure representing a mathematical expression as an Abstract Syntax Tree (AST). The AST is stored in a flattened 1D array in the reverse of DFS-right-child-first order. Key members:
- nodes: Pointer to the node array (not owned by the struct; assumed to be pinned memory).
- len: Total number of nodes.
- depth: Maximum depth of the AST.
- raw_fitness_: Fitness score.
- metric: Fitness metric used.
- mut_type: Mutation type that produced this program.
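To make the flattened layout concrete, here is a minimal host-side sketch of evaluating such an array with a stack. It assumes one reading of the ordering described above: the root sits at index 0 and each operator is followed by its operands (prefix order), so the array is scanned from the last node to the first. The names Node, Kind, and evaluate are hypothetical stand-ins for illustration; they are not cuML's node type or its GPU kernels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node kinds for illustration; not cuML's node::type.
enum class Kind { Constant, Variable, Add, Sub, Mul };

struct Node {
  Kind kind;
  float value;          // used when kind == Constant
  std::size_t feature;  // used when kind == Variable
};

inline Node constant(float v) { return Node{Kind::Constant, v, 0}; }
inline Node variable(std::size_t f) { return Node{Kind::Variable, 0.0f, f}; }
inline Node op(Kind k) { return Node{k, 0.0f, 0}; }

// Evaluate a flattened, prefix-ordered program on a single data row.
// Assumption: root at index 0, so a backward scan visits operands first.
float evaluate(const std::vector<Node>& nodes, const std::vector<float>& row) {
  std::vector<float> stack;
  for (auto it = nodes.rbegin(); it != nodes.rend(); ++it) {
    if (it->kind == Kind::Constant) {
      stack.push_back(it->value);
    } else if (it->kind == Kind::Variable) {
      assert(it->feature < row.size());
      stack.push_back(row[it->feature]);
    } else {
      assert(stack.size() >= 2);  // every operator in this sketch is binary
      const float lhs = stack.back();
      stack.pop_back();
      const float rhs = stack.back();
      stack.pop_back();
      float res = 0.0f;
      if (it->kind == Kind::Add) {
        res = lhs + rhs;
      } else if (it->kind == Kind::Sub) {
        res = lhs - rhs;
      } else {
        res = lhs * rhs;
      }
      stack.push_back(res);
    }
  }
  assert(stack.size() == 1);  // a well-formed program leaves exactly one value
  return stack.back();
}
```

For example, the array {Sub, Mul, x0, x1, 2} encodes (x0 * x1) - 2 under this assumption, and evaluating it on the row (3, 4) yields 10.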
program_t: Typedef alias for program* (device programs).
Execution and Evaluation:
- execute: Evaluates all programs on a dataset in a batched GPU kernel.
- compute_metric: Computes the loss/fitness metric for all programs in one batch.
- find_fitness / find_batched_fitness: Compute fitness scores for a single program or for all programs, respectively.
- set_fitness / set_batched_fitness: Compute the fitness and store it in both the device and host program objects.
- get_fitness: Returns the host program's fitness score with the parsimony penalty applied.
- get_depth: Computes the depth of a program.
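This page does not state the formula behind the parsimony penalty. The following is a minimal sketch, assuming a penalty proportional to program length (a common choice in genetic programming) that is applied in whichever direction makes longer programs look worse. ProgramInfo, FitnessParams, and penalized_fitness are hypothetical names and do not reproduce cuML's program or param types.

```cpp
#include <cassert>

// Hypothetical, reduced stand-ins for cuML's program and param types.
struct ProgramInfo {
  int len;            // number of nodes in the program
  float raw_fitness;  // unpenalized metric value
};

struct FitnessParams {
  float parsimony_coefficient;
  bool higher_is_better;  // true for correlation-style metrics, false for error metrics
};

// Assumed penalty form: coefficient * length, applied so that a longer
// program always scores worse under the metric's direction.
float penalized_fitness(const ProgramInfo& prog, const FitnessParams& params) {
  assert(prog.len > 0);
  const float penalty = params.parsimony_coefficient * static_cast<float>(prog.len);
  return params.higher_is_better ? prog.raw_fitness - penalty
                                 : prog.raw_fitness + penalty;
}
```

With a coefficient of 0.5, a four-node program with an error of 10 is scored as 12 in this sketch, so a shorter program with the same raw error would be preferred.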
Construction and Mutation:
- build_program: Builds a random program with a depth of at most 10.
- point_mutation: Replaces random nodes in-place.
- crossover: Performs hoisted crossover between a parent and a donor program.
- subtree_mutation: Performs crossover with a randomly generated program.
- hoist_mutation: Replaces a subtree with a subtree of that subtree.
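The subtree-based operators above come down to locating a subtree inside the flattened array and splicing ranges together. The following is a minimal sketch of that idea, assuming a prefix-ordered array in which each character is one node (the characters + - * / are binary operators and everything else is a terminal). Subtree positions are passed in explicitly instead of being drawn from an RNG, and the function names are hypothetical. The sketch does not model how cuML selects nodes or enforces any limits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Each character is one node: these four are binary operators, the rest terminals.
inline int arity(char c) { return (c == '+' || c == '-' || c == '*' || c == '/') ? 2 : 0; }

// One-past-the-end index of the subtree rooted at `start` (prefix order).
std::size_t subtree_end(const std::string& prog, std::size_t start) {
  assert(start < prog.size());
  int pending = 1;  // number of subtrees still to be completed
  std::size_t i = start;
  while (pending > 0) {
    assert(i < prog.size());  // fails only on a malformed program
    pending += arity(prog[i]) - 1;
    ++i;
  }
  return i;
}

// Hoist: replace the subtree at `outer` with the subtree at `inner` inside it.
std::string hoist(const std::string& prog, std::size_t outer, std::size_t inner) {
  const std::size_t outer_end = subtree_end(prog, outer);
  assert(inner >= outer && inner < outer_end);
  const std::size_t inner_end = subtree_end(prog, inner);
  return prog.substr(0, outer) + prog.substr(inner, inner_end - inner) + prog.substr(outer_end);
}

// Crossover: replace the parent's subtree at `p_at` with the donor's subtree at `d_at`.
std::string crossover_at(const std::string& parent, std::size_t p_at,
                         const std::string& donor, std::size_t d_at) {
  const std::size_t p_end = subtree_end(parent, p_at);
  const std::size_t d_end = subtree_end(donor, d_at);
  return parent.substr(0, p_at) + donor.substr(d_at, d_end - d_at) + parent.substr(p_end);
}
```

Taking "-*ab+cd" as (a * b) - (c + d), hoisting the terminal b over the subtree *ab gives "-b+cd", and replacing +cd with the donor "*xy" gives "-*ab*xy". Subtree mutation fits the same splice when the donor is a freshly generated random program.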
Usage
Use these types and functions when implementing or extending the genetic programming evolution loop. The program struct is the central data type, while the execution and mutation functions handle the core evolutionary operations on the GPU. Higher-level operations like symFit in genetic.h orchestrate these low-level functions.
Code Reference
Source Location
- Repository: Rapidsai_Cuml
- File: cpp/include/cuml/genetic/program.h
Signature
namespace cuml {
namespace genetic {

struct program {
  explicit program();
  ~program();
  explicit program(const program& src);
  program& operator=(const program& src);

  node* nodes;
  int len;
  int depth;
  float raw_fitness_;
  metric_t metric;
  mutation_t mut_type;
};

typedef program* program_t;

void execute(const raft::handle_t& h, const program_t& d_progs,
             const int n_rows, const int n_progs,
             const float* data, float* y_pred);

void compute_metric(const raft::handle_t& h, int n_rows, int n_progs,
                    const float* y, const float* y_pred,
                    const float* w, float* score, const param& params);

void find_fitness(const raft::handle_t& h, program_t& d_prog,
                  float* score, const param& params, const int n_rows,
                  const float* data, const float* y, const float* sample_weights);

void find_batched_fitness(const raft::handle_t& h, int n_progs,
                          program_t& d_progs, float* score,
                          const param& params, const int n_rows,
                          const float* data, const float* y,
                          const float* sample_weights);

void set_fitness(const raft::handle_t& h, program_t& d_prog,
                 program& h_prog, const param& params, const int n_rows,
                 const float* data, const float* y, const float* sample_weights);

void set_batched_fitness(const raft::handle_t& h, int n_progs,
                         program_t& d_progs, std::vector<program>& h_progs,
                         const param& params, const int n_rows,
                         const float* data, const float* y,
                         const float* sample_weights);

float get_fitness(const program& prog, const param& params);

int get_depth(const program& p_out);

void build_program(program& p_out, const param& params, std::mt19937& rng);

void point_mutation(const program& prog, program& p_out,
                    const param& params, std::mt19937& rng);

void crossover(const program& prog, const program& donor,
               program& p_out, const param& params, std::mt19937& rng);

void subtree_mutation(const program& prog, program& p_out,
                      const param& params, std::mt19937& rng);

void hoist_mutation(const program& prog, program& p_out,
                    const param& params, std::mt19937& rng);

}  // namespace genetic
}  // namespace cuml
Import
#include <cuml/genetic/program.h>
I/O Contract
Inputs
execute
| Name | Type | Required | Description |
|---|---|---|---|
| h | const raft::handle_t& | Yes | cuML handle |
| d_progs | const program_t& | Yes | Device pointer to programs |
| n_rows | int | Yes | Number of rows in the input dataset |
| n_progs | int | Yes | Number of programs to evaluate |
| data | const float* | Yes | Device pointer to input features [n_rows x n_cols], col-major |
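Because the data buffer is column-major, all values of one feature are contiguous and element (row, col) lives at offset col * n_rows + row. The following is a small host-side sketch of that indexing. The helper names are hypothetical, and a std::vector stands in for the device buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element (row, col) of a column-major [n_rows x n_cols] buffer.
inline float at_col_major(const std::vector<float>& data, std::size_t n_rows,
                          std::size_t row, std::size_t col) {
  assert(row < n_rows);
  assert(col * n_rows + row < data.size());
  return data[col * n_rows + row];
}

// Gather one sample (all features of a single row) from the column-major buffer.
std::vector<float> gather_row(const std::vector<float>& data, std::size_t n_rows,
                              std::size_t n_cols, std::size_t row) {
  assert(data.size() == n_rows * n_cols);
  std::vector<float> out(n_cols);
  for (std::size_t c = 0; c < n_cols; ++c) {
    out[c] = at_col_major(data, n_rows, row, c);
  }
  return out;
}
```

In a 3 x 2 buffer {1, 2, 3, 4, 5, 6}, the first feature column is (1, 2, 3) and the second is (4, 5, 6), so the sample in row 2 is (3, 6).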
build_program
| Name | Type | Required | Description |
|---|---|---|---|
| params | const param& | Yes | Training hyperparameters controlling tree structure |
| rng | std::mt19937& | Yes | Random number generator for node selection |
crossover
| Name | Type | Required | Description |
|---|---|---|---|
| prog | const program& | Yes | Parent program |
| donor | const program& | Yes | Donor program for crossover |
| params | const param& | Yes | Training hyperparameters |
| rng | std::mt19937& | Yes | Random number generator |
Outputs
| Name | Type | Description |
|---|---|---|
| y_pred (execute) | float* | Device array of program outputs [n_rows x n_progs] |
| score (compute_metric) | float* | Device array of fitness scores [n_progs] |
| p_out (build_program) | program& | Newly built random program |
| p_out (mutations) | program& | Mutated program result |
| get_fitness | float | Fitness score with parsimony penalty applied |
| get_depth | int | Depth of the program AST |
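The shapes in the table can be illustrated with a host-side sketch of a batched metric: one score per program, reduced over n_rows predictions. Weighted mean absolute error is used purely as an example metric. The sketch also assumes that y_pred stores one contiguous block of n_rows predictions per program, which this page does not confirm. The function name is hypothetical and this is not cuML's compute_metric.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted mean absolute error for each program in a batch.
// Assumption: predictions of program p occupy y_pred[p * n_rows .. (p + 1) * n_rows).
std::vector<float> batched_weighted_mae(const std::vector<float>& y,
                                        const std::vector<float>& y_pred,
                                        const std::vector<float>& w,
                                        std::size_t n_progs) {
  const std::size_t n_rows = y.size();
  assert(w.size() == n_rows);
  assert(y_pred.size() == n_rows * n_progs);

  float w_sum = 0.0f;
  for (float wi : w) w_sum += wi;
  assert(w_sum > 0.0f);

  std::vector<float> score(n_progs, 0.0f);
  for (std::size_t p = 0; p < n_progs; ++p) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n_rows; ++i) {
      acc += w[i] * std::fabs(y[i] - y_pred[p * n_rows + i]);
    }
    score[p] = acc / w_sum;  // one score per program, as in the [n_progs] output
  }
  return score;
}
```

A program whose predictions match y exactly scores 0 here, and larger sample weights make errors on those rows count proportionally more.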
Usage Examples
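#include <cuml/genetic/common.h>
#include <cuml/genetic/program.h>
#include <raft/core/handle.hpp>  // raft::handle_t
#include <random>

// Set the hyperparameters that shape randomly built trees;
// all other fields of param keep their default values.
cuml::genetic::param params;
params.num_features = 5;
params.init_depth[0] = 2;  // lower bound of the initial depth range
params.init_depth[1] = 6;  // upper bound of the initial depth range
params.init_method = cuml::genetic::init_method_t::half_and_half;

// Fixed seed so the generated programs are reproducible.
std::mt19937 rng(42);

// Build a random program.
cuml::genetic::program prog;
cuml::genetic::build_program(prog, params, rng);

// Get the depth of its AST.
int depth = cuml::genetic::get_depth(prog);

// Mutate the program; the result is written to a separate output program.
cuml::genetic::program mutated;
cuml::genetic::point_mutation(prog, mutated, params, rng);

// Cross the program with a second, randomly built donor.
cuml::genetic::program donor;
cuml::genetic::build_program(donor, params, rng);
cuml::genetic::program child;
cuml::genetic::crossover(prog, donor, child, params, rng);

// Evaluate on the GPU (batched execution). The call is left commented out
// because d_progs, d_data, and d_y_pred must first be allocated and filled
// on the device, and n_rows and n_progs must match those buffers.
raft::handle_t handle;
// cuml::genetic::execute(handle, d_progs, n_rows, n_progs, d_data, d_y_pred);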