Implementation:Dotnet Machinelearning FastTree LambdaMART Derivatives
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Learning to Rank, Information Retrieval |
| Last Updated | 2026-02-09 12:00 GMT |
Overview
LambdaMART ranking gradient computation for pairwise document ranking, implementing the core lambda gradient and weight derivation used to train gradient boosted trees for NDCG optimization.
Description
The getderivatives.cpp module (158 lines) implements the pairwise lambda gradient computation at the heart of the LambdaMART learning-to-rank algorithm. For each query, LambdaMART considers all pairs of documents with different relevance labels and computes a lambda gradient that reflects how much the model score of each document should increase or decrease to improve the ranking metric (NDCG).
The function iterates over document pairs within a query, computing:
- The score difference between each pair of documents.
- A sigmoid transformation of the score difference, looked up from a pre-computed table for efficiency.
- The NDCG discount delta: the change in NDCG that would result from swapping the two documents' positions. This involves the discount factors (typically 1/log2(rank+1)) and the gain values derived from relevance labels.
- The lambda gradient for each document in the pair, weighted by the NDCG delta and the sigmoid.
- The Newton step weight (second derivative approximation), used by the tree learner to determine leaf values.
The implementation supports several advanced features:
- Secondary metrics: A secondary ranking metric can be blended with the primary NDCG objective via the secondaryMetricShare parameter, enabling multi-objective ranking optimization.
- Risk-aware ranking: The alphaRisk and baselineVersusCurrentDcg parameters implement risk-sensitive ranking where the lambda gradients are adjusted based on how the current model compares to a baseline, penalizing degradation more heavily than rewarding improvement.
- Distance weighting: The distanceWeight2 parameter applies position-dependent weighting to the lambda gradients based on the rank distance between paired documents.
- Sigmoid table lookup: Rather than computing the sigmoid function on every pair, a pre-computed lookup table indexed by quantized score differences provides fast approximate evaluation.
Usage
This function is called once per query during each boosting iteration of LambdaMART training. The managed C# layer organizes documents by query, computes current model scores, and passes document blocks to this native function. The resulting lambda gradients and weights are then used as the target values and weights for the next tree in the ensemble, replacing the standard gradient/Hessian computation used in regression boosting.
Code Reference
Source Location
- Repository: Dotnet_Machinelearning
- File: src/Native/FastTreeNative/getderivatives.cpp
- Lines: 1-158
Signature
extern "C" {
EXPORT_API(void) C_GetDerivatives(
int numDocuments,
int begin,
int *pPermutation,
short *pLabels,
double *pScores,
double *pLambdas,
double *pWeights,
double *pDiscount,
double inverseMaxDCG,
double *pGainLabels,
double secondaryMetricShare,
bool secondaryExclusive,
double secondaryInverseMaxDCG,
double *pSecondaryGains,
double *sigmoidTable,
double minScore,
double maxScore,
int sigmoidTableLength,
double scoreToSigmoidTableFactor,
char costFunctionParam,
double distanceWeight2,
int numActualDocuments,
double *pLambdaSum,
double minDoubleValue,
double alphaRisk,
double baselineVersusCurrentDcg
);
}
Import
// P/Invoke declaration from managed ML.NET LambdaMART trainer
[DllImport("FastTreeNative")]
private static extern unsafe void C_GetDerivatives(
int numDocuments,
int begin,
int* pPermutation,
short* pLabels,
double* pScores,
double* pLambdas,
double* pWeights,
double* pDiscount,
double inverseMaxDCG,
double* pGainLabels,
double secondaryMetricShare,
bool secondaryExclusive,
double secondaryInverseMaxDCG,
double* pSecondaryGains,
double* sigmoidTable,
double minScore,
double maxScore,
int sigmoidTableLength,
double scoreToSigmoidTableFactor,
byte costFunctionParam,
double distanceWeight2,
int numActualDocuments,
double* pLambdaSum,
double minDoubleValue,
double alphaRisk,
double baselineVersusCurrentDcg);
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| numDocuments | int | Yes | Number of documents in this query group. |
| begin | int | Yes | Starting index of this query's documents within the global arrays. |
| pPermutation | int* | Yes | Permutation array mapping local document positions to global indices, ordered by current model score (descending). |
| pLabels | short* | Yes | Relevance labels for each document (e.g., 0-4 scale for web search relevance). |
| pScores | double* | Yes | Current model scores for each document. |
| pLambdas | double* | Yes (in/out) | Output array for lambda gradients. Accumulated across document pairs. |
| pWeights | double* | Yes (in/out) | Output array for Newton step weights (second derivative approximation). |
| pDiscount | double* | Yes | Position discount factors, typically 1/log2(rank+1) for NDCG. |
| inverseMaxDCG | double | Yes | 1.0 / idealDCG for this query. Normalizes the NDCG delta computation. |
| pGainLabels | double* | Yes | Precomputed gain values per relevance label (e.g., 2^label - 1). |
| secondaryMetricShare | double | No | Blending weight for the secondary ranking metric (0.0 to 1.0). 0.0 disables. |
| secondaryExclusive | bool | No | If true, secondary metric replaces primary rather than blending. |
| secondaryInverseMaxDCG | double | No | Inverse ideal DCG for the secondary metric. |
| pSecondaryGains | double* | No | Gain values for the secondary metric. NULL if unused. |
| sigmoidTable | double* | Yes | Pre-computed sigmoid lookup table for fast pairwise probability computation. |
| minScore | double | Yes | Minimum score in the sigmoid table range. |
| maxScore | double | Yes | Maximum score in the sigmoid table range. |
| sigmoidTableLength | int | Yes | Length of the sigmoid lookup table. |
| scoreToSigmoidTableFactor | double | Yes | Scaling factor to convert score differences to sigmoid table indices. |
| costFunctionParam | char | Yes | Cost function selector controlling the lambda computation variant. |
| distanceWeight2 | double | Yes | Exponent for position-distance weighting between document pairs. |
| numActualDocuments | int | Yes | Number of documents to actually process (may be less than numDocuments for truncation). |
| pLambdaSum | double* | Yes (out) | Output: sum of all lambda values for this query (used for diagnostics). |
| minDoubleValue | double | Yes | Minimum double value to prevent underflow in lambda computation. |
| alphaRisk | double | No | Risk sensitivity parameter. 0.0 for risk-neutral ranking. |
| baselineVersusCurrentDcg | double | No | DCG difference between baseline and current model, used for risk-aware adjustment. |
Outputs
| Name | Type | Description |
|---|---|---|
| pLambdas | double* | Per-document lambda gradients. Positive values indicate the document score should increase; negative values indicate it should decrease. |
| pWeights | double* | Per-document Newton step weights. Used by the tree learner to compute optimal leaf values via weighted least squares. |
| pLambdaSum | double* | Scalar sum of absolute lambda values for the query. Used to monitor training convergence. |
Usage Examples
// Compute LambdaMART gradients for a single query with 50 documents
int numDocs = 50;
int begin = 0;
int permutation[50]; // sorted by descending score
short labels[50]; // relevance labels (0-4)
double scores[50]; // current model scores
double lambdas[50] = {}; // zero-initialized output
double weights[50] = {}; // zero-initialized output
double discount[50]; // position discounts: 1/log2(rank+1)
double gainLabels[5]; // gain per label: {0, 1, 3, 7, 15}
double lambdaSum = 0.0;
// Pre-compute discounts
for (int i = 0; i < numDocs; i++) {
discount[i] = 1.0 / log2(i + 2.0);
}
// Pre-compute gain table
for (int l = 0; l < 5; l++) {
gainLabels[l] = pow(2.0, l) - 1.0;
}
// Sigmoid table for fast lookup
int sigmoidLen = 10000;
double sigmoidTable[10000];
double minS = -10.0, maxS = 10.0;
double factor = sigmoidLen / (maxS - minS);
for (int i = 0; i < sigmoidLen; i++) {
double x = minS + i / factor;
sigmoidTable[i] = 1.0 / (1.0 + exp(-x));
}
C_GetDerivatives(
numDocs, begin, permutation, labels, scores,
lambdas, weights, discount, inverseMaxDCG,
gainLabels,
0.0, // secondaryMetricShare (disabled)
false, // secondaryExclusive
0.0, // secondaryInverseMaxDCG
nullptr, // pSecondaryGains
sigmoidTable, minS, maxS, sigmoidLen, factor,
'c', // costFunctionParam
0.0, // distanceWeight2
numDocs, // numActualDocuments
&lambdaSum,
1e-300, // minDoubleValue
0.0, // alphaRisk (risk-neutral)
0.0 // baselineVersusCurrentDcg
);
// lambdas[] and weights[] now contain the per-document gradients
// for the next tree in the LambdaMART ensemble.