Implementation:FMInference FlexLLMGen Compute Metrics
Metadata
| Field | Value |
|---|---|
| Sources | [FlexLLMGen](https://github.com/FMInference/FlexLLMGen) |
| Domains | Evaluation, Metrics |
| Last updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool for computing classification metrics on predictions produced by the FlexLLMGen data wrangling application.
Description
compute_metrics() takes a list of predicted labels, a list of gold labels, and a task name. It iterates over prediction/gold pairs, normalizes both strings to lowercase, applies a task-specific matching rule (exact match for entity_matching and data_imputation, prefix match for schema_matching and error_detection_spelling, suffix match for error_detection), accumulates TP/TN/FP/FN counts, and returns (precision, recall, accuracy, f1) as a tuple of floats.
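The counting logic described above can be sketched as follows. This is an illustrative reimplementation for clarity, not the library code; in particular, it assumes the gold label "yes" marks the positive class and that any other gold label is treated as negative.

```python
from typing import List, Tuple


def compute_metrics_sketch(
    preds: List[str], golds: List[str], task: str
) -> Tuple[float, float, float, float]:
    """Illustrative sketch of the metric computation (not the library code)."""
    tp = tn = fp = fn = correct = total = 0
    for pred, gold in zip(preds, golds):
        pred, gold = pred.strip().lower(), gold.strip().lower()
        total += 1
        # Task-specific matching rule.
        if task in ("entity_matching", "data_imputation"):
            crc = pred == gold            # exact match
        elif task in ("schema_matching", "error_detection_spelling"):
            crc = pred.startswith(gold)   # prefix match
        elif task == "error_detection":
            crc = pred.endswith(gold)     # suffix match
        else:
            raise ValueError(f"Unknown task: {task}")
        correct += crc
        # Assumed convention: gold "yes" is the positive class.
        if gold == "yes":
            tp += crc
            fn += not crc
        else:
            tn += crc
            fp += not crc
    prec = tp / max(1, tp + fp)
    rec = tp / max(1, tp + fn)
    acc = correct / max(1, total)
    f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
    return prec, rec, acc, f1
```

On the five-pair example given under Usage Examples below, this sketch reproduces the same (0.667, 1.0, 0.8, 0.8) figures.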
Usage
Call after model generation to evaluate predictions against ground truth labels.
Code Reference
- Source: flexllmgen/apps/data_wrangle/utils/utils.py, Lines: 25-63
- Signature:
def compute_metrics(preds: List, golds: List, task: str):
"""Compute metrics.
Args:
preds: List of predicted label strings
golds: List of ground truth label strings
task: Task name - one of "entity_matching", "data_imputation",
"error_detection", "error_detection_spelling", "schema_matching"
Returns:
Tuple of (precision: float, recall: float, accuracy: float, f1: float)
"""
- Import:
from flexllmgen.apps.data_wrangle.utils.utils import compute_metrics
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| preds | List[str] | Yes | Predicted label strings |
| golds | List[str] | Yes | Ground truth label strings |
| task | str | Yes | Task name that selects the matching strategy |
Outputs
Tuple[float, float, float, float] — (precision, recall, accuracy, f1).
Usage Examples
from flexllmgen.apps.data_wrangle.utils.utils import compute_metrics
predictions = ["yes", "no", "yes", "no", "yes"]
ground_truth = ["yes", "no", "no", "no", "yes"]
prec, rec, acc, f1 = compute_metrics(predictions, ground_truth, task="entity_matching")
print(f"Precision: {prec:.3f}, Recall: {rec:.3f}, Accuracy: {acc:.3f}, F1: {f1:.3f}")
# Precision: 0.667, Recall: 1.000, Accuracy: 0.800, F1: 0.800
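For the non-exact-match tasks, the prediction only needs to start or end with the gold label, which tolerates models that wrap the label in explanatory text. A small hedged illustration (the prediction strings below are made up for demonstration):

```python
gold = "yes"

# Prefix match (schema_matching, error_detection_spelling):
# a trailing explanation after the label still counts as correct.
pred_with_suffix = "Yes, these two columns refer to the same attribute."
print(pred_with_suffix.strip().lower().startswith(gold))  # True

# Suffix match (error_detection):
# a leading explanation before the label is tolerated instead.
pred_with_prefix = "The cell looks corrupted, so: yes"
print(pred_with_prefix.strip().lower().endswith(gold))  # True
```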