
Implementation:Speechbrain Speechbrain Get Verification Scores

From Leeroopedia


Property Value
Implementation Name Get Verification Scores
Type API Doc
Repository speechbrain/speechbrain
Source File recipes/VoxCeleb/SpeakerRec/speaker_verification_cosine.py:L82-155 (scores), L54-79 (embedding loop)
Import Recipe-specific. Uses torch.nn.CosineSimilarity
Related Principle Principle:Speechbrain_Speechbrain_Speaker_Verification_Scoring

API Signatures

get_verification_scores

def get_verification_scores(veri_test):
    """Computes positive and negative scores given the verification split.

    Arguments
    ---------
    veri_test : list
        List of verification trial strings, each formatted as
        "label enrol_id test_id".

    Returns
    -------
    positive_scores : list
        Cosine similarity scores for same-speaker trials (label=1).
    negative_scores : list
        Cosine similarity scores for different-speaker trials (label=0).
    """

compute_embedding_loop

def compute_embedding_loop(data_loader):
    """Computes the embeddings of all the waveforms specified in the dataloader.

    Arguments
    ---------
    data_loader : DataLoader
        DataLoader yielding batches with .id and .sig attributes.

    Returns
    -------
    embedding_dict : dict
        Dictionary mapping segment IDs (str) to embedding tensors.
    """

Description

These functions implement the speaker verification scoring pipeline. compute_embedding_loop pre-computes embeddings for all utterances in a given DataLoader. get_verification_scores then iterates over a list of verification trial pairs, computes cosine similarity between enrollment and test embeddings, optionally applies score normalization, and returns separate lists of positive (same-speaker) and negative (different-speaker) scores.

Parameters

get_verification_scores

Parameter Type Description
veri_test list of str Verification trial list. Each string has format "label enrol_id test_id" where label is 1 (same speaker) or 0 (different speaker).
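
For illustration, one trial string in this format might look as follows (the segment IDs are hypothetical VoxCeleb-style paths; trailing ".wav" extensions are stripped during parsing):

# Hypothetical trial line: "label enrol_id test_id"
line = "1 id10270/x6uYqmx31kE/00001.wav id10270/8jEAjG6SegY/00008.wav"
label, enrol_id, test_id = line.split(" ")  # label "1" = same speaker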

The function also depends on module-level variables:

  • enrol_dict: Dictionary of enrollment embeddings (from compute_embedding_loop)
  • test_dict: Dictionary of test embeddings
  • train_dict: Dictionary of training embeddings (for score normalization cohort)
  • params: Hyperparameters dictionary

compute_embedding_loop

Parameter Type Description
data_loader DataLoader A SpeechBrain DataLoader yielding PaddedBatch objects with .id (list of segment IDs) and .sig (waveforms, lengths).

Inputs

  • Verification pairs file: Loaded as a list of strings, one trial per line.
  • Pre-computed embedding dictionaries: Enrollment, test, and (optionally) training embeddings stored in memory.
  • DataLoader objects: For enrollment, test, and (optionally) training data.

Outputs

  • positive_scores (list of float): Cosine similarity scores for target (same-speaker) trials.
  • negative_scores (list of float): Cosine similarity scores for non-target (different-speaker) trials.
  • scores.txt (file): Written to params["output_folder"]/scores.txt with format: enrol_id test_id label score.
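
A minimal sketch of the score-file write, assuming a hypothetical results list collected during scoring (in the recipe the write happens inside the scoring loop itself):

import os

# results: hypothetical list of (enrol_id, test_id, lab_pair, score) tuples
with open(os.path.join(params["output_folder"], "scores.txt"), "w") as save_file:
    for enrol_id, test_id, lab_pair, score in results:
        save_file.write("%s %s %i %f\n" % (enrol_id, test_id, lab_pair, score))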

Implementation Details

Embedding Loop

def compute_embedding_loop(data_loader):
    embedding_dict = {}
    with torch.no_grad():
        for batch in tqdm(data_loader, dynamic_ncols=True):
            batch = batch.to(run_opts["device"])
            seg_ids = batch.id
            wavs, lens = batch.sig

            # Skip if all segments already computed
            found = False
            for seg_id in seg_ids:
                if seg_id not in embedding_dict:
                    found = True
            if not found:
                continue

            wavs, lens = wavs.to(run_opts["device"]), lens.to(run_opts["device"])
            emb = compute_embedding(wavs, lens).unsqueeze(1)
            for i, seg_id in enumerate(seg_ids):
                embedding_dict[seg_id] = emb[i].detach().clone()
    return embedding_dict

Key behaviors:

  • Processes all batches under torch.no_grad() for efficiency.
  • Checks whether all segment IDs in a batch are already computed, skipping redundant computation.
  • Each embedding is detached and cloned so the dictionary stores an independent tensor rather than a view into the full batch output.
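
The loop delegates the actual forward pass to a recipe-level compute_embedding helper. A sketch of what such a helper looks like, assuming the compute_features, mean_var_norm, embedding_model, and mean_var_norm_emb modules from the recipe's YAML (an approximation, not a verbatim copy of the recipe code):

def compute_embedding(wavs, wav_lens):
    """Sketch: turn padded waveforms into speaker embeddings."""
    with torch.no_grad():
        feats = params["compute_features"](wavs)          # e.g. filterbank features
        feats = params["mean_var_norm"](feats, wav_lens)  # feature normalization
        embeddings = params["embedding_model"](feats, wav_lens)
        embeddings = params["mean_var_norm_emb"](
            embeddings, torch.ones(embeddings.shape[0], device=embeddings.device)
        )
    return embeddings.squeeze(1)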

Scoring with Cosine Similarity

similarity = torch.nn.CosineSimilarity(dim=-1, eps=1e-6)
positive_scores, negative_scores = [], []

for i, line in enumerate(veri_test):
    # Each trial line is "label enrol_id test_id"; split(".")[0] drops a
    # trailing file extension such as ".wav" from each field.
    lab_pair = int(line.split(" ")[0].rstrip().split(".")[0].strip())
    enrol_id = line.split(" ")[1].rstrip().split(".")[0].strip()
    test_id = line.split(" ")[2].rstrip().split(".")[0].strip()

    enrol = enrol_dict[enrol_id]
    test = test_dict[test_id]

    score = similarity(enrol, test)[0]

    if lab_pair == 1:
        positive_scores.append(score)
    else:
        negative_scores.append(score)
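
As a quick sanity check of the scoring primitive itself, cosine similarity on two toy embedding tensors:

import torch

similarity = torch.nn.CosineSimilarity(dim=-1, eps=1e-6)
enrol = torch.tensor([[1.0, 0.0, 1.0]])  # toy enrollment embedding
test = torch.tensor([[1.0, 0.5, 1.0]])   # toy test embedding
print(similarity(enrol, test)[0])        # tensor(0.9428); closer to 1 = more similar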

Score Normalization

When "score_norm" is set in params, the function applies normalization using a training cohort:

# Z-norm: normalize by enrollment impostor statistics
# (train_cohort stacks the train_dict embeddings into one tensor)
if params["score_norm"] == "z-norm":
    enrol_rep = enrol.repeat(train_cohort.shape[0], 1, 1)
    score_e_c = similarity(enrol_rep, train_cohort)
    if "cohort_size" in params:
        score_e_c = torch.topk(score_e_c, k=params["cohort_size"], dim=0)[0]
    mean_e_c = torch.mean(score_e_c, dim=0)
    std_e_c = torch.std(score_e_c, dim=0)
    score = (score - mean_e_c) / std_e_c

# T-norm: normalize by test impostor statistics
# (mean_t_c / std_t_c come from test-vs-cohort scores; see the sketch below)
elif params["score_norm"] == "t-norm":
    score = (score - mean_t_c) / std_t_c

# S-norm: symmetric (average of z-norm and t-norm)
elif params["score_norm"] == "s-norm":
    score_e = (score - mean_e_c) / std_e_c
    score_t = (score - mean_t_c) / std_t_c
    score = 0.5 * (score_e + score_t)
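
The t-norm statistics mean_t_c and std_t_c are not shown above; they are computed from test-vs-cohort scores, mirroring the enrollment branch. A sketch under that assumption:

# Sketch (assumption): test-vs-cohort statistics, symmetric to the z-norm branch
test_rep = test.repeat(train_cohort.shape[0], 1, 1)
score_t_c = similarity(test_rep, train_cohort)
if "cohort_size" in params:
    score_t_c = torch.topk(score_t_c, k=params["cohort_size"], dim=0)[0]
mean_t_c = torch.mean(score_t_c, dim=0)
std_t_c = torch.std(score_t_c, dim=0)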

Usage Example

import os
import sys
import torch
import speechbrain as sb
from hyperpyyaml import load_hyperpyyaml
from speechbrain.utils.metric_stats import EER, minDCF

# Load params and pretrained model
params_file, run_opts, overrides = sb.core.parse_arguments(sys.argv[1:])
with open(params_file) as fin:
    params = load_hyperpyyaml(fin, overrides)

params["pretrainer"].collect_files()
params["pretrainer"].load_collected()
params["embedding_model"].eval()

# Create dataloaders (dataio_prep is defined in the recipe script)
train_dataloader, enrol_dataloader, test_dataloader = dataio_prep(params)

# Compute embeddings
enrol_dict = compute_embedding_loop(enrol_dataloader)
test_dict = compute_embedding_loop(test_dataloader)

# Load verification trials (path resolution from params is an assumption)
veri_file_path = os.path.join(
    params["save_folder"], os.path.basename(params["verification_file"])
)
with open(veri_file_path) as f:
    veri_test = [line.rstrip() for line in f]

# Compute scores
positive_scores, negative_scores = get_verification_scores(veri_test)

# Evaluate
eer, th = EER(torch.tensor(positive_scores), torch.tensor(negative_scores))
min_dcf, th = minDCF(torch.tensor(positive_scores), torch.tensor(negative_scores))
print(f"EER: {eer * 100:.2f}%")
print(f"minDCF: {min_dcf:.4f}")

See Also

Related Pages

  • Principle:Speechbrain_Speechbrain_Speaker_Verification_Scoring