# Implementation:Speechbrain Speechbrain Get Verification Scores
| Property | Value |
|---|---|
| Implementation Name | Get Verification Scores |
| Type | API Doc |
| Repository | speechbrain/speechbrain |
| Source File | recipes/VoxCeleb/SpeakerRec/speaker_verification_cosine.py:L82-155 (scores), L54-79 (embedding loop) |
| Import | Recipe-specific. Uses `torch.nn.CosineSimilarity` |
| Related Principle | Principle:Speechbrain_Speechbrain_Speaker_Verification_Scoring |
## API Signatures

### get_verification_scores

```python
def get_verification_scores(veri_test):
    """Computes positive and negative scores given the verification split.

    Arguments
    ---------
    veri_test : list
        List of verification trial strings, each formatted as
        "label enrol_id test_id".

    Returns
    -------
    positive_scores : list
        Cosine similarity scores for same-speaker trials (label=1).
    negative_scores : list
        Cosine similarity scores for different-speaker trials (label=0).
    """
```

### compute_embedding_loop

```python
def compute_embedding_loop(data_loader):
    """Computes the embeddings of all the waveforms specified in the dataloader.

    Arguments
    ---------
    data_loader : DataLoader
        DataLoader yielding batches with .id and .sig attributes.

    Returns
    -------
    embedding_dict : dict
        Dictionary mapping segment IDs (str) to embedding tensors.
    """
```
## Description

These functions implement the speaker verification scoring pipeline. `compute_embedding_loop` pre-computes embeddings for all utterances in a given DataLoader. `get_verification_scores` then iterates over a list of verification trial pairs, computes the cosine similarity between enrollment and test embeddings, optionally applies score normalization, and returns separate lists of positive (same-speaker) and negative (different-speaker) scores.
## Parameters

### get_verification_scores

| Parameter | Type | Description |
|---|---|---|
| veri_test | list of str | Verification trial list. Each string has the format `"label enrol_id test_id"`, where label is 1 (same speaker) or 0 (different speaker). |

The function also depends on module-level variables:

- `enrol_dict`: Dictionary of enrollment embeddings (from `compute_embedding_loop`)
- `test_dict`: Dictionary of test embeddings
- `train_dict`: Dictionary of training embeddings (for the score-normalization cohort)
- `params`: Hyperparameters dictionary
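The trial-string format above can be illustrated with a minimal parsing sketch. The helper name and the segment IDs below are hypothetical placeholders, not real VoxCeleb paths or recipe code:

```python
# Sketch: parse one verification trial line of the form "label enrol_id test_id".
# parse_trial is a hypothetical helper; IDs are made-up placeholders.
def parse_trial(line):
    label_str, enrol_id, test_id = line.strip().split(" ")
    return int(label_str), enrol_id, test_id

label, enrol_id, test_id = parse_trial("1 spk1/utt_a spk1/utt_b")
print(label, enrol_id, test_id)  # 1 spk1/utt_a spk1/utt_b
```

The recipe itself does the same split per field (see the scoring loop below in this page), additionally stripping file extensions from the IDs.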
### compute_embedding_loop

| Parameter | Type | Description |
|---|---|---|
| data_loader | DataLoader | A SpeechBrain DataLoader yielding PaddedBatch objects with `.id` (list of segment IDs) and `.sig` (waveforms, lengths). |
## Inputs
- Verification pairs file: Loaded as a list of strings, one trial per line.
- Pre-computed embedding dictionaries: Enrollment, test, and (optionally) training embeddings stored in memory.
- DataLoader objects: For enrollment, test, and (optionally) training data.
## Outputs
- positive_scores (list of float): Cosine similarity scores for target (same-speaker) trials.
- negative_scores (list of float): Cosine similarity scores for non-target (different-speaker) trials.
- scores.txt (file): Written to `params["output_folder"]/scores.txt` with format: `enrol_id test_id label score`.
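The scores.txt line format can be sketched with plain string formatting. The trials and score values below are made-up illustrative data, and an in-memory buffer stands in for the real output file:

```python
import io

# Sketch: write trial results in the "enrol_id test_id label score" format.
# Trials and scores are made-up values; io.StringIO stands in for
# open(scores_path, "w") in the real recipe.
trials = [
    ("spk1/utt_a", "spk1/utt_b", 1, 0.83),
    ("spk1/utt_a", "spk2/utt_c", 0, 0.12),
]

buf = io.StringIO()
for enrol_id, test_id, label, score in trials:
    buf.write(f"{enrol_id} {test_id} {label} {score}\n")

print(buf.getvalue(), end="")
```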
## Implementation Details

### Embedding Loop

```python
def compute_embedding_loop(data_loader):
    embedding_dict = {}
    with torch.no_grad():
        for batch in tqdm(data_loader, dynamic_ncols=True):
            batch = batch.to(run_opts["device"])
            seg_ids = batch.id
            wavs, lens = batch.sig
            # Skip if all segments already computed
            found = False
            for seg_id in seg_ids:
                if seg_id not in embedding_dict:
                    found = True
            if not found:
                continue
            wavs, lens = wavs.to(run_opts["device"]), lens.to(run_opts["device"])
            emb = compute_embedding(wavs, lens).unsqueeze(1)
            for i, seg_id in enumerate(seg_ids):
                embedding_dict[seg_id] = emb[i].detach().clone()
    return embedding_dict
```
Key behaviors:

- Processes all batches under `torch.no_grad()` for efficiency.
- Checks whether all segment IDs in a batch are already computed, skipping redundant computation.
- Each embedding is detached and cloned to prevent memory leaks from the computation graph.
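The batch-skip check can be isolated as a small predicate. The helper name below is hypothetical, not part of the recipe, and the cache contents are placeholder strings:

```python
# Sketch of the batch-skip logic: a batch needs recomputation only if at
# least one of its segment IDs is missing from the cache.
# needs_computation is a hypothetical helper name.
def needs_computation(seg_ids, embedding_dict):
    return any(seg_id not in embedding_dict for seg_id in seg_ids)

cache = {"seg1": "emb1", "seg2": "emb2"}
print(needs_computation(["seg1", "seg2"], cache))  # False (fully cached)
print(needs_computation(["seg2", "seg3"], cache))  # True (seg3 missing)
```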
### Scoring with Cosine Similarity

```python
similarity = torch.nn.CosineSimilarity(dim=-1, eps=1e-6)
for i, line in enumerate(veri_test):
    lab_pair = int(line.split(" ")[0].rstrip().split(".")[0].strip())
    enrol_id = line.split(" ")[1].rstrip().split(".")[0].strip()
    test_id = line.split(" ")[2].rstrip().split(".")[0].strip()
    enrol = enrol_dict[enrol_id]
    test = test_dict[test_id]
    score = similarity(enrol, test)[0]
    if lab_pair == 1:
        positive_scores.append(score)
    else:
        negative_scores.append(score)
```
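Per trial, `torch.nn.CosineSimilarity` computes a normalized dot product. A dependency-free sketch of that math (with made-up embedding vectors) shows why same-direction embeddings score near 1 and near-orthogonal ones score near 0:

```python
import math

# Plain-Python sketch of the cosine score used per trial.
# (The recipe applies torch.nn.CosineSimilarity(dim=-1, eps=1e-6) to tensors;
# the vectors below are made-up stand-ins for speaker embeddings.)
def cosine(u, v, eps=1e-6):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)

enrol = [0.2, 0.9, 0.1]
test_same = [0.4, 1.8, 0.2]   # same direction -> score 1.0
test_diff = [0.9, -0.2, 0.0]  # orthogonal -> score 0.0
print(round(cosine(enrol, test_same), 4))  # 1.0
print(round(cosine(enrol, test_diff), 4))  # 0.0
```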
### Score Normalization

When `"score_norm"` is set in `params`, the function applies normalization using a training cohort:

```python
# Z-norm: normalize by enrollment impostor statistics
if params["score_norm"] == "z-norm":
    enrol_rep = enrol.repeat(train_cohort.shape[0], 1, 1)
    score_e_c = similarity(enrol_rep, train_cohort)
    if "cohort_size" in params:
        score_e_c = torch.topk(score_e_c, k=params["cohort_size"], dim=0)[0]
    mean_e_c = torch.mean(score_e_c, dim=0)
    std_e_c = torch.std(score_e_c, dim=0)
    score = (score - mean_e_c) / std_e_c
# T-norm: normalize by test impostor statistics
elif params["score_norm"] == "t-norm":
    score = (score - mean_t_c) / std_t_c
# S-norm: symmetric (average of z-norm and t-norm)
elif params["score_norm"] == "s-norm":
    score_e = (score - mean_e_c) / std_e_c
    score_t = (score - mean_t_c) / std_t_c
    score = 0.5 * (score_e + score_t)
```
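The normalization arithmetic can be checked with plain numbers. The raw score and the impostor-cohort scores below are made up; `statistics.stdev` matches the unbiased (n-1) estimator that `torch.std` uses by default:

```python
import statistics

# Sketch of z-/t-/s-norm arithmetic on one raw trial score, using made-up
# impostor cohort scores for the enrollment and test sides.
raw = 0.6
enrol_cohort = [0.1, 0.2, 0.3]  # similarity(enrol, cohort) scores
test_cohort = [0.0, 0.2, 0.4]   # similarity(test, cohort) scores

mean_e_c, std_e_c = statistics.mean(enrol_cohort), statistics.stdev(enrol_cohort)
mean_t_c, std_t_c = statistics.mean(test_cohort), statistics.stdev(test_cohort)

z_norm = (raw - mean_e_c) / std_e_c        # (0.6 - 0.2) / 0.1 = 4.0
t_norm = (raw - mean_t_c) / std_t_c        # (0.6 - 0.2) / 0.2 = 2.0
s_norm = 0.5 * (z_norm + t_norm)           # 3.0
print(z_norm, t_norm, s_norm)
```

A wider, well-matched cohort makes these statistics more stable, which is why the recipe optionally restricts the cohort to the top-k most similar impostors via `torch.topk`.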
## Usage Example

```python
import os
import sys

import torch
import speechbrain as sb
from hyperpyyaml import load_hyperpyyaml
from speechbrain.utils.metric_stats import EER, minDCF

# Load params and pretrained model
params_file, run_opts, overrides = sb.core.parse_arguments(sys.argv[1:])
with open(params_file) as fin:
    params = load_hyperpyyaml(fin, overrides)
params["pretrainer"].collect_files()
params["pretrainer"].load_collected()
params["embedding_model"].eval()

# Create dataloaders
train_dataloader, enrol_dataloader, test_dataloader = dataio_prep(params)

# Compute embeddings
enrol_dict = compute_embedding_loop(enrol_dataloader)
test_dict = compute_embedding_loop(test_dataloader)

# Load verification trials
with open(veri_file_path) as f:
    veri_test = [line.rstrip() for line in f]

# Compute scores
positive_scores, negative_scores = get_verification_scores(veri_test)

# Evaluate
eer, th = EER(torch.tensor(positive_scores), torch.tensor(negative_scores))
min_dcf, th = minDCF(torch.tensor(positive_scores), torch.tensor(negative_scores))
print(f"EER: {eer * 100:.2f}%")
print(f"minDCF: {min_dcf:.4f}")
```
## See Also
- Principle:Speechbrain_Speechbrain_Speaker_Verification_Scoring
- Implementation:Speechbrain_Speechbrain_Compute_Embeddings
- Implementation:Speechbrain_Speechbrain_EER_And_MinDCF