Implementation:Facebookresearch Audiocraft ViSQOL Wrapper
| Knowledge Sources | |
|---|---|
| Domains | Audio_Metrics, Speech_Quality |
| Last Updated | 2026-02-14 01:00 GMT |
Overview
Concrete tool for computing ViSQOL (Virtual Speech Quality Objective Listener) perceptual quality scores by wrapping Google's ViSQOL binary.
Description
The ViSQOL class is a Python wrapper around the external ViSQOL binary. It resamples the input audio to the sample rate the selected mode requires (48 kHz in audio mode, 16 kHz in speech mode), manages temporary files, runs the binary as a subprocess, and parses the results. Both audio quality and speech quality assessment modes are supported.
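The sample-rate selection and result-parsing steps can be sketched roughly as follows. This is an illustrative sketch, not the code in audiocraft/metrics/visqol.py: the names `TARGET_SR`, `target_sample_rate`, and `mean_moslqo` are hypothetical, and the layout of the results file (a CSV with one row per pair and a `moslqo` column) is an assumption.

```python
import csv
import io
import typing as tp

# Sample rates taken from the description above; the mapping name is hypothetical.
TARGET_SR = {"audio": 48_000, "speech": 16_000}


def target_sample_rate(mode: str) -> int:
    """Return the sample rate the inputs are resampled to for a given mode."""
    if mode not in TARGET_SR:
        raise ValueError(f"Unknown ViSQOL mode: {mode!r}")
    return TARGET_SR[mode]


def mean_moslqo(results_csv: tp.TextIO, column: str = "moslqo") -> float:
    """Average the per-pair scores found in a results CSV.

    Assumption: one row per reference/degraded pair, with the score
    stored in the column named by `column`.
    """
    scores = [float(row[column]) for row in csv.DictReader(results_csv)]
    if not scores:
        raise ValueError("results CSV contains no scores")
    return sum(scores) / len(scores)


# An in-memory CSV stands in for the file the binary would write.
example = io.StringIO(
    "reference,degraded,moslqo\n"
    "ref0.wav,deg0.wav,4.0\n"
    "ref1.wav,deg1.wav,3.0\n"
)
print(target_sample_rate("speech"), mean_moslqo(example))
```

Averaging the per-pair scores is what lets a whole batch be summarized as a single float.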
Usage
Import this class when computing perceptual audio quality metrics for compression model evaluation. The ViSQOL binary must be installed separately; its location is passed through the bin argument.
Code Reference
Source Location
- Repository: Facebookresearch_Audiocraft
- File: audiocraft/metrics/visqol.py
- Lines: 1-216
Signature
class ViSQOL:
    def __init__(self, bin: tp.Union[Path, str], mode: str = "audio",
                 model: str = "libsvm_nu_svr_model.txt", debug: bool = False): ...
    def __call__(self, ref_sig: torch.Tensor, deg_sig: torch.Tensor, sr: int,
                 pad_with_silence: bool = False) -> float: ...
Import
from audiocraft.metrics.visqol import ViSQOL
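A minimal usage sketch, assuming audiocraft is installed and the ViSQOL binary has already been built. The helper name `score_batch` is hypothetical. The constructor and call arguments are the ones listed in the signature above. What exactly `bin` must point to should be checked against the source file.

```python
import typing as tp
from pathlib import Path


def score_batch(visqol_bin: tp.Union[Path, str], ref_sig, deg_sig, sr: int,
                mode: str = "audio") -> float:
    """Score a batch of degraded signals against their references.

    ref_sig and deg_sig are torch.Tensor batches shaped [B, C, T],
    both sampled at `sr` Hz.
    """
    # Imported lazily so this helper can be defined without audiocraft installed.
    from audiocraft.metrics.visqol import ViSQOL

    metric = ViSQOL(bin=visqol_bin, mode=mode)
    return metric(ref_sig, deg_sig, sr)
```

A call might look like `score_batch("/path/to/visqol", ref, deg, sr=32000)`, where `ref` and `deg` are tensors of identical shape such as `[4, 1, 32000]`. The path here is a placeholder.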
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| ref_sig | torch.Tensor | Yes | Reference audio [B, C, T] |
| deg_sig | torch.Tensor | Yes | Degraded/generated audio [B, C, T] |
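| sr | int | Yes | Sample rate of input audio |
| pad_with_silence | bool | No | Whether to pad the signals with silence before scoring (default: False) |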
Outputs
| Name | Type | Description |
|---|---|---|
| score | float | Average MOS-LQO (Mean Opinion Score, Listening Quality Objective) quality score |