Implementation:Speechbrain Speechbrain Prepare Fisher Callhome
| Knowledge Sources | |
|---|---|
| Domains | Speech_Translation, Data_Preparation |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool, provided by the SpeechBrain library, for preparing the Fisher-Callhome-Spanish dataset for speech translation.
Description
This script prepares JSON manifest files for the Fisher-Callhome-Spanish dataset, which contains Spanish conversational telephone speech with English translations. It processes LDC-distributed speech and transcription data, resamples audio to 16 kHz, segments utterances by channel and timestamp, and normalizes transcripts using Moses punctuation normalization and tokenization. The output is suitable for training Spanish-to-English speech translation systems.
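The "segments utterances by channel and timestamp" step can be illustrated with a minimal sketch. This is not the script's actual code: `time_to_sample` and `cut_utterance` are hypothetical helper names, and the audio is modelled as plain Python lists (one list per telephone channel) so the example runs without any audio library.

```python
from typing import List, Sequence


def time_to_sample(seconds: float, sample_rate: int) -> int:
    """Convert a timestamp in seconds to the nearest sample index."""
    return int(round(seconds * sample_rate))


def cut_utterance(
    channels: Sequence[Sequence[float]],
    channel: int,
    start_s: float,
    end_s: float,
    sample_rate: int = 16000,
) -> List[float]:
    """Return the samples of one channel between start_s and end_s.

    Hypothetical helper: `channels` holds one sample sequence per
    telephone channel, and the timestamps come from the transcript.
    """
    if not 0 <= channel < len(channels):
        raise ValueError(f"channel {channel} is out of range")
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    start = time_to_sample(start_s, sample_rate)
    # Clamp the end index so a timestamp past the recording is tolerated.
    end = min(time_to_sample(end_s, sample_rate), len(channels[channel]))
    return list(channels[channel][start:end])


# Toy two-channel "recording" at 10 Hz, just to keep the numbers readable.
toy_channels = [list(range(0, 30)), list(range(100, 130))]
segment = cut_utterance(toy_channels, channel=1, start_s=0.5, end_s=1.2, sample_rate=10)
```

With real data the same indexing would presumably be applied to a resampled waveform tensor rather than to lists.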
Usage
Use this when preparing the Fisher-Callhome-Spanish dataset for speech translation training with SpeechBrain recipes.
Code Reference
Source Location
- Repository: SpeechBrain
- File: recipes/Fisher-Callhome-Spanish/fisher_callhome_prepare.py
Signature
def prepare_fisher_callhome_spanish(
    data_folder: str, save_folder: str, device: str = "cpu"
):
Import
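# The recipe folder is named Fisher-Callhome-Spanish (with hyphens), so it cannot be
# imported as a dotted package. Run from that folder, or add it to sys.path first.
from fisher_callhome_prepare import prepare_fisher_callhome_spanish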
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| data_folder | str | Yes | Path to the folder where the Fisher-Callhome-Spanish dataset is stored |
| save_folder | str | Yes | Path where train/valid/test specification files will be saved |
| device | str | No | Device for computation, e.g. "cpu" or "cuda" (default: "cpu") |
Outputs
| Name | Type | Description |
|---|---|---|
| train.json | JSON | Training split manifest with audio paths, Spanish transcripts, and English translations |
| valid.json | JSON | Validation split manifest |
| test.json | JSON | Test split manifest |
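A generated manifest can be inspected with the standard `json` module. The sketch below assumes the usual SpeechBrain layout, a top-level JSON object keyed by utterance ID. It deliberately assumes no particular field names inside each entry, since those are not listed above. `parse_manifest` and `load_manifest` are hypothetical helper names, and the sample entry is invented for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict


def parse_manifest(text: str) -> Dict[str, Dict[str, Any]]:
    """Parse manifest text and check the assumed top-level layout."""
    manifest = json.loads(text)
    # Assumption: the manifest is an object mapping utterance ID -> entry.
    if not isinstance(manifest, dict):
        raise ValueError("expected a JSON object keyed by utterance ID")
    return manifest


def load_manifest(path: str) -> Dict[str, Dict[str, Any]]:
    """Read a manifest file such as train.json from disk."""
    return parse_manifest(Path(path).read_text(encoding="utf-8"))


# Invented entry, only to show the shape; real keys may differ.
sample_text = '{"utt-0001": {"example_field": "example value"}}'
sample = parse_manifest(sample_text)
utterance_ids = sorted(sample)
```

Printing the keys of one entry from a real `train.json` is a quick way to confirm which fields the recipe actually writes.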
Usage Examples
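# Run from recipes/Fisher-Callhome-Spanish/ (or add that folder to sys.path);
# the hyphenated folder name cannot be used in a dotted import.
from fisher_callhome_prepare import prepare_fisher_callhome_spanish

prepare_fisher_callhome_spanish(
    data_folder="/path/to/fisher-callhome",
    save_folder="/path/to/output",
)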
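After preparation finishes, it can be useful to confirm that the manifests listed under Outputs were written. The following is a minimal sketch using only the standard library. `missing_manifests` is a hypothetical helper, and it assumes the three files sit directly inside `save_folder`, which may not match the script's actual directory layout.

```python
from pathlib import Path
from typing import List, Sequence

# File names taken from the Outputs table; their location is an assumption.
EXPECTED_MANIFESTS = ("train.json", "valid.json", "test.json")


def missing_manifests(
    save_folder: str, expected: Sequence[str] = EXPECTED_MANIFESTS
) -> List[str]:
    """Return the expected manifest names that are absent from save_folder."""
    folder = Path(save_folder)
    return [name for name in expected if not (folder / name).is_file()]


# A folder that does not exist reports every manifest as missing.
report = missing_manifests("/path/that/does/not/exist/fisher-callhome-output")
```

An empty list means all three files were found; a non-empty one names the splits to investigate.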