Implementation:Intel Ipex llm NPU Speech Paraformer
| Knowledge Sources | |
|---|---|
| Domains | Speech_To_Text, NPU, Audio |
| Last Updated | 2026-02-09 04:00 GMT |
Overview
A concrete tool for speech-to-text transcription on Intel NPU hardware, using the Paraformer-Large ASR model through IPEX-LLM's FunAsr integration.
Description
This script loads a Paraformer-Large speech recognition model through IPEX-LLM's FunAsrAutoModel for NPU-accelerated inference. It accepts an audio file path and prints the text transcription, with configurable quantization (default sym_int8) and a configurable batch size. The FunAsr integration handles audio preprocessing and decoding automatically.
Usage
Use this when performing speech-to-text transcription on Intel NPU hardware. The FunAsrAutoModel provides a simplified interface for ASR models that handles audio preprocessing, feature extraction, and decoding internally.
Code Reference
Source Location
- Repository: Intel IPEX-LLM
- File: python/llm/example/NPU/HF-Transformers-AutoModels/Multimodal/speech_paraformer-large.py
- Lines: 1-60
Signature
# Script-based execution with argparse
# Key API:
from ipex_llm.transformers.npu_model import FunAsrAutoModel as AutoModel

model = AutoModel(
    model=model_path,
    load_in_low_bit=args.low_bit,
)
result = model.generate(
    input=args.input,
    batch_size_s=args.batch_size,
)
Import
from ipex_llm.transformers.npu_model import FunAsrAutoModel as AutoModel
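The two calls in the signature can be wrapped so that model loading and inference are separate steps and the latency is measured explicitly. This is a minimal sketch, not the script's actual code: `load_model` and `timed_generate` are hypothetical helper names, and the import is deferred so the module can be imported on a machine without IPEX-LLM installed. Only the `FunAsrAutoModel(model=..., load_in_low_bit=...)` and `generate(input=..., batch_size_s=...)` calls come from the signature above.

```python
import time


def load_model(model_path, low_bit="sym_int8"):
    """Load Paraformer for NPU inference (hypothetical helper)."""
    # Deferred import: ipex_llm is only required when a model is actually loaded.
    from ipex_llm.transformers.npu_model import FunAsrAutoModel

    return FunAsrAutoModel(model=model_path, load_in_low_bit=low_bit)


def timed_generate(model, audio_path, batch_size_s):
    """Call model.generate() once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = model.generate(input=audio_path, batch_size_s=batch_size_s)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

Because `timed_generate` accepts any object with a compatible `generate` method, the timing logic can be exercised with a stub before running on NPU hardware.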
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| repo-id-or-model-path | str | Yes | Path to Paraformer ASR model |
| input | str | Yes | Audio file path for transcription |
| low-bit | str | No | Quantization type (default: sym_int8) |
| batch-size | int | No | Batch size for processing; passed to `generate()` as `batch_size_s` |
Outputs
| Name | Type | Description |
|---|---|---|
| Transcription | Console | Text transcription of audio input |
| Timing | Console | Inference latency |
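The input table maps naturally onto an `argparse` parser. The sketch below is an assumption about how such a parser could look, not a copy of the script: `build_parser` is a hypothetical name, and `ASSUMED_BATCH_SIZE` is a placeholder because this page does not state the script's default batch size.

```python
import argparse

# Placeholder only: the real default is not stated on this page.
ASSUMED_BATCH_SIZE = 300


def build_parser():
    """Build a parser mirroring the Inputs table (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        description="Paraformer speech-to-text on Intel NPU (sketch)"
    )
    parser.add_argument("--repo-id-or-model-path", type=str, required=True,
                        help="Path to the Paraformer ASR model")
    parser.add_argument("--input", type=str, required=True,
                        help="Audio file path for transcription")
    parser.add_argument("--low-bit", type=str, default="sym_int8",
                        help="Quantization type")
    parser.add_argument("--batch-size", type=int, default=ASSUMED_BATCH_SIZE,
                        help="Forwarded to generate() as batch_size_s")
    return parser
```

`argparse` converts dashes in option names to underscores, so the parsed values are read as `args.repo_id_or_model_path`, `args.low_bit`, and `args.batch_size`, which matches the `args.low_bit` and `args.batch_size` references in the signature.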
Usage Examples
Speech-to-Text on NPU
python speech_paraformer-large.py \
    --repo-id-or-model-path "iic/speech_paraformer-large" \
    --input "./audio/sample.wav" \
    --low-bit "sym_int8"
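Since the script takes one audio path per invocation, transcribing a folder means issuing the command once per file. The following sketch builds those command lines; `build_commands` is a hypothetical helper, and it assumes the script sits in the current working directory under the file name used in the command above.

```python
import sys
from pathlib import Path


def build_commands(audio_dir, model_path, low_bit="sym_int8",
                   script="speech_paraformer-large.py"):
    """Return one argv list per .wav file in audio_dir, sorted by file name."""
    commands = []
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        commands.append([
            sys.executable, script,
            "--repo-id-or-model-path", model_path,
            "--input", str(wav),
            "--low-bit", low_bit,
        ])
    return commands
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Every invocation loads the model again, so for large batches it is likely cheaper to load the model once in Python and call `generate` per file.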