Principle:Huggingface Optimum Accelerated Pipeline Configuration
Overview
Configuration interface for creating hardware-accelerated inference pipelines with automatic backend selection and parameter routing.
Description
The accelerated pipeline extends HuggingFace's transformers.pipeline() API with an additional accelerator parameter that routes inference to optimized backends. The configuration accepts all standard pipeline parameters (task, model, tokenizer, device, etc.) plus the accelerator choice.
When the accelerator parameter is not specified, the system auto-detects the best available backend in priority order:
- OpenVINO ("ov") -- if optimum-intel[openvino] is installed
- ONNX Runtime ("ort") -- if optimum-onnx[onnxruntime] is installed
- IPEX ("ipex") -- if optimum-intel[ipex] is installed
If none of these backends are available, an ImportError is raised with installation instructions.
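The detection step can be sketched as follows. This is an illustrative reconstruction, not Optimum's actual implementation: the module names checked and the `detect_accelerator` helper are assumptions, and the injectable `is_available` callable exists only to make the sketch testable.

```python
import importlib.util

def detect_accelerator(
    is_available=lambda module: importlib.util.find_spec(module) is not None,
) -> str:
    """Return the first available backend in documented priority order.

    The (accelerator, module) pairs below are assumptions for illustration;
    the real library uses detection functions in optimum.utils.import_utils.
    """
    candidates = [
        ("ov", "openvino"),
        ("ort", "onnxruntime"),
        ("ipex", "intel_extension_for_pytorch"),
    ]
    for name, module in candidates:
        if is_available(module):
            return name
    # Mirrors the documented behavior: fail with installation instructions.
    raise ImportError(
        "No accelerated backend found. Install one of: "
        "optimum-intel[openvino], optimum-onnx[onnxruntime], optimum-intel[ipex]"
    )
```

Passing a custom `is_available` predicate makes the priority order easy to verify without installing any backend.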
Supported Accelerator Values
| Value | Backend | Required Package |
|---|---|---|
| "ort" | ONNX Runtime | optimum-onnx[onnxruntime] |
| "ov" | OpenVINO | optimum-intel[openvino] |
| "ipex" | Intel Extension for PyTorch | optimum-intel[ipex] |
| None | Auto-detect (priority order above) | At least one of the above |
Usage
Use when creating inference pipelines that should leverage hardware acceleration.
```python
from optimum.pipelines import pipeline

# Explicit accelerator selection
pipe = pipeline("text-classification", model="distilbert-base-uncased", accelerator="ort")

# Auto-detect best available backend
pipe = pipeline("sentiment-analysis", model="distilbert-base-uncased")

# With additional standard parameters
pipe = pipeline(
    "question-answering",
    model="distilbert/distilbert-base-cased-distilled-squad",
    tokenizer="google-bert/bert-base-cased",
    accelerator="ipex",
    device="cpu",
)
```
Theoretical Basis
Decorator/facade pattern over transformers.Pipeline. The pipeline() function acts as a router that:
- Validates the accelerator choice (must be "ort", "ov", "ipex", or None)
- Checks backend availability using the detection functions from optimum.utils.import_utils
- Delegates to the appropriate backend-specific pipeline constructor
The function preserves full API compatibility with transformers.pipeline(), meaning existing pipeline code can be migrated to accelerated inference by simply changing the import and adding the accelerator parameter.
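The validate-then-delegate flow above can be sketched as a small router. This is a hedged sketch, not the optimum.pipelines source: `route_pipeline` and its injectable `backends` registry are hypothetical stand-ins for the library's internal dispatch to backend-specific pipeline classes.

```python
SUPPORTED_ACCELERATORS = ("ort", "ov", "ipex")

def route_pipeline(task, accelerator=None, backends=None, **kwargs):
    """Validate the accelerator choice and delegate to a backend constructor.

    `backends` maps accelerator names to constructor callables; in the real
    library this registry is internal and availability comes from
    optimum.utils.import_utils.
    """
    backends = backends or {}
    if accelerator is None:
        # Auto-detect: first installed backend in documented priority order.
        for name in ("ov", "ort", "ipex"):
            if name in backends:
                accelerator = name
                break
        else:
            raise ImportError("No accelerated backend is installed.")
    elif accelerator not in SUPPORTED_ACCELERATORS:
        raise ValueError(
            f"accelerator must be one of {SUPPORTED_ACCELERATORS} or None, "
            f"got {accelerator!r}"
        )
    if accelerator not in backends:
        raise ImportError(f"Backend for {accelerator!r} is not installed.")
    # Delegate: all standard pipeline parameters pass through unchanged,
    # which is what preserves transformers.pipeline() compatibility.
    return backends[accelerator](task=task, **kwargs)
```

Because every standard parameter is forwarded via `**kwargs`, the facade stays compatible with any signature change in the underlying backend constructors.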
Parameter Categories
| Category | Parameters | Description |
|---|---|---|
| Task | task | Defines the pipeline type (e.g., "text-classification", "question-answering") |
| Model | model, config, revision | Model identifier and configuration |
| Preprocessing | tokenizer, feature_extractor, image_processor, processor | Input preprocessing components |
| Runtime | device, device_map, torch_dtype, framework | Execution environment settings |
| Security | token, trust_remote_code | Authentication and code trust settings |
| Acceleration | accelerator | Optimum-specific: selects the hardware acceleration backend |
Related
- implemented_by → Implementation:Huggingface_Optimum_Pipeline_Factory