Principle:Huggingface Optimum Accelerated Pipeline Configuration
Overview
Configuration interface for creating hardware-accelerated inference pipelines with automatic backend selection and parameter routing.
Description
The accelerated pipeline extends HuggingFace's transformers.pipeline() API with an additional accelerator parameter that routes inference to optimized backends. The configuration accepts all standard pipeline parameters (task, model, tokenizer, device, etc.) plus the accelerator choice.
When the accelerator parameter is not specified, the system auto-detects the best available backend in priority order:
- OpenVINO ("ov") -- if optimum-intel[openvino] is installed
- ONNX Runtime ("ort") -- if optimum-onnx[onnxruntime] is installed
- IPEX ("ipex") -- if optimum-intel[ipex] is installed
If none of these backends are available, an ImportError is raised with installation instructions.
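The detection step can be sketched as follows. This is an illustrative reconstruction, not Optimum's actual implementation: the module names checked and the `detect_accelerator` helper are assumptions, and the injectable `is_available` callable exists only to make the sketch testable.

```python
import importlib.util

def detect_accelerator(
    is_available=lambda module: importlib.util.find_spec(module) is not None,
) -> str:
    """Return the first available backend in documented priority order.

    The (accelerator, module) pairs below are assumptions for illustration;
    the real library uses detection functions in optimum.utils.import_utils.
    """
    candidates = [
        ("ov", "openvino"),
        ("ort", "onnxruntime"),
        ("ipex", "intel_extension_for_pytorch"),
    ]
    for name, module in candidates:
        if is_available(module):
            return name
    # Mirrors the documented behavior: fail with installation instructions.
    raise ImportError(
        "No accelerated backend found. Install one of: "
        "optimum-intel[openvino], optimum-onnx[onnxruntime], optimum-intel[ipex]"
    )
```

Passing a custom `is_available` predicate makes the priority order easy to verify without installing any backend.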
Supported Accelerator Values
| Value | Backend | Required Package |
|---|---|---|
| "ort" | ONNX Runtime | optimum-onnx[onnxruntime] |
| "ov" | OpenVINO | optimum-intel[openvino] |
| "ipex" | Intel Extension for PyTorch | optimum-intel[ipex] |
| None | Auto-detect (priority order above) | At least one of the above |
Usage
Use when creating inference pipelines that should leverage hardware acceleration.
```python
from optimum.pipelines import pipeline

# Explicit accelerator selection
pipe = pipeline("text-classification", model="distilbert-base-uncased", accelerator="ort")

# Auto-detect best available backend
pipe = pipeline("sentiment-analysis", model="distilbert-base-uncased")

# With additional standard parameters
pipe = pipeline(
    "question-answering",
    model="distilbert/distilbert-base-cased-distilled-squad",
    tokenizer="google-bert/bert-base-cased",
    accelerator="ipex",
    device="cpu",
)
```
Theoretical Basis
Decorator/facade pattern over transformers.Pipeline. The pipeline() function acts as a router that:
- Validates the accelerator choice (must be "ort", "ov", "ipex", or None)
- Checks backend availability using the detection functions from optimum.utils.import_utils
- Delegates to the appropriate backend-specific pipeline constructor
The function preserves full API compatibility with transformers.pipeline(), meaning existing pipeline code can be migrated to accelerated inference by simply changing the import and adding the accelerator parameter.
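The validate-then-delegate flow above can be sketched as a small router. This is a hedged sketch, not the optimum.pipelines source: `route_pipeline` and its injectable `backends` registry are hypothetical stand-ins for the library's internal dispatch to backend-specific pipeline classes.

```python
SUPPORTED_ACCELERATORS = ("ort", "ov", "ipex")

def route_pipeline(task, accelerator=None, backends=None, **kwargs):
    """Validate the accelerator choice and delegate to a backend constructor.

    `backends` maps accelerator names to constructor callables; in the real
    library this registry is internal and availability comes from
    optimum.utils.import_utils.
    """
    backends = backends or {}
    if accelerator is None:
        # Auto-detect: first installed backend in documented priority order.
        for name in ("ov", "ort", "ipex"):
            if name in backends:
                accelerator = name
                break
        else:
            raise ImportError("No accelerated backend is installed.")
    elif accelerator not in SUPPORTED_ACCELERATORS:
        raise ValueError(
            f"accelerator must be one of {SUPPORTED_ACCELERATORS} or None, "
            f"got {accelerator!r}"
        )
    if accelerator not in backends:
        raise ImportError(f"Backend for {accelerator!r} is not installed.")
    # Delegate: all standard pipeline parameters pass through unchanged,
    # which is what preserves transformers.pipeline() compatibility.
    return backends[accelerator](task=task, **kwargs)
```

Because every standard parameter is forwarded via `**kwargs`, the facade stays compatible with any signature change in the underlying backend constructors.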
Parameter Categories
| Category | Parameters | Description |
|---|---|---|
| Task | task | Defines the pipeline type (e.g., "text-classification", "question-answering") |
| Model | model, config, revision | Model identifier and configuration |
| Preprocessing | tokenizer, feature_extractor, image_processor, processor | Input preprocessing components |
| Runtime | device, device_map, torch_dtype, framework | Execution environment settings |
| Security | token, trust_remote_code | Authentication and code trust settings |
| Acceleration | accelerator | Optimum-specific: selects the hardware acceleration backend |
Related
- implemented_by → Implementation:Huggingface_Optimum_Pipeline_Factory