Principle:Huggingface Optimum Accelerated Pipeline Configuration

From Leeroopedia

Overview

Configuration interface for creating hardware-accelerated inference pipelines with automatic backend selection and parameter routing.

Description

The accelerated pipeline extends HuggingFace's transformers.pipeline() API with an additional accelerator parameter that routes inference to optimized backends. The configuration accepts all standard pipeline parameters (task, model, tokenizer, device, etc.) plus the accelerator choice.

When the accelerator parameter is not specified, the system auto-detects the best available backend in priority order:

  1. OpenVINO ("ov") -- if optimum-intel[openvino] is installed
  2. ONNX Runtime ("ort") -- if optimum-onnx[onnxruntime] is installed
  3. IPEX ("ipex") -- if optimum-intel[ipex] is installed

If none of these backends are available, an ImportError is raised with installation instructions.

Supported Accelerator Values

Value   Backend                              Required Package
"ort"   ONNX Runtime                         optimum-onnx[onnxruntime]
"ov"    OpenVINO                             optimum-intel[openvino]
"ipex"  Intel Extension for PyTorch          optimum-intel[ipex]
None    Auto-detect (priority order above)   At least one of the above

Usage

Use when creating inference pipelines that should leverage hardware acceleration.

from optimum.pipelines import pipeline

# Explicit accelerator selection
pipe = pipeline("text-classification", model="distilbert-base-uncased", accelerator="ort")

# Auto-detect best available backend
pipe = pipeline("sentiment-analysis", model="distilbert-base-uncased")

# With additional standard parameters
pipe = pipeline(
    "question-answering",
    model="distilbert/distilbert-base-cased-distilled-squad",
    tokenizer="google-bert/bert-base-cased",
    accelerator="ipex",
    device="cpu",
)

Theoretical Basis

The accelerated pipeline applies a decorator/facade pattern over transformers.Pipeline: the pipeline() function acts as a router that:

  1. Validates the accelerator choice (must be "ort", "ov", "ipex", or None)
  2. Checks backend availability using the detection functions from optimum.utils.import_utils
  3. Delegates to the appropriate backend-specific pipeline constructor
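Step 1 can be sketched as a simple guard. The constant and function names below are illustrative assumptions, not the library's actual identifiers:

```python
# Illustrative validation of the accelerator argument (names are assumptions).
SUPPORTED_ACCELERATORS = ("ort", "ov", "ipex")


def validate_accelerator(accelerator):
    """Accept a supported backend string or None (auto-detect); reject anything else."""
    if accelerator is not None and accelerator not in SUPPORTED_ACCELERATORS:
        raise ValueError(
            f"Unknown accelerator {accelerator!r}; "
            f"expected one of {SUPPORTED_ACCELERATORS} or None."
        )
    return accelerator
```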

The function preserves full API compatibility with transformers.pipeline(), meaning existing pipeline code can be migrated to accelerated inference by simply changing the import and adding the accelerator parameter.

Parameter Categories

Category       Parameters                                                Description
Task           task                                                      Defines the pipeline type (e.g., "text-classification", "question-answering")
Model          model, config, revision                                   Model identifier and configuration
Preprocessing  tokenizer, feature_extractor, image_processor, processor  Input preprocessing components
Runtime        device, device_map, torch_dtype, framework                Execution environment settings
Security       token, trust_remote_code                                  Authentication and code trust settings
Acceleration   accelerator                                               Optimum-specific: selects the hardware acceleration backend
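For illustration, the grouping above can be expressed as a lookup table. The dictionary below is a sketch of the table's contents, not an object exposed by Optimum:

```python
# Parameter categories from the table above, expressed as a lookup (illustrative only).
PARAM_CATEGORIES = {
    "task": ["task"],
    "model": ["model", "config", "revision"],
    "preprocessing": ["tokenizer", "feature_extractor", "image_processor", "processor"],
    "runtime": ["device", "device_map", "torch_dtype", "framework"],
    "security": ["token", "trust_remote_code"],
    "acceleration": ["accelerator"],
}


def category_of(param):
    """Return the category a pipeline parameter belongs to, or None if unlisted."""
    for category, names in PARAM_CATEGORIES.items():
        if param in names:
            return category
    return None
```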
