
Environment:Huggingface Optimum Accelerated Inference Environment

From Leeroopedia
Knowledge Sources
Domains: Inference, Infrastructure
Last Updated: 2026-02-15 00:00 GMT

Overview

Accelerated inference environment requiring at least one backend: ONNX Runtime (`optimum-onnx`), OpenVINO (`optimum-intel[openvino]`), or Intel IPEX (`optimum-intel[ipex]`).

Description

This environment provides the backend infrastructure for accelerated inference pipelines in Optimum. The pipeline factory auto-detects which inference backend is available and dispatches accordingly. At least one of three backends must be installed: ONNX Runtime (via `optimum-onnx`), OpenVINO (via `optimum-intel`), or Intel IPEX (via `optimum-intel`). When no explicit accelerator is specified, backend selection follows the priority order OpenVINO > ONNX Runtime > IPEX.

Usage

Use this environment when running the Accelerated Inference Pipeline workflow. The `optimum.pipelines.pipeline()` function requires at least one backend to be available. If no `accelerator` parameter is specified, the system auto-selects based on what is installed.
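The auto-selection rule described above can be mirrored as a small pure function over the set of importable backend packages (an illustrative sketch, not Optimum's actual code; the names follow the dependency lists below):

```python
def select_accelerator(installed: set) -> str:
    """Pick a backend in Optimum's priority order: OpenVINO > ONNX Runtime > IPEX.

    `installed` is the set of importable backend packages (illustrative sketch).
    """
    if {"optimum.intel", "openvino"} <= installed:
        return "ov"
    if {"optimum.onnx", "onnxruntime"} <= installed:
        return "ort"
    if {"optimum.intel", "intel_extension_for_pytorch"} <= installed:
        return "ipex"
    raise ImportError(
        "Install optimum-onnx[onnxruntime], optimum-intel[openvino], "
        "or optimum-intel[ipex]."
    )
```

For example, with both OpenVINO and ONNX Runtime present, `"ov"` wins because it sits first in the priority chain.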

System Requirements

  • OS: Linux, macOS, or Windows (depends on the chosen backend)
  • Hardware: CPU minimum; GPU optional (install an ONNX Runtime GPU variant for GPU acceleration)
  • Disk: sufficient space for the exported model (ONNX/OpenVINO models are stored on disk)

Dependencies

Backend Option 1: ONNX Runtime

  • `optimum-onnx` (optimum ONNX subpackage)
  • `onnxruntime` (or one of 16 distribution variants: onnxruntime-gpu, onnxruntime-rocm, onnxruntime-openvino, etc.)

Backend Option 2: OpenVINO

  • `optimum-intel` >= 1.23.0 with `[openvino]` extra
  • `openvino` (Intel OpenVINO toolkit)

Backend Option 3: Intel IPEX

  • `optimum-intel` >= 1.23.0 with `[ipex]` extra
  • `intel_extension_for_pytorch` (IPEX)

Core Dependencies (all backends)

  • `torch` >= 2.1.0
  • `transformers` >= 4.36.0

Credentials

  • `HF_TOKEN`: Hugging Face access token, required when loading models from gated repositories.
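For gated repositories, the token is typically read from the environment and passed through to the model loader as `token=...` (a minimal sketch; the helper name is ours):

```python
import os
from typing import Optional

def get_hf_token() -> Optional[str]:
    """Return the Hugging Face token from the environment, if set.

    Pass the result as `token=...` when loading gated models.
    """
    return os.environ.get("HF_TOKEN")
```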

Quick Install

# Option 1: ONNX Runtime backend
pip install "optimum[onnxruntime]"

# Option 1b: ONNX Runtime with GPU support
pip install "optimum[onnxruntime-gpu]"

# Option 2: OpenVINO backend
pip install "optimum[openvino]"

# Option 3: Intel IPEX backend
pip install "optimum[ipex]"

# Quotes keep shells like zsh from expanding the bracketed extras.
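After installing, you can check which backend modules are importable with the standard library alone (the function name is illustrative):

```python
import importlib.util

def available_backends() -> dict:
    """Map each backend's import name to whether the module can be found."""
    modules = ("onnxruntime", "openvino", "intel_extension_for_pytorch")
    return {m: importlib.util.find_spec(m) is not None for m in modules}
```

Any entry that reads `False` here would make the corresponding accelerator unavailable to the pipeline factory.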

Code Evidence

Backend auto-detection priority from `optimum/pipelines/__init__.py:216-238`:

if accelerator is None:
    if is_optimum_intel_available() and is_openvino_available():
        logger.info("`accelerator` not specified. Using OpenVINO (`ov`)...")
        accelerator = "ov"
    elif is_optimum_onnx_available() and is_onnxruntime_available():
        logger.info("`accelerator` not specified. Using ONNX Runtime (`ort`)...")
        accelerator = "ort"
    elif is_optimum_intel_available() and is_ipex_available():
        logger.info("`accelerator` not specified. Using IPEX (`ipex`)...")
        accelerator = "ipex"
    else:
        raise ImportError(
            "You need to install either `optimum-onnx[onnxruntime]` to use ONNX Runtime, "
            "or `optimum-intel[openvino]` to use OpenVINO, "
            "or `optimum-intel[ipex]` to use IPEX."
        )

Backend availability checks from `optimum/utils/import_utils.py:82-119`:

_onnxruntime_available = _is_package_available(
    "onnxruntime",
    pkg_distributions=[
        "onnxruntime-gpu",
        "onnxruntime-rocm",
        "onnxruntime-training",
        "onnxruntime-openvino",
        "onnxruntime-vitisai",
        "onnxruntime-armnn",
        "onnxruntime-cann",
        # ... 16 variants total
    ],
)
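The distribution-aware check can be approximated with `importlib` alone (a simplified sketch of the idea, not the actual `_is_package_available` implementation): the module must be importable, and at least one matching distribution must be installed under the module's own name or one of the listed variants.

```python
import importlib.util
from importlib.metadata import PackageNotFoundError, version

def is_package_available(module: str, pkg_distributions=()) -> bool:
    """True if `module` imports and any matching distribution is installed."""
    if importlib.util.find_spec(module) is None:
        return False
    for dist in (module, *pkg_distributions):
        try:
            version(dist)  # raises if this distribution is not installed
            return True
        except PackageNotFoundError:
            continue
    return False
```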

Backend dispatch from `optimum/pipelines/__init__.py:240-264`:

if accelerator == "ort":
    from optimum.onnxruntime import pipeline as ort_pipeline
    return ort_pipeline(task=task, model=model, ...)
elif accelerator in ["ov", "ipex"]:
    from optimum.intel import pipeline as intel_pipeline
    return intel_pipeline(task=task, model=model, ...)
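The dispatch above reduces to a two-way mapping from accelerator tag to provider subpackage (a sketch that returns the module path rather than importing it):

```python
def pipeline_module(accelerator: str) -> str:
    """Resolve which Optimum subpackage serves a given accelerator tag."""
    if accelerator == "ort":
        return "optimum.onnxruntime"
    if accelerator in ("ov", "ipex"):
        return "optimum.intel"
    raise ValueError(f"Unsupported accelerator: {accelerator!r}")
```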

Common Errors

  • `You need to install either optimum-onnx[onnxruntime]...`
    Cause: no inference backend is installed. Solution: install at least one backend, e.g. `pip install "optimum[onnxruntime]"`.
  • `Cannot export model using PyTorch because no PyTorch package was found`
    Cause: PyTorch is not installed. Solution: `pip install "torch>=2.1.0"` (quotes keep the shell from treating `>` as a redirect).
  • `Only one framework is supported for export: pt`
    Cause: a non-PyTorch framework was specified. Solution: use `framework="pt"`; only PyTorch is supported.

Compatibility Notes

  • Backend priority: When no accelerator is specified, OpenVINO is preferred over ONNX Runtime, which is preferred over IPEX.
  • ONNX Runtime variants: The library checks 16+ ONNX Runtime distributions to support diverse hardware (AMD ROCm, Xilinx Vitis AI, ARM NN, Huawei CANN, Qualcomm QNN, etc.).
  • Framework limitation: Only PyTorch (`"pt"`) is supported as the source framework for model export and inference. TensorFlow support has been removed.
  • Intel ecosystem: Both OpenVINO and IPEX are provided through the same `optimum-intel` package with different extras.
