
Implementation:Huggingface Optimum Backend Specific Pipeline

From Leeroopedia

Overview

Wrapper documentation. This page documents the dispatch logic in Optimum that routes pipeline creation to backend-specific implementations. The pipeline implementations themselves live in external packages (optimum-onnx and optimum-intel).

Source

File: optimum/pipelines/__init__.py L240-288

Repository: optimum

APIs

ONNX Runtime Dispatch (L240-262)

if accelerator == "ort":
    from optimum.onnxruntime import pipeline as ort_pipeline  # type: ignore

    return ort_pipeline(
        task=task,
        model=model,
        config=config,
        tokenizer=tokenizer,
        feature_extractor=feature_extractor,
        image_processor=image_processor,
        processor=processor,
        framework=framework,
        revision=revision,
        use_fast=use_fast,
        token=token,
        device=device,
        device_map=device_map,
        torch_dtype=torch_dtype,
        trust_remote_code=trust_remote_code,
        model_kwargs=model_kwargs,
        pipeline_class=pipeline_class,
        **kwargs,
    )

Function: ort_pipeline(task, model, ...)

Import: from optimum.onnxruntime import pipeline as ort_pipeline

Notes: The ONNX Runtime dispatch does not forward the accelerator parameter, since the optimum.onnxruntime package only handles one backend.
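The import inside the branch is deliberate: the backend package is only loaded when that accelerator is actually requested. A minimal sketch of this lazy-import pattern (the helper name and the mapping dict are illustrative, not part of Optimum; only the module names and the error message come from the source):

```python
import importlib

# Illustrative mapping (not part of Optimum): each accelerator value
# resolves to the package that provides its pipeline implementation.
_BACKEND_MODULES = {
    "ort": "optimum.onnxruntime",
    "ov": "optimum.intel",
    "ipex": "optimum.intel",
}

def load_backend_pipeline(accelerator):
    """Lazily import the backend's pipeline(), mirroring the dispatch above."""
    module_name = _BACKEND_MODULES.get(accelerator)
    if module_name is None:
        raise ValueError(
            f"Accelerator {accelerator} not recognized. Please use 'ort', 'ov' or 'ipex'."
        )
    try:
        # Deferred import: users without this backend installed can still
        # import optimum.pipelines itself without error.
        return importlib.import_module(module_name).pipeline
    except ImportError as exc:
        raise ImportError(
            f"accelerator={accelerator!r} requires the {module_name} package to be installed"
        ) from exc
```

Because the import only happens at dispatch time, installing just one backend (say, optimum[onnxruntime]) is enough as long as only accelerator="ort" is ever used.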

Intel Dispatch (L263-286)

elif accelerator in ["ov", "ipex"]:
    from optimum.intel import pipeline as intel_pipeline  # type: ignore

    return intel_pipeline(
        task=task,
        model=model,
        config=config,
        tokenizer=tokenizer,
        feature_extractor=feature_extractor,
        image_processor=image_processor,
        processor=processor,
        framework=framework,
        revision=revision,
        use_fast=use_fast,
        token=token,
        device=device,
        device_map=device_map,
        torch_dtype=torch_dtype,
        trust_remote_code=trust_remote_code,
        model_kwargs=model_kwargs,
        pipeline_class=pipeline_class,
        accelerator=accelerator,
        **kwargs,
    )

Function: intel_pipeline(task, model, ..., accelerator)

Import: from optimum.intel import pipeline as intel_pipeline

Notes: The Intel dispatch forwards the accelerator parameter (either "ov" or "ipex") so that the optimum.intel package can select the correct backend internally.
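Forwarding matters here because a single package serves two backends, so the accelerator value must survive the hop. A toy sketch of the receiving side (this is not optimum.intel's real code; the return values are invented stand-ins for the OVModel* and IPEX paths):

```python
# Toy model of the receiving side (not optimum.intel's real internals):
# one pipeline() entry point that branches on the forwarded accelerator.
def intel_pipeline(task, accelerator, **kwargs):
    if accelerator == "ov":
        return {"backend": "openvino", "task": task}  # stand-in for the OVModel* path
    elif accelerator == "ipex":
        return {"backend": "ipex", "task": task}      # stand-in for the IPEX path
    raise ValueError(f"Unsupported accelerator {accelerator!r} for optimum.intel")
```

For example, intel_pipeline("text-classification", accelerator="ov") takes the OpenVINO branch, while the same call with accelerator="ipex" takes the IPEX branch.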

Error Handling (L287-288)

else:
    raise ValueError(f"Accelerator {accelerator} not recognized. Please use 'ort', 'ov' or 'ipex'.")

External References

optimum.onnxruntime (install: pip install optimum[onnxruntime])
    ONNX Runtime pipeline implementation. Provides ORTModel* classes and the pipeline() function for ORT-backed inference.

optimum.intel, OpenVINO (install: pip install optimum-intel[openvino])
    OpenVINO pipeline implementation. Provides OVModel* classes and pipeline support via the accelerator="ov" parameter.

optimum.intel, IPEX (install: pip install optimum-intel[ipex])
    IPEX pipeline implementation. Provides IPEX-optimized model classes and pipeline support via the accelerator="ipex" parameter.

Dispatch Flow Diagram

The dispatch follows this decision tree:

pipeline(accelerator=...)
    |
    +-- accelerator == "ort"
    |       --> from optimum.onnxruntime import pipeline as ort_pipeline
    |       --> return ort_pipeline(task, model, ...)
    |
    +-- accelerator in ["ov", "ipex"]
    |       --> from optimum.intel import pipeline as intel_pipeline
    |       --> return intel_pipeline(task, model, ..., accelerator=accelerator)
    |
    +-- otherwise
            --> raise ValueError("Accelerator not recognized")
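The decision tree can be exercised end to end with backend stubs. In this sketch, ort_pipeline and intel_pipeline are placeholder functions standing in for the real imports, and dispatch mirrors the branching documented above:

```python
def ort_pipeline(task, **kwargs):
    # Stub for `from optimum.onnxruntime import pipeline` -- note that it
    # takes no accelerator argument, matching the real dispatch.
    return ("ort", task)

def intel_pipeline(task, accelerator, **kwargs):
    # Stub for `from optimum.intel import pipeline`, which receives the
    # forwarded accelerator value.
    return (accelerator, task)

def dispatch(task, accelerator="ort", **kwargs):
    # Mirrors the branching in optimum/pipelines/__init__.py.
    if accelerator == "ort":
        return ort_pipeline(task, **kwargs)  # accelerator NOT forwarded
    elif accelerator in ["ov", "ipex"]:
        # accelerator IS forwarded so the Intel package can pick OV vs IPEX
        return intel_pipeline(task, accelerator=accelerator, **kwargs)
    else:
        raise ValueError(
            f"Accelerator {accelerator} not recognized. Please use 'ort', 'ov' or 'ipex'."
        )
```

Running dispatch("text-classification") takes the ORT branch, dispatch("text-classification", accelerator="ipex") takes the Intel branch with the value forwarded, and any other accelerator string raises the ValueError shown in the error-handling section.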
