Implementation: Intel IPEX-LLM NPU Multimodal MiniCPM
| Knowledge Sources | |
|---|---|
| Domains | Multimodal, NPU, Vision_Language |
| Last Updated | 2026-02-09 04:00 GMT |
Overview
Concrete tool for running multimodal vision-language inference on Intel NPU using the MiniCPM-Llama3-V-2.5 model with IPEX-LLM.
Description
This script loads the multimodal MiniCPM-Llama3-V-2.5 model on an Intel NPU using IPEX-LLM's NPU-specific AutoModel. It accepts both a text prompt and an image (from a URL or a local path) and generates a response with the model's built-in chat method. Quantization and context-length parameters are configurable for NPU optimization.
Usage
Use this when running multimodal vision-language models on Intel NPU hardware. The pattern demonstrated here applies to other multimodal models supported by IPEX-LLM's NPU backend.
Code Reference
Source Location
- Repository: Intel IPEX-LLM
- File: python/llm/example/NPU/HF-Transformers-AutoModels/Multimodal/minicpm-llama3-v2.5.py
- Lines: 1-106
Signature
# Script-based execution with argparse
# Key API:
from ipex_llm.transformers.npu_model import AutoModel

model = AutoModel.from_pretrained(
    model_path,
    load_in_low_bit=args.low_bit,
    optimize_model=True,
    trust_remote_code=True,
    max_context_len=args.max_context_len,
    max_prompt_len=args.max_prompt_len,
)
response = model.chat(image=image, msgs=msgs, tokenizer=tokenizer, ...)
Import
from ipex_llm.transformers.npu_model import AutoModel
from transformers import AutoTokenizer
from PIL import Image
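The Description notes that the script accepts an image as either a URL or a local file path. A minimal sketch of that resolution step, using only the standard library plus Pillow; the helper name `load_image` is an assumption, not part of the IPEX-LLM API:

```python
# Sketch: resolve an --image-url-or-path argument to an RGB PIL image.
# `load_image` is an illustrative helper, not an IPEX-LLM function.
from io import BytesIO
from urllib.request import urlopen

from PIL import Image


def load_image(image_url_or_path: str) -> Image.Image:
    """Return an RGB PIL image from an HTTP(S) URL or a local file path."""
    if image_url_or_path.startswith(("http://", "https://")):
        # Fetch remote bytes, then decode in memory.
        with urlopen(image_url_or_path) as resp:
            return Image.open(BytesIO(resp.read())).convert("RGB")
    # Otherwise treat the argument as a local path.
    return Image.open(image_url_or_path).convert("RGB")
```

Converting to RGB up front avoids mode mismatches (e.g. RGBA PNGs) when the image is later passed to `model.chat`.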
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| repo-id-or-model-path | str | Yes | Model path (e.g., openbmb/MiniCPM-Llama3-V-2.5) |
| image-url-or-path | str | No | Image URL or local file path |
| prompt | str | No | Text prompt for image description |
| low-bit | str | No | Quantization type (default: sym_int8) |
Outputs
| Name | Type | Description |
|---|---|---|
| Generated text | Console | Multimodal model response describing the image |
| Timing | Console | Inference latency |
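The Outputs table lists inference latency printed to the console. A hedged sketch of how such timing is typically collected around the generation call; the `timed` wrapper and the stand-in callable are assumptions, not the script's actual code:

```python
# Sketch: measure wall-clock latency of a generation call with perf_counter.
# The `timed` helper is illustrative; the real script may time inline.
import time


def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in for model.chat(image=..., msgs=..., tokenizer=...):
response, elapsed = timed(lambda: "a cat sitting on a mat")
print(f"Inference time: {elapsed:.3f} s")
```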
Usage Examples
Multimodal NPU Inference
python minicpm-llama3-v2.5.py \
--repo-id-or-model-path "openbmb/MiniCPM-Llama3-V-2.5" \
--image-url-or-path "https://example.com/cat.jpg" \
--prompt "What do you see in this image?"