Environment:Intel Ipex llm RAG LangChain Environment

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, RAG
Last Updated: 2026-02-09 12:00 GMT

Overview

Intel XPU environment with LangChain, IPEX-LLM LLM/Embedding integrations, and Chroma vector store for Retrieval-Augmented Generation on Intel GPUs.

Description

This environment runs LangChain RAG (Retrieval-Augmented Generation) pipelines on Intel XPU devices. `IpexLLM` serves as the LangChain LLM wrapper, and `IpexLLMBgeEmbeddings` accelerates BGE embedding models on Intel GPUs. Chroma is the vector store and performs similarity search in memory. Both LLM inference and embedding generation require an Intel GPU with XPU support.

Usage

Use this environment for any RAG with LangChain workflow that requires Intel XPU acceleration. It is a prerequisite for the IPEX-LLM LangChain integrations, including `IpexLLM.from_model_id()`, `IpexLLMBgeEmbeddings`, and the LCEL RAG chain assembly.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Ubuntu 22.04 LTS | Intel OneAPI Base Toolkit required |
| Hardware | Intel GPU (Arc/Flex/Max) | XPU device for both LLM and embedding model |
| GPU Driver | Intel GPU drivers | Level Zero runtime required |

Dependencies

System Packages

  • Intel OneAPI Base Toolkit
  • `intel-opencl-icd`
  • `intel-level-zero-gpu`

Python Packages

  • `ipex-llm[xpu]` (pre-release)
  • `torch` (XPU variant)
  • `intel_extension_for_pytorch` (XPU variant)
  • `langchain`
  • `langchain-text-splitters`
  • `langchain-community` (provides `IpexLLMBgeEmbeddings`, `IpexLLM`)
  • `langchain-core`
  • `langchain-chroma` (provides `Chroma` vector store)
  • `langchainhub` (for pulling prompt templates)
  • `chromadb`
  • `transformers`
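
As a rough preflight, the package list above can be checked from Python before a pipeline is run. The sketch below is not taken from the IPEX-LLM examples: `REQUIRED_MODULES` and `missing_packages` are hypothetical names, and the import names (for example `ipex_llm` for `ipex-llm[xpu]`) are assumptions based on the imports quoted on this page and the packages' usual module names.

```python
import importlib.util

# pip distribution name -> importable module name.
# The module names are assumptions; adjust them if your install differs.
REQUIRED_MODULES = {
    "ipex-llm[xpu]": "ipex_llm",
    "torch": "torch",
    "intel_extension_for_pytorch": "intel_extension_for_pytorch",
    "langchain": "langchain",
    "langchain-text-splitters": "langchain_text_splitters",
    "langchain-community": "langchain_community",
    "langchain-core": "langchain_core",
    "langchain-chroma": "langchain_chroma",
    "langchainhub": "langchainhub",
    "chromadb": "chromadb",
    "transformers": "transformers",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names whose module cannot be located, without importing them."""
    missing = []
    for pip_name, module_name in required.items():
        try:
            found = importlib.util.find_spec(module_name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for half-initialised modules; treat as missing.
            found = False
        if not found:
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    for name in missing_packages():
        print(f"missing: {name}")
```

Because `find_spec` only locates modules, the check stays fast and does not initialise the GPU stack. It does not verify that `torch` is the XPU variant.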

Credentials

No API keys or tokens are required for local RAG with local models. However:

  • HuggingFace Model Access: If using gated models (e.g., Llama), a `HF_TOKEN` environment variable may be needed.
  • LangChain Hub: Pulling prompts from LangChain Hub (e.g., `hub.pull("rlm/rag-prompt")`) requires internet access but no API key.
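
For gated models, a script can warn early when no token is configured instead of failing partway through a download. This is a minimal sketch: `hf_token_configured` is a hypothetical helper that only inspects the `HF_TOKEN` variable named above and makes no call to HuggingFace.

```python
import os


def hf_token_configured(env=None):
    """Return True if a non-empty HF_TOKEN is present in the given environment mapping."""
    env = os.environ if env is None else env
    return bool(env.get("HF_TOKEN", "").strip())


if __name__ == "__main__":
    if not hf_token_configured():
        print("HF_TOKEN is not set; downloads of gated models may be refused.")
```

Passing an explicit mapping (for example `hf_token_configured({"HF_TOKEN": "..."})`) keeps the helper easy to test without touching the real environment.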

Quick Install

# Source Intel OneAPI environment
source /opt/intel/oneapi/setvars.sh

# Install IPEX-LLM with XPU support
pip install --pre --upgrade "ipex-llm[xpu]" --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

# Install LangChain RAG dependencies
pip install langchain langchain-text-splitters langchain-community langchain-core langchain-chroma langchainhub chromadb transformers

Code Evidence

LangChain IPEX-LLM imports from `rag.py:27-33`:

from langchain import hub
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.embeddings import IpexLLMBgeEmbeddings
from langchain_community.llms import IpexLLM
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_chroma import Chroma

XPU device usage for embeddings from `rag.py:60-63`:

embeddings = IpexLLMBgeEmbeddings(
    model_name=embed_model_path,
    model_kwargs={"device": "xpu"},
    encode_kwargs={"normalize_embeddings": True},
)

XPU device usage for LLM from `rag.py:67-75`:

llm = IpexLLM.from_model_id(
    model_id=model_path,
    model_kwargs={
        "temperature": 0,
        "max_length": 512,
        "trust_remote_code": True,
        "device": "xpu",
    },
)
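
The LCEL RAG chain assembly named under Usage is not quoted on this page. The sketch below shows one common way the imported pieces could be wired together; it is an assumption about the overall shape, not the contents of `rag.py`. `format_docs` and `build_rag_chain` are hypothetical names, and the caller supplies `retriever`, `prompt`, and `llm`.

```python
def format_docs(docs):
    """Join the text of retrieved documents into a single context string."""
    return "\n\n".join(doc.page_content for doc in docs)


def build_rag_chain(retriever, prompt, llm):
    """Compose retriever -> prompt -> LLM -> string output as an LCEL pipeline."""
    # Imported lazily so this module loads even where LangChain is absent.
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.runnables import RunnablePassthrough

    return (
        # The incoming question is sent to the retriever and also passed through.
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )


# Example wiring (needs the full environment; `vectorstore` is a hypothetical
# Chroma instance built from your documents and `embeddings`):
# chain = build_rag_chain(vectorstore.as_retriever(), hub.pull("rlm/rag-prompt"), llm)
# print(chain.invoke("What is IPEX-LLM?"))
```

The dictionary step produces the `context` and `question` keys, so the prompt must use those two input variables.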

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `UserWarning: padding_mask` | Benign HuggingFace warning during inference | Suppress with `warnings.filterwarnings("ignore", category=UserWarning, message=".*padding_mask.*")` |
| `ModuleNotFoundError: langchain_community` | LangChain community package not installed | `pip install langchain-community` |
| `ModuleNotFoundError: langchain_chroma` | Chroma LangChain integration not installed | `pip install langchain-chroma chromadb` |
| `XPU device not found` | Intel GPU drivers not installed | Install the Intel OneAPI Base Toolkit and Intel GPU drivers |
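
When diagnosing an `XPU device not found` error, it can help to probe for the device before loading any model. The following is a best-effort sketch, assuming the installed PyTorch stack exposes `torch.xpu.is_available()`, in some versions only after `intel_extension_for_pytorch` has been imported. `xpu_available` is a hypothetical helper name.

```python
import importlib.util


def xpu_available():
    """Best-effort check for a usable Intel XPU device; returns False on any failure."""
    try:
        if importlib.util.find_spec("torch") is None:
            return False
        import torch

        # Some stacks register the XPU backend only once IPEX is imported.
        if importlib.util.find_spec("intel_extension_for_pytorch") is not None:
            import intel_extension_for_pytorch  # noqa: F401

        xpu = getattr(torch, "xpu", None)
        return bool(xpu is not None and xpu.is_available())
    except Exception:
        # Missing drivers or runtime libraries can surface as import-time errors.
        return False


if __name__ == "__main__":
    print("XPU available" if xpu_available() else "XPU not available; check drivers and OneAPI setup")
```

A `False` result does not say which layer is at fault; the driver, the Level Zero runtime, and the sourced OneAPI environment are each worth checking.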

Compatibility Notes

  • Intel XPU Only: Both the LLM and the embedding model run on Intel XPU. Set `"device": "xpu"` in the `model_kwargs` dictionary of both `IpexLLM.from_model_id()` and `IpexLLMBgeEmbeddings`.
  • BGE Embeddings: The `IpexLLMBgeEmbeddings` class is specifically designed for BAAI BGE embedding models. Pass `encode_kwargs={"normalize_embeddings": True}` so that embeddings have unit length and similarity search in the vector store ranks results by cosine similarity.
  • Chroma In-Memory: The default Chroma setup is in-memory, so the index is lost when the process exits. For persistent storage, pass a `persist_directory` when creating the `Chroma` store.
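
The reason normalization matters for cosine similarity can be shown with plain arithmetic. This illustrative sketch uses only the standard library; the vectors and helper names are made up for the example and are unrelated to real BGE outputs.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


query = [3.0, 4.0]
document = [4.0, 3.0]
q_unit, d_unit = l2_normalize(query), l2_normalize(document)

# For unit vectors, the plain dot product equals the cosine similarity.
assert math.isclose(dot(q_unit, d_unit), cosine_similarity(query, document))

# Squared Euclidean distance between unit vectors is 2 - 2 * cosine, so
# ranking by distance gives the same order as ranking by cosine similarity.
squared_distance = sum((x - y) ** 2 for x, y in zip(q_unit, d_unit))
assert math.isclose(squared_distance, 2 - 2 * cosine_similarity(query, document))
```

With unit-length embeddings, a store that ranks by inner product or by Euclidean distance therefore returns neighbours in the same order as one that ranks by cosine similarity.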
