Environment: Intel IPEX-LLM RAG LangChain Environment

Knowledge Sources	IPEX-LLM LangChain
Domains	Infrastructure, RAG
Last Updated	2026-02-09 12:00 GMT

Overview

Intel XPU environment with LangChain, IPEX-LLM LLM/Embedding integrations, and Chroma vector store for Retrieval-Augmented Generation on Intel GPUs.

Description

This environment provides an Intel XPU-accelerated context for RAG (Retrieval-Augmented Generation) pipelines using LangChain. It uses `IpexLLM` as the LangChain LLM wrapper and `IpexLLMBgeEmbeddings` for BGE embedding model acceleration on Intel GPUs. The vector store uses Chroma for in-memory similarity search. The environment requires an Intel GPU with XPU support for both the LLM inference and embedding generation components.

Usage

Use this environment for any RAG-with-LangChain workflow that requires Intel XPU acceleration. It is a prerequisite for running the IPEX-LLM LangChain integrations, including `IpexLLM.from_model_id()`, `IpexLLMBgeEmbeddings`, and the LCEL RAG chain assembly.

System Requirements

Category	Requirement	Notes
OS	Ubuntu 22.04 LTS	Intel oneAPI Base Toolkit required
Hardware	Intel GPU (Arc/Flex/Max)	XPU device for both LLM and embedding model
GPU Driver	Intel GPU drivers	Level Zero runtime required
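
Before loading any model, it can help to ask PyTorch whether it sees an XPU device at all. The following is a minimal sketch, not part of the original example: the helper name `xpu_available` is hypothetical, and it assumes that the XPU build of `torch` exposes `torch.xpu.is_available()` (on some versions only after `intel_extension_for_pytorch` has been imported). It returns `False` instead of raising when the packages are missing.

```python
def xpu_available() -> bool:
    """Return True if PyTorch reports at least one usable Intel XPU device."""
    try:
        import torch
    except ImportError:
        return False  # PyTorch itself is not installed

    try:
        # Assumption: older XPU stacks register the "xpu" backend on import.
        import intel_extension_for_pytorch  # noqa: F401
    except Exception:
        pass  # Not installed or incompatible; use whatever torch exposes

    xpu = getattr(torch, "xpu", None)
    return bool(xpu is not None and xpu.is_available())


if __name__ == "__main__":
    print("XPU available:", xpu_available())
```

A `False` result on a machine with an Intel GPU usually points back to the driver and runtime rows in the table above.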

Dependencies

System Packages

Intel oneAPI Base Toolkit
`intel-opencl-icd`
`intel-level-zero-gpu`

Python Packages

`ipex-llm[xpu]` (pre-release)
`torch` (XPU variant)
`intel_extension_for_pytorch` (XPU variant)
`langchain`
`langchain-text-splitters`
`langchain-community` (provides `IpexLLMBgeEmbeddings`, `IpexLLM`)
`langchain-core`
`langchain-chroma` (provides `Chroma` vector store)
`langchainhub` (for pulling prompt templates)
`chromadb`
`transformers`
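
A quick preflight check can report which of these packages are not importable before a RAG script is run. This is a sketch under stated assumptions: the helper `missing_modules` is hypothetical, and the import names in `REQUIRED_MODULES` are assumed to correspond to the pip packages listed above (for example, `ipex-llm` is assumed to import as `ipex_llm`).

```python
from importlib.util import find_spec

# Assumed import names for the pip packages listed in this section.
REQUIRED_MODULES = [
    "ipex_llm",
    "torch",
    "intel_extension_for_pytorch",
    "langchain",
    "langchain_text_splitters",
    "langchain_community",
    "langchain_core",
    "langchain_chroma",
    "langchainhub",
    "chromadb",
    "transformers",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

The check only locates modules without importing them, so it does not confirm that the installed `torch` is the XPU variant.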

Credentials

No API keys or tokens are required for local RAG with local models. However:

HuggingFace Model Access: If using gated models (e.g., Llama), a `HF_TOKEN` environment variable may be needed.
LangChain Hub: Pulling prompts from LangChain Hub (e.g., `hub.pull("rlm/rag-prompt")`) requires internet access but no API key.
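
When a gated model is in use, a script can fail early with a clear message if the token is absent. The helper below is a hypothetical sketch; it only checks that `HF_TOKEN` is set and non-empty, not that the token is valid or has access to the model.

```python
import os


def hf_token_present(env=None) -> bool:
    """Return True if a non-empty HF_TOKEN exists in the given mapping."""
    env = os.environ if env is None else env
    return bool(env.get("HF_TOKEN", "").strip())


if __name__ == "__main__":
    if not hf_token_present():
        print("HF_TOKEN is not set; gated models may fail to download.")
```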

Quick Install

# Source the Intel oneAPI environment
source /opt/intel/oneapi/setvars.sh

# Install IPEX-LLM with XPU support (quotes keep shells such as zsh from expanding the brackets)
pip install --pre --upgrade "ipex-llm[xpu]" --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

# Install LangChain RAG dependencies
pip install langchain langchain-text-splitters langchain-community langchain-core langchain-chroma langchainhub chromadb transformers

Code Evidence

LangChain IPEX-LLM imports from `rag.py:27-33`:

from langchain import hub
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.embeddings import IpexLLMBgeEmbeddings
from langchain_community.llms import IpexLLM
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_chroma import Chroma

XPU device usage for embeddings from `rag.py:60-63`:

embeddings = IpexLLMBgeEmbeddings(
    model_name=embed_model_path,
    model_kwargs={"device": "xpu"},
    encode_kwargs={"normalize_embeddings": True},
)

XPU device usage for LLM from `rag.py:67-75`:

llm = IpexLLM.from_model_id(
    model_id=model_path,
    model_kwargs={
        "temperature": 0,
        "max_length": 512,
        "trust_remote_code": True,
        "device": "xpu",
    },
)

Common Errors

Error Message	Cause	Solution
`UserWarning: padding_mask`	Benign HuggingFace warning during inference	Suppress with `warnings.filterwarnings("ignore", category=UserWarning, message=".*padding_mask.*")`
`ModuleNotFoundError: langchain_community`	LangChain community package not installed	`pip install langchain-community`
`ModuleNotFoundError: langchain_chroma`	Chroma LangChain integration not installed	`pip install langchain-chroma chromadb`
`XPU device not found`	Intel GPU drivers not installed	Install the Intel oneAPI Base Toolkit and the Intel GPU drivers (Level Zero runtime)
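
The warning filter can be tried in isolation before adding it to a RAG script. In `warnings.filterwarnings`, the `message` argument is a regular expression matched against the start of the warning text, so this sketch uses the pattern `.*padding_mask.*` to match the phrase anywhere in the message. The helper name `count_unsuppressed` and the sample warning texts are made up for illustration.

```python
import warnings


def count_unsuppressed(messages):
    """Emit each message as a UserWarning and count those not filtered out."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record everything by default
        # Inserted ahead of the rule above, so it takes precedence.
        warnings.filterwarnings(
            "ignore", category=UserWarning, message=".*padding_mask.*"
        )
        for text in messages:
            warnings.warn(text, UserWarning)
        return len(caught)


if __name__ == "__main__":
    samples = [
        "Passing `padding_mask` is deprecated",  # hypothetical warning text
        "some unrelated warning",
    ]
    print(count_unsuppressed(samples))
```

Only the unrelated warning should be recorded; other `UserWarning` messages stay visible because the filter is restricted by pattern.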

Compatibility Notes

Intel XPU Only: Both the LLM and the embedding model run on Intel XPU. Set `"device": "xpu"` in the `model_kwargs` dictionary of both `IpexLLM.from_model_id()` and `IpexLLMBgeEmbeddings`.
BGE Embeddings: The `IpexLLMBgeEmbeddings` class is specifically designed for BAAI BGE embedding models. Set `"normalize_embeddings": True` in `encode_kwargs` for cosine similarity in the vector store.
Chroma In-Memory: The default Chroma setup is in-memory. For persistent storage, pass a `persist_directory` when creating the `Chroma` store.
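
The reason normalization matters for similarity search can be shown with plain arithmetic: for unit-length vectors the dot product equals the cosine similarity, and the squared Euclidean distance equals 2 - 2·cos, so the two measures rank neighbours identically. The sketch below uses only the standard library and toy two-dimensional vectors; it illustrates the math and does not reflect how `IpexLLMBgeEmbeddings` or Chroma are implemented.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


if __name__ == "__main__":
    a, b = [3.0, 4.0], [4.0, 3.0]  # toy vectors, not real embeddings
    na, nb = l2_normalize(a), l2_normalize(b)
    print("cosine(a, b):        ", cosine_similarity(a, b))
    print("dot(norm a, norm b): ", dot(na, nb))
    print("2 - squared L2:      ", 2 - squared_l2(na, nb))
```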
