Environment:Intel Ipex llm RAG LlamaIndex Environment

Knowledge Sources	IPEX-LLM LlamaIndex
Domains	Infrastructure, RAG
Last Updated	2026-02-09 04:00 GMT

Overview

Intel XPU environment with LlamaIndex, IPEX-LLM, PostgreSQL vector store, and sentence-transformers for Retrieval-Augmented Generation on Intel GPUs.

Description

This environment provides an Intel XPU-accelerated context for RAG (Retrieval-Augmented Generation) pipelines using LlamaIndex. It integrates IPEX-LLM as the inference backend for the LLM component, sentence-transformers for embedding generation, and PostgreSQL (via psycopg2) as the vector store backend. The environment enables building end-to-end RAG workflows where documents are chunked, embedded, stored in a PostgreSQL-based vector index, and retrieved to augment LLM generation with relevant context.
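The chunk, embed, store, retrieve, and augment stages described above can be sketched in plain Python. This is a toy illustration only, not the LlamaIndex or IPEX-LLM API: `toy_embed` is a hypothetical bag-of-words stand-in for sentence-transformers, `InMemoryVectorStore` is a hypothetical stand-in for the PostgreSQL vector index, and the final prompt is what would be handed to the LLM.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for the PostgreSQL-backed vector index."""

    def __init__(self):
        self.rows = []  # list of (chunk_text, embedding) pairs

    def add(self, chunk):
        self.rows.append((chunk, toy_embed(chunk)))

    def search(self, query, top_k=2):
        query_vec = toy_embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(query_vec, row[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, contexts):
    """Augment the question with retrieved context before generation."""
    joined = "\n".join(f"- {c}" for c in contexts)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"


docs = [
    "IPEX-LLM runs LLM inference on Intel XPU devices.",
    "PostgreSQL with pgvector stores embedding vectors.",
    "Sentence-transformers produces the embeddings.",
]
store = InMemoryVectorStore()
for doc in docs:
    for chunk in chunk_text(doc):
        store.add(chunk)

question = "Where are embedding vectors stored?"
prompt = build_prompt(question, store.search(question, top_k=1))
print(prompt)
```

In the real environment each stand-in is replaced by the corresponding component, but the order of the stages stays the same.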

Usage

Use this environment for any LlamaIndex RAG workflow that requires Intel XPU acceleration. It is the mandatory prerequisite for running LlamaIndex-based document indexing, vector similarity search, and retrieval-augmented generation with IPEX-LLM on Intel GPUs.

System Requirements

Category	Requirement	Notes
OS	Ubuntu 22.04 LTS	Intel oneAPI Base Toolkit required
Hardware	Intel GPU (Arc/Flex/Max)	XPU device for LLM inference and embedding generation
GPU Driver	Intel GPU drivers	Level Zero runtime required
Database	PostgreSQL	Required for vector store backend; pgvector extension recommended

Dependencies

System Packages

Intel oneAPI Base Toolkit
`intel-opencl-icd`
`intel-level-zero-gpu`
PostgreSQL server (with pgvector extension)

Python Packages

`ipex-llm[xpu]` (pre-release)
`torch` (XPU variant)
`intel_extension_for_pytorch` (XPU variant)
`llama-index`
`llama-index-core`
`llama-index-readers-file`
`psycopg2` (or `psycopg2-binary`) for PostgreSQL connectivity
`sentence-transformers`
`transformers`

Credentials

The following may be required depending on your PostgreSQL and model configuration:

PostgreSQL Connection: Database host, port, username, password, and database name for the vector store backend.
HuggingFace Model Access: If using gated models (e.g., Llama), a `HF_TOKEN` environment variable may be needed.
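One way to keep these credentials out of source code is to read them from environment variables. The sketch below assumes the standard libpq variable names (`PGHOST`, `PGPORT`, `PGUSER`, `PGPASSWORD`, `PGDATABASE`); `PgSettings`, `load_pg_settings`, and `hf_token` are hypothetical helper names, and the defaults are illustrative.

```python
import os
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PgSettings:
    """Connection parameters for the vector store database."""
    host: str
    port: int
    user: str
    password: str
    dbname: str

    def connect_kwargs(self):
        # Keyword arguments in the form accepted by psycopg2.connect(**kwargs).
        return asdict(self)


def load_pg_settings(env=None):
    """Read PostgreSQL settings from the environment, failing early if any are missing."""
    env = os.environ if env is None else env
    missing = [key for key in ("PGUSER", "PGPASSWORD", "PGDATABASE") if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing PostgreSQL settings: {', '.join(missing)}")
    return PgSettings(
        host=env.get("PGHOST", "localhost"),
        port=int(env.get("PGPORT", "5432")),
        user=env["PGUSER"],
        password=env["PGPASSWORD"],
        dbname=env["PGDATABASE"],
    )


def hf_token(env=None):
    """Return the Hugging Face token if set; only needed for gated models."""
    env = os.environ if env is None else env
    return env.get("HF_TOKEN") or None
```

With `psycopg2` installed, the result could be used as `psycopg2.connect(**load_pg_settings().connect_kwargs())`.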

Quick Install

# Source Intel oneAPI environment
source /opt/intel/oneapi/setvars.sh

# Install IPEX-LLM with XPU support
pip install --pre --upgrade "ipex-llm[xpu]" --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

# Install LlamaIndex RAG dependencies
pip install llama-index llama-index-core llama-index-readers-file psycopg2-binary sentence-transformers transformers

# Set runtime environment
export SYCL_CACHE_PERSISTENT=1

Common Errors

Error Message	Cause	Solution
`ModuleNotFoundError: No module named 'llama_index'`	LlamaIndex not installed	`pip install llama-index`
`psycopg2.OperationalError: could not connect to server`	PostgreSQL not running or misconfigured	Start PostgreSQL and verify connection credentials
`RuntimeError: No XPU device found`	Intel GPU drivers not installed	Install Intel GPU drivers and Level Zero runtime
`ModuleNotFoundError: No module named 'sentence_transformers'`	sentence-transformers not installed	`pip install sentence-transformers`
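Two of the errors above (PostgreSQL unreachable, no XPU device) can be detected before starting a RAG job. The sketch below is a hedged preflight check: `postgres_reachable` only tests that a TCP port accepts connections, not that credentials are valid, and `xpu_available` assumes the installed PyTorch or IPEX build exposes `torch.xpu.is_available()`. Both function names are hypothetical.

```python
import socket


def postgres_reachable(host="localhost", port=5432, timeout=2.0):
    """Return True if a TCP connection to the PostgreSQL port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def xpu_available():
    """Return True/False for XPU availability, or None if torch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    try:
        # Older setups register the XPU backend when IPEX is imported (assumption).
        import intel_extension_for_pytorch  # noqa: F401
    except Exception:
        pass
    xpu = getattr(torch, "xpu", None)
    return bool(xpu is not None and xpu.is_available())


if __name__ == "__main__":
    print("PostgreSQL reachable:", postgres_reachable())
    print("XPU available:", xpu_available())
```

A `False` from `postgres_reachable` points to the server not running or a wrong host or port; a `False` from `xpu_available` points to the driver and Level Zero runtime.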

Compatibility Notes

Intel XPU Only: Both the LLM inference and embedding generation run on Intel XPU. The environment is not compatible with CUDA devices.
PostgreSQL Vector Store: The pgvector extension adds a vector column type and similarity search operators to PostgreSQL. Enable it in the target database (`CREATE EXTENSION vector;`) before building the index.
LlamaIndex Version: LlamaIndex v0.10+ uses a modular package structure (`llama-index-core`, `llama-index-readers-file`, etc.).
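To confirm which layout an installed LlamaIndex uses, the version can be compared against the 0.10 threshold noted above. This is a minimal sketch using only the standard library; `uses_modular_layout` and `installed_llama_index_version` are hypothetical helper names.

```python
import re
from importlib import metadata


def uses_modular_layout(version):
    """Return True if a LlamaIndex version string is 0.10 or newer."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= (0, 10)


def installed_llama_index_version():
    """Return the installed LlamaIndex version string, or None if not installed."""
    for dist in ("llama-index-core", "llama-index"):
        try:
            return metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    version = installed_llama_index_version()
    if version is None:
        print("LlamaIndex is not installed")
    else:
        print(version, "modular layout:", uses_modular_layout(version))
```

On 0.10 and later, core classes are imported from `llama_index.core` rather than from the top-level `llama_index` package.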

Related Pages

Implementation:Intel_Ipex_llm_LlamaIndex_RAG
