Implementation:Intel Ipex llm LlamaIndex RAG

Knowledge Sources	Intel IPEX-LLM LlamaIndex
Domains	RAG, Vector_Store, LlamaIndex
Last Updated	2026-02-09 04:00 GMT

Overview

A concrete example script that builds a Retrieval-Augmented Generation (RAG) pipeline with LlamaIndex, using IPEX-LLM for both the embedding model and the LLM on Intel XPU.

Description

This script implements a complete RAG pipeline using LlamaIndex: PDF document loading via PyMuPDFReader, sentence-level text splitting, BGE embeddings via IpexLLMEmbedding, PostgreSQL-backed vector storage (PGVectorStore), custom retrieval via VectorDBRetriever, and question answering via IpexLLM as the generation backend. It provides end-to-end document ingestion and querying with IPEX-LLM optimizations.
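The retrieval stage described above can be illustrated with a minimal, standard-library-only sketch. It is not the script's code: the `Chunk` class, the `cosine` and `top_k` helpers, the three-dimensional vectors, and the placeholder texts are all hypothetical stand-ins. In the real pipeline, embeddings come from IpexLLMEmbedding and the similarity search is delegated to PGVectorStore. Cosine similarity is used here only as a common choice of metric.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a stored text node and its embedding."""
    text: str
    embedding: list


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_embedding, chunks, k=2):
    """Return the k (score, chunk) pairs most similar to the query, best first."""
    scored = [(cosine(query_embedding, c.embedding), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: made-up 3-dimensional vectors, not real BGE embeddings.
chunks = [
    Chunk("chunk A: model evaluation", [0.9, 0.1, 0.0]),
    Chunk("chunk B: unrelated topic", [0.0, 0.2, 0.9]),
    Chunk("chunk C: model comparison", [0.8, 0.3, 0.1]),
]

results = top_k([1.0, 0.0, 0.0], chunks, k=2)
for score, chunk in results:
    print(f"{score:.3f}  {chunk.text}")
```

The retrieved chunks would then be passed to the LLM as context for answering the question. A database-backed store replaces the linear scan above with an indexed query.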

Usage

Use this when building a RAG application that requires PDF document ingestion with PostgreSQL vector storage and IPEX-LLM acceleration for both embedding generation and text generation on Intel hardware.

Code Reference

Source Location

Repository: Intel IPEX-LLM
File: python/llm/example/GPU/LlamaIndex/rag.py
Lines: 1-252

Signature

class VectorDBRetriever(BaseRetriever):
    def _retrieve(self, query_bundle: QueryBundle) -> list:
        """Retrieve similar documents from PostgreSQL vector store."""

def load_vector_database(username, password) -> PGVectorStore:
    """Create or connect to PostgreSQL vector store."""

def load_data(data_path) -> list:
    """Load PDF and split into sentence chunks."""

def main(args):
    """Main RAG pipeline orchestration."""

Import

from llama_index.embeddings.ipex_llm import IpexLLMEmbedding
from llama_index.llms.ipex_llm import IpexLLM
from llama_index.vector_stores.postgres import PGVectorStore
from llama_index.core.query_engine import RetrieverQueryEngine

I/O Contract

Inputs

Name	Type	Required	Description
model-path	str	Yes	Path to transformer model for generation
tokenizer-path	str	Yes	Path to tokenizer
embedding-model-path	str	No	Embedding model (default: BAAI/bge-small-en)
data	str	No	PDF file path (default: ./data/llama2.pdf)
question	str	No	Question to ask about the ingested document
user	str	Yes	PostgreSQL username
password	str	Yes	PostgreSQL password
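As a rough sketch of how the inputs in this table could map onto a command-line interface, the parser below uses `argparse`. It is an illustration, not the script's actual parser. The short flags (`-m`, `-t`, `-e`, `-d`, `-q`, `-u`, `-p`) are taken from the usage example on this page, and pairing them with the long names in the table is an inference. The help strings and the `build_parser` name are hypothetical. The default for `question` is not stated here, so it is left as `None`.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the Inputs table (not the script's own code)."""
    parser = argparse.ArgumentParser(description="LlamaIndex RAG example (sketch)")
    parser.add_argument("-m", "--model-path", type=str, required=True,
                        help="Path to transformer model for generation")
    parser.add_argument("-t", "--tokenizer-path", type=str, required=True,
                        help="Path to tokenizer")
    parser.add_argument("-e", "--embedding-model-path", type=str,
                        default="BAAI/bge-small-en",
                        help="Embedding model")
    parser.add_argument("-d", "--data", type=str, default="./data/llama2.pdf",
                        help="PDF file path")
    # The real default question is not given on this page; None is a placeholder.
    parser.add_argument("-q", "--question", type=str, default=None,
                        help="Query question")
    parser.add_argument("-u", "--user", type=str, required=True,
                        help="PostgreSQL username")
    parser.add_argument("-p", "--password", type=str, required=True,
                        help="PostgreSQL password")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args([
    "-m", "/path/to/llama2-model",
    "-t", "/path/to/tokenizer",
    "-u", "postgres",
    "-p", "password",
])
print(args.embedding_model_path, args.data)
```

Omitting any of the four required arguments makes `argparse` print a usage message and exit, which matches the Required column above.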

Outputs

Name	Type	Description
RAG response	Console	Generated answer with retrieved context
Vector store	PostgreSQL	Persistent document embeddings

Usage Examples

RAG Query

python rag.py \
    -m "/path/to/llama2-model" \
    -t "/path/to/tokenizer" \
    -e "BAAI/bge-small-en" \
    -d "./data/llama2.pdf" \
    -u "postgres" \
    -p "password" \
    -q "How does Llama 2 perform compared to other models?"

Related Pages

Environment:Intel_Ipex_llm_RAG_LlamaIndex_Environment
