Implementation:Run llama Llama index LlamaDataset Base
| Knowledge Sources | |
|---|---|
| Domains | Datasets, Evaluation, Benchmarking |
| Last Updated | 2026-02-11 19:00 GMT |
Overview
This module defines the base classes for the LlamaIndex dataset system, providing abstract foundations for data examples, predictions, and datasets that support batch prediction generation with both synchronous and asynchronous execution.
Description
The llama_dataset/base.py module establishes the core dataset abstraction layer for LlamaIndex evaluation and benchmarking workflows. It defines several interconnected base classes:
CreatedByType is an Enum with values HUMAN and AI, used to track whether data examples were generated by a human or an AI model. CreatedBy is a Pydantic model pairing a CreatedByType with an optional model_name field for AI-generated content attribution.
BaseLlamaExamplePrediction is the abstract base model for individual predictions. It requires subclasses to implement a class_name property. BaseLlamaDataExample is the abstract base model for individual data examples in a dataset, also requiring a class_name property.
BaseLlamaPredictionDataset is an abstract base model for collections of predictions. It holds a list of BaseLlamaExamplePrediction objects, supports indexing and slicing via __getitem__, and provides save_json and from_json methods for serialization. Subclasses must define _prediction_type as a class variable and implement to_pandas for DataFrame conversion.
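The indexing behaviour described above can be illustrated without the library. The sketch below is a stdlib-only stand-in, not the LlamaIndex code: ToyPrediction and ToyPredictionDataset are hypothetical names, and the sketch assumes that __getitem__ simply delegates to the underlying list, so an integer index yields one prediction and a slice yields a plain list.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ToyPrediction:
    """Hypothetical stand-in for a BaseLlamaExamplePrediction subclass."""

    response: str

    @property
    def class_name(self) -> str:
        return "ToyPrediction"


@dataclass
class ToyPredictionDataset:
    """Hypothetical stand-in for a prediction dataset."""

    predictions: List[ToyPrediction] = field(default_factory=list)

    def __getitem__(
        self, val: Union[int, slice]
    ) -> Union[ToyPrediction, List[ToyPrediction]]:
        # Assumption: indexing is delegated to the wrapped list.
        return self.predictions[val]


toy = ToyPredictionDataset(
    [ToyPrediction("a"), ToyPrediction("b"), ToyPrediction("c")]
)
first = toy[0]        # a single ToyPrediction
first_two = toy[0:2]  # a plain list of ToyPrediction objects
```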
BaseLlamaDataset is the primary abstract base class, generic over a predictor type P (which can be a BaseQueryEngine, BaseEvaluator, or LLM). It stores a list of BaseLlamaDataExample objects and maintains a private _predictions_cache for resilient batch processing. Key methods include:
- make_predictions_with - Synchronous batch prediction with progress tracking and configurable batch sizes
- amake_predictions_with - Asynchronous batch prediction using asyncio.gather with rate limit error handling and automatic caching
- _predict_example and _apredict_example - Abstract methods that subclasses must implement for individual example prediction
- _construct_prediction_dataset - Abstract factory method for building the appropriate prediction dataset type
- _batch_examples - Generator that yields batches of examples for processing
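The batching step can be pictured with a small generator. This is a sketch rather than the library's code: batch_examples is a hypothetical free function, and the start_position parameter is an assumption about how a resumed run would skip examples already covered by the cache.

```python
from typing import Generator, List, Sequence, TypeVar

T = TypeVar("T")


def batch_examples(
    examples: Sequence[T], batch_size: int = 20, start_position: int = 0
) -> Generator[List[T], None, None]:
    """Yield consecutive batches of `examples`, starting at `start_position`."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for ix in range(start_position, len(examples), batch_size):
        # The final batch may be shorter than batch_size.
        yield list(examples[ix : ix + batch_size])


batches = list(batch_examples(list(range(7)), batch_size=3))
resumed = list(batch_examples(list(range(7)), batch_size=3, start_position=3))
```

Because the generator is lazy, a caller can stop between batches (for example, to sleep or to handle an error) without having materialised the remaining batches.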
The caching mechanism in both sync and async prediction methods allows resumption after rate limit errors: if a RateLimitError is encountered during async processing, the cache preserves completed predictions so that re-executing the method will skip already-predicted examples.
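The resume-after-failure behaviour can be sketched with the standard library alone. Everything here is a stand-in: ToyDataset, FakeRateLimitError, and flaky_predictor are hypothetical names, the predictor is a plain coroutine rather than a query engine, and clearing the cache after a fully successful run is an assumption of this sketch.

```python
import asyncio
from typing import Awaitable, Callable, List

Predictor = Callable[[str], Awaitable[str]]


class FakeRateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate limit error."""


class ToyDataset:
    """Stdlib-only sketch of async batch prediction with a resumable cache."""

    def __init__(self, examples: List[str]) -> None:
        self.examples = examples
        self._predictions_cache: List[str] = []

    async def amake_predictions_with(
        self, predictor: Predictor, batch_size: int = 2
    ) -> List[str]:
        # Skip examples whose predictions are already cached.
        start_ix = len(self._predictions_cache)
        for ix in range(start_ix, len(self.examples), batch_size):
            batch = self.examples[ix : ix + batch_size]
            # If any call raises, nothing from this batch reaches the cache.
            results = await asyncio.gather(*(predictor(ex) for ex in batch))
            self._predictions_cache += results
        # Assumption: the cache is cleared once every batch has succeeded.
        predictions, self._predictions_cache = self._predictions_cache, []
        return predictions


calls: List[str] = []
fail_once = {"armed": True}


async def flaky_predictor(example: str) -> str:
    """Fail once on example 'c' to simulate hitting a rate limit."""
    if example == "c" and fail_once["armed"]:
        fail_once["armed"] = False
        raise FakeRateLimitError("simulated rate limit")
    calls.append(example)
    return example.upper()


async def main() -> List[str]:
    dataset = ToyDataset(["a", "b", "c", "d"])
    try:
        await dataset.amake_predictions_with(flaky_predictor)
    except FakeRateLimitError:
        pass  # the first batch ("a", "b") is already cached
    # The second call resumes at the failed batch instead of starting over.
    return await dataset.amake_predictions_with(flaky_predictor)


predictions = asyncio.run(main())
```

In this sketch the cache grows one whole batch at a time, so every example in a failed batch is retried on the next call, while examples from earlier batches are not predicted again.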
Usage
Use these base classes when creating custom dataset types for evaluation or benchmarking in LlamaIndex. Subclass BaseLlamaDataset and BaseLlamaDataExample to define domain-specific datasets (e.g., RAG evaluation datasets, evaluator evaluation datasets). Use make_predictions_with or amake_predictions_with to generate predictions in bulk against a query engine, evaluator, or LLM.
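The subclassing contract can be sketched as a template-method pattern. The code below uses only the standard library and hypothetical names (ToyBaseDataset, ToyQADataset); it mirrors the shape of the two hooks named in the Description but simplifies their signatures, and it is not a substitute for subclassing the real Pydantic-based classes.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Generic, List, TypeVar

P = TypeVar("P")


class ToyBaseDataset(ABC, Generic[P]):
    """Hypothetical stand-in for the base dataset's subclass contract."""

    def __init__(self, examples: List[str]) -> None:
        self.examples = examples

    @abstractmethod
    def _predict_example(self, predictor: P, example: str) -> str:
        """Produce one prediction for one example."""

    @abstractmethod
    def _construct_prediction_dataset(
        self, predictions: List[str]
    ) -> Dict[str, List[str]]:
        """Wrap raw predictions in the dataset-specific container."""

    def make_predictions_with(self, predictor: P) -> Dict[str, List[str]]:
        # The base class owns the loop; subclasses supply the two hooks.
        predictions = [
            self._predict_example(predictor, ex) for ex in self.examples
        ]
        return self._construct_prediction_dataset(predictions)


class ToyQADataset(ToyBaseDataset[Callable[[str], str]]):
    """Hypothetical concrete dataset whose predictor is a plain function."""

    def _predict_example(
        self, predictor: Callable[[str], str], example: str
    ) -> str:
        return predictor(example)

    def _construct_prediction_dataset(
        self, predictions: List[str]
    ) -> Dict[str, List[str]]:
        return {"predictions": predictions}


result = ToyQADataset(["what?", "why?"]).make_predictions_with(
    lambda question: question.upper()
)
```

Keeping the loop in the base class means batching, progress reporting, and caching only need to be written once, while each dataset type decides how a single example is predicted and how results are packaged.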
Code Reference
Source Location
- Repository: Run_llama_Llama_index
- File: llama-index-core/llama_index/core/llama_dataset/base.py
- Lines: 1-356
Signature
PredictorType = Union[BaseQueryEngine, BaseEvaluator, LLM]
P = TypeVar("P", bound=PredictorType)

class CreatedByType(str, Enum):
    HUMAN = "human"
    AI = "ai"

class CreatedBy(BaseModel):
    model_name: Optional[str] = Field(default_factory=str)
    type: CreatedByType

class BaseLlamaExamplePrediction(BaseModel):
    @property
    @abstractmethod
    def class_name(self) -> str: ...

class BaseLlamaDataExample(BaseModel):
    @property
    @abstractmethod
    def class_name(self) -> str: ...

class BaseLlamaPredictionDataset(BaseModel):
    _prediction_type: ClassVar[Type[BaseLlamaExamplePrediction]]
    predictions: List[BaseLlamaExamplePrediction] = Field(default_factory=list)

class BaseLlamaDataset(BaseModel, Generic[P]):
    _example_type: ClassVar[Type[BaseLlamaDataExample]]
    examples: List[BaseLlamaDataExample] = Field(default=[])
Import
from llama_index.core.llama_dataset.base import (
    BaseLlamaDataset,
    BaseLlamaDataExample,
    BaseLlamaExamplePrediction,
    BaseLlamaPredictionDataset,
    CreatedBy,
    CreatedByType,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| examples | List[BaseLlamaDataExample] | No | List of data examples in the dataset (default empty) |
| predictions | List[BaseLlamaExamplePrediction] | No | List of predictions (default empty) |
| predictor | P (BaseQueryEngine, BaseEvaluator, or LLM) | Yes (for make_predictions_with and amake_predictions_with) | The predictor used to generate predictions |
| show_progress | bool | No | Whether to display a tqdm progress bar (default False) |
| batch_size | int | No | Number of examples per batch, for both sync and async prediction (default 20) |
| sleep_time_in_seconds | int | No | Delay between batches to avoid rate limits (default 0 sync, 1 async) |
| path | str | Yes (for save_json/from_json) | File path for JSON serialization and deserialization |
Outputs
| Name | Type | Description |
|---|---|---|
| return (make_predictions_with) | BaseLlamaPredictionDataset | A dataset containing all generated predictions |
| return (amake_predictions_with) | BaseLlamaPredictionDataset | A dataset containing all async-generated predictions |
| return (to_pandas) | pandas.DataFrame | A DataFrame representation of examples or predictions |
| return (from_json) | BaseLlamaDataset or BaseLlamaPredictionDataset | A deserialized dataset instance from JSON |
Usage Examples
Basic Usage
from llama_index.core.llama_dataset.base import (
    BaseLlamaDataset,
    BaseLlamaDataExample,
    BaseLlamaExamplePrediction,
    BaseLlamaPredictionDataset,
    CreatedBy,
    CreatedByType,
)

# Track the origin of data
created_by_human = CreatedBy(type=CreatedByType.HUMAN)
created_by_ai = CreatedBy(type=CreatedByType.AI, model_name="gpt-4")

print(created_by_human)  # prints: human
print(created_by_ai)     # prints: ai (gpt-4)
Working with a Concrete Dataset
from llama_index.core import VectorStoreIndex
from llama_index.core.llama_dataset import LabelledRagDataset

# Load a dataset from JSON
dataset = LabelledRagDataset.from_json("my_rag_dataset.json")

# Access examples via indexing
first_example = dataset[0]
subset = dataset[0:5]

# Build a query engine to act as the predictor.
# `documents` is assumed to be a list of Document objects loaded elsewhere.
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Synchronous batch prediction
prediction_dataset = dataset.make_predictions_with(
    query_engine,
    show_progress=True,
    batch_size=10,
)

# Convert to pandas DataFrame
df = prediction_dataset.to_pandas()

# Save predictions
prediction_dataset.save_json("predictions.json")