| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| explodinggradients/ragas | LLM Evaluation, Test Data Generation, Persona Modeling | 2026-02-10 |
## Overview

### Description
The generate_personas_from_kg function generates a list of diverse user personas from a knowledge graph. It filters nodes with summary embeddings, clusters them by cosine similarity, selects representative summaries, and uses an LLM to synthesize persona descriptions. The function returns a list of Persona objects, each containing a name and role description. The supporting Persona Pydantic model and PersonaGenerationPrompt prompt class are also part of this implementation.
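The clustering step described above can be sketched as a greedy grouping by cosine similarity. The function name and threshold below are illustrative only, not the actual ragas internals:

```python
# Minimal sketch of cosine-similarity clustering over summary embeddings.
# cluster_embeddings and threshold=0.75 are illustrative assumptions.
import numpy as np

def cluster_embeddings(embeddings, threshold=0.75):
    """Greedily group embedding indices whose cosine similarity to a
    cluster's first (representative) member exceeds the threshold."""
    vectors = [np.asarray(e, dtype=float) for e in embeddings]
    vectors = [v / np.linalg.norm(v) for v in vectors]  # unit-normalize once
    clusters = []
    for i, v in enumerate(vectors):
        for cluster in clusters:
            # cosine similarity of unit vectors is just the dot product
            if float(vectors[cluster[0]] @ v) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

groups = cluster_embeddings([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
print(groups)  # [[0, 1], [2]]
```

One representative summary per cluster is then handed to the LLM, which keeps the resulting personas diverse rather than redundant.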
### Usage
This function is called automatically by TestsetGenerator.generate() when persona_list is None. It can also be called directly for standalone persona generation workflows.
## Code Reference

### Source Location
| Component | File | Lines |
|---|---|---|
| default_filter | src/ragas/testset/persona.py | L16-22 |
| Persona | src/ragas/testset/persona.py | L25-27 |
| PersonaGenerationPrompt | src/ragas/testset/persona.py | L30-48 |
| PersonaList | src/ragas/testset/persona.py | L51-58 |
| generate_personas_from_kg | src/ragas/testset/persona.py | L61-150 |
### Signature
```python
class Persona(BaseModel):
    name: str
    role_description: str


class PersonaGenerationPrompt(PydanticPrompt[StringIO, Persona]):
    instruction: str = (
        "Using the provided summary, generate a single persona who would likely "
        "interact with or benefit from the content. Include a unique name and a "
        "concise role description of who they are."
    )
    input_model: Type[StringIO] = StringIO
    output_model: Type[Persona] = Persona


def generate_personas_from_kg(
    kg: KnowledgeGraph,
    llm: BaseRagasLLM,
    persona_generation_prompt: PersonaGenerationPrompt = PersonaGenerationPrompt(),
    num_personas: int = 3,
    filter_fn: Callable[[Node], bool] = default_filter,
    callbacks: Callbacks = [],
) -> List[Persona]:
    ...
```
### Import

```python
from ragas.testset.persona import generate_personas_from_kg, Persona, PersonaGenerationPrompt
```
## I/O Contract

### Input Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| kg | KnowledgeGraph | (required) | The knowledge graph containing nodes with summaries and embeddings |
| llm | BaseRagasLLM | (required) | The LLM used to generate persona descriptions from summaries |
| persona_generation_prompt | PersonaGenerationPrompt | PersonaGenerationPrompt() | The prompt template for persona generation; includes a few-shot example |
| num_personas | int | 3 | Maximum number of personas to generate |
| filter_fn | Callable[[Node], bool] | default_filter | Selects nodes of type DOCUMENT or CHUNK that have a summary_embedding property |
| callbacks | Callbacks | [] | LangChain-style callbacks for monitoring the generation process |
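The default_filter behavior summarized in the filter_fn row can be sketched as follows. Node and NodeType here are simplified stand-ins for the real ragas classes, shown only to make the filtering logic concrete:

```python
# Stand-in types mirroring the documented filter: keep DOCUMENT or CHUNK
# nodes that carry a summary_embedding property. Not the real ragas classes.
from dataclasses import dataclass, field
from enum import Enum

class NodeType(Enum):
    DOCUMENT = "document"
    CHUNK = "chunk"
    UNKNOWN = "unknown"

@dataclass
class Node:
    type: NodeType
    properties: dict = field(default_factory=dict)

def default_filter(node: Node) -> bool:
    # Nodes without a summary embedding cannot be clustered, so they are skipped.
    return (
        node.type in (NodeType.DOCUMENT, NodeType.CHUNK)
        and node.properties.get("summary_embedding") is not None
    )

nodes = [
    Node(NodeType.DOCUMENT, {"summary_embedding": [0.1, 0.2]}),
    Node(NodeType.CHUNK, {}),  # no embedding -> filtered out
]
print([default_filter(n) for n in nodes])  # [True, False]
```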
### Output
| Return Type | Description |
|---|---|
| List[Persona] | A list of Persona objects, each with name (str) and role_description (str) fields |
### Persona Model
| Field | Type | Description |
|---|---|---|
| name | str | A descriptive name for the persona (e.g., "Data Scientist", "Product Manager") |
| role_description | str | A concise description of the persona's role and how they interact with the content |
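Since Persona is a plain Pydantic model, it can also be constructed and serialized directly, for example to persist generated personas. The sketch below uses a local mirror of the model and assumes Pydantic v2 (model_dump):

```python
# Local mirror of ragas.testset.persona.Persona, matching the fields above.
# Assumes pydantic v2 for model_dump().
from pydantic import BaseModel

class Persona(BaseModel):
    name: str
    role_description: str

p = Persona(
    name="Data Scientist",
    role_description="Analyzes complex datasets and builds predictive models.",
)
print(p.model_dump())
# {'name': 'Data Scientist', 'role_description': 'Analyzes complex datasets and builds predictive models.'}
```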
### Exceptions
| Exception | Condition |
|---|---|
| ValueError | No nodes pass the filter function (no nodes with summary_embedding) |
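In practice this ValueError usually means the knowledge graph was never enriched with summaries. The sketch below mirrors the documented failure mode with a hypothetical helper (select_filtered is not a ragas API) and shows a simple fallback pattern:

```python
# Hypothetical helper mirroring the documented ValueError condition.
def select_filtered(nodes, filter_fn):
    filtered = [n for n in nodes if filter_fn(n)]
    if not filtered:
        raise ValueError(
            "No nodes with a summary_embedding passed the filter; "
            "enrich the knowledge graph with summaries first."
        )
    return filtered

try:
    select_filtered([], lambda n: True)  # empty graph -> nothing passes
except ValueError as exc:
    # Fall back to a hand-written persona instead of failing the pipeline.
    fallback = {
        "name": "General Reader",
        "role_description": "A default persona used when generation fails.",
    }
    print(f"Falling back to {fallback['name']}: {exc}")
```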
## Usage Examples

### Generating Personas From an Enriched Knowledge Graph
```python
from ragas.testset.graph import KnowledgeGraph
from ragas.testset.persona import generate_personas_from_kg, Persona
from ragas.llms import LangchainLLMWrapper
from langchain_openai import ChatOpenAI

# Load an enriched knowledge graph
kg = KnowledgeGraph.load("enriched_graph.json")

# Wrap an LLM
llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"))

# Generate 5 personas
personas = generate_personas_from_kg(
    kg=kg,
    llm=llm,
    num_personas=5,
)

for persona in personas:
    print(f"{persona.name}: {persona.role_description}")
# Example output:
# Data Scientist: Analyzes complex datasets and builds predictive models.
# DevOps Engineer: Manages deployment pipelines and infrastructure automation.
# Product Manager: Defines product strategy and coordinates cross-functional teams.
```
### Using a Custom Filter Function
```python
from ragas.testset.graph import NodeType

# Only use DOCUMENT nodes (exclude chunks)
def document_only_filter(node):
    return (
        node.type == NodeType.DOCUMENT
        and node.properties.get("summary_embedding") is not None
    )

personas = generate_personas_from_kg(
    kg=kg,
    llm=llm,
    num_personas=3,
    filter_fn=document_only_filter,
)
```
### Using a Custom Prompt
```python
from ragas.testset.persona import PersonaGenerationPrompt, Persona
from ragas.prompt import StringIO

custom_prompt = PersonaGenerationPrompt(
    instruction=(
        "Based on the provided summary, generate a persona representing "
        "a technical expert who would deeply engage with this content."
    ),
    examples=[
        (
            StringIO(text="A guide to Kubernetes orchestration and container management."),
            Persona(
                name="Cloud Infrastructure Architect",
                role_description="Designs and manages cloud-native infrastructure at scale.",
            ),
        )
    ],
)

personas = generate_personas_from_kg(
    kg=kg,
    llm=llm,
    persona_generation_prompt=custom_prompt,
    num_personas=3,
)
```
## Related Pages