Implementation:Explodinggradients Ragas Generate Personas From KG

From Leeroopedia


| Knowledge Sources | Domains | Last Updated |
| --- | --- | --- |
| explodinggradients/ragas | LLM Evaluation, Test Data Generation, Persona Modeling | 2026-02-10 |

Overview

Description

The generate_personas_from_kg function generates a list of diverse user personas from a knowledge graph. It keeps the nodes that pass a filter (by default, nodes that have a summary embedding), clusters their summaries by the cosine similarity of those embeddings, selects a representative summary from each cluster, and asks an LLM to synthesize one persona per selected summary. It returns a list of Persona objects, each with a name and a role description. The supporting Persona Pydantic model and the PersonaGenerationPrompt prompt class are part of the same implementation.
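The clustering and selection steps can be sketched in plain Python. This is a minimal illustration, not the library's code: the helper names (cosine_similarity, group_by_similarity, representative_summaries), the 0.75 threshold, and the "longest summary represents the group" rule are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_by_similarity(
    embeddings: Sequence[Sequence[float]], threshold: float = 0.75
) -> List[List[int]]:
    """Greedy grouping: each unvisited item seeds a group and absorbs
    every later unvisited item whose similarity to the seed exceeds the threshold."""
    groups: List[List[int]] = []
    visited = set()
    for i in range(len(embeddings)):
        if i in visited:
            continue
        group = [i]
        visited.add(i)
        for j in range(i + 1, len(embeddings)):
            if j in visited:
                continue
            if cosine_similarity(embeddings[i], embeddings[j]) > threshold:
                group.append(j)
                visited.add(j)
        groups.append(group)
    return groups


def representative_summaries(
    summaries: Sequence[str],
    embeddings: Sequence[Sequence[float]],
    num_personas: int,
    threshold: float = 0.75,  # assumed value, chosen for illustration
) -> List[str]:
    """Pick one summary per group and cap the result at num_personas."""
    groups = group_by_similarity(embeddings, threshold)
    # Assumption: the longest summary in a group stands in for the whole group.
    picks = [max((summaries[i] for i in group), key=len) for group in groups]
    return picks[:num_personas]


summaries = [
    "Kubernetes basics.",
    "A longer guide to Kubernetes orchestration.",
    "Quarterly marketing budget review.",
]
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

print(representative_summaries(summaries, embeddings, num_personas=3))
# ['A longer guide to Kubernetes orchestration.', 'Quarterly marketing budget review.']
```

The two Kubernetes summaries fall into one group because their embeddings are nearly parallel, so only two distinct summaries would be sent to the LLM in this toy case.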

Usage

This function is called automatically by TestsetGenerator.generate() when persona_list is None. It can also be called directly for standalone persona generation workflows.
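The fallback behaviour described here follows a common "use what the caller supplied, otherwise generate" pattern. The sketch below only illustrates that pattern with plain strings; resolve_personas and fake_generate are hypothetical names and do not exist in ragas.

```python
from typing import Callable, List, Optional


def resolve_personas(
    persona_list: Optional[List[str]],
    generate: Callable[[], List[str]],
) -> List[str]:
    """Return the caller's personas, or generate them when none were given."""
    if persona_list is None:
        return generate()
    return persona_list


calls: List[str] = []


def fake_generate() -> List[str]:
    # Stand-in for the real persona generation step.
    calls.append("generated")
    return ["Auto Persona"]


print(resolve_personas(None, fake_generate))                  # ['Auto Persona']
print(resolve_personas(["Supplied Persona"], fake_generate))  # ['Supplied Persona']
```

In this sketch the generator runs only for None, so a caller who supplies personas pays no extra LLM cost.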

Code Reference

Source Location

| Component | File | Lines |
| --- | --- | --- |
| default_filter | src/ragas/testset/persona.py | L16-22 |
| Persona | src/ragas/testset/persona.py | L25-27 |
| PersonaGenerationPrompt | src/ragas/testset/persona.py | L30-48 |
| PersonaList | src/ragas/testset/persona.py | L51-58 |
| generate_personas_from_kg | src/ragas/testset/persona.py | L61-150 |

Signature

class Persona(BaseModel):
    name: str
    role_description: str

class PersonaGenerationPrompt(PydanticPrompt[StringIO, Persona]):
    instruction: str = (
        "Using the provided summary, generate a single persona who would likely "
        "interact with or benefit from the content. Include a unique name and a "
        "concise role description of who they are."
    )
    input_model: Type[StringIO] = StringIO
    output_model: Type[Persona] = Persona

def generate_personas_from_kg(
    kg: KnowledgeGraph,
    llm: BaseRagasLLM,
    persona_generation_prompt: PersonaGenerationPrompt = PersonaGenerationPrompt(),
    num_personas: int = 3,
    filter_fn: Callable[[Node], bool] = default_filter,
    callbacks: Callbacks = [],
) -> List[Persona]:
    ...

Import

from ragas.testset.persona import generate_personas_from_kg, Persona, PersonaGenerationPrompt

I/O Contract

Input Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| kg | KnowledgeGraph | (required) | The knowledge graph containing nodes with summaries and summary embeddings |
| llm | BaseRagasLLM | (required) | The LLM used to generate persona descriptions from summaries |
| persona_generation_prompt | PersonaGenerationPrompt | PersonaGenerationPrompt() | The prompt template for persona generation; includes a few-shot example |
| num_personas | int | 3 | Maximum number of personas to generate |
| filter_fn | Callable[[Node], bool] | default_filter | Predicate that selects which nodes to use; the default keeps nodes of type DOCUMENT or CHUNK that have a summary_embedding property |
| callbacks | Callbacks | [] | LangChain-style callbacks for monitoring the generation process |

Output

| Return Type | Description |
| --- | --- |
| List[Persona] | A list of Persona objects, each with name (str) and role_description (str) fields |

Persona Model

| Field | Type | Description |
| --- | --- | --- |
| name | str | A descriptive name for the persona (e.g., "Data Scientist", "Product Manager") |
| role_description | str | A concise description of the persona's role and how they interact with the content |

Exceptions

| Exception | Condition |
| --- | --- |
| ValueError | No nodes pass the filter function (no nodes with summary_embedding) |
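The error condition can be reproduced and handled with a small stand-alone sketch. It uses SimpleNamespace objects in place of real graph nodes; select_nodes and has_summary_embedding are hypothetical helpers written for this example, and the error message is illustrative rather than the library's wording.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable, List


def select_nodes(nodes: Iterable[Any], filter_fn: Callable[[Any], bool]) -> List[Any]:
    """Keep nodes that pass filter_fn; raise ValueError when none do."""
    selected = [node for node in nodes if filter_fn(node)]
    if not selected:
        raise ValueError("No nodes passed the filter; try a different filter.")
    return selected


def has_summary_embedding(node: Any) -> bool:
    # Same property check used by the custom filter example on this page.
    return node.properties.get("summary_embedding") is not None


# Stand-in nodes: only the first one carries a summary embedding.
nodes = [
    SimpleNamespace(properties={"summary": "Intro", "summary_embedding": [0.1, 0.2]}),
    SimpleNamespace(properties={"summary": "No embedding yet"}),
]

usable = select_nodes(nodes, has_summary_embedding)
print(len(usable))  # 1

try:
    select_nodes(nodes[1:], has_summary_embedding)
except ValueError as exc:
    print(f"Skipping persona generation: {exc}")
```

Wrapping the real generate_personas_from_kg call in the same try/except lets a pipeline fall back to hand-written personas when the graph has not been enriched with summary embeddings.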

Usage Examples

Generating Personas From an Enriched Knowledge Graph

from ragas.testset.graph import KnowledgeGraph
from ragas.testset.persona import generate_personas_from_kg, Persona
from ragas.llms import LangchainLLMWrapper
from langchain_openai import ChatOpenAI

# Load an enriched knowledge graph
kg = KnowledgeGraph.load("enriched_graph.json")

# Wrap an LLM
llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"))

# Generate 5 personas
personas = generate_personas_from_kg(
    kg=kg,
    llm=llm,
    num_personas=5,
)

for persona in personas:
    print(f"{persona.name}: {persona.role_description}")
# Example output (illustrative and truncated; actual personas depend on the LLM and the graph):
# Data Scientist: Analyzes complex datasets and builds predictive models.
# DevOps Engineer: Manages deployment pipelines and infrastructure automation.
# Product Manager: Defines product strategy and coordinates cross-functional teams.

Using a Custom Filter Function

from ragas.testset.graph import Node, NodeType

# Only use DOCUMENT nodes (exclude chunks)
def document_only_filter(node: Node) -> bool:
    return (
        node.type == NodeType.DOCUMENT
        and node.properties.get("summary_embedding") is not None
    )

personas = generate_personas_from_kg(
    kg=kg,
    llm=llm,
    num_personas=3,
    filter_fn=document_only_filter,
)

Using a Custom Prompt

from ragas.testset.persona import PersonaGenerationPrompt, Persona
from ragas.prompt import StringIO

# Create a prompt instance, then override its instruction and examples.
# Assigning attributes on the instance avoids relying on constructor
# keyword arguments, which the prompt class may not accept.
custom_prompt = PersonaGenerationPrompt()
custom_prompt.instruction = (
    "Based on the provided summary, generate a persona representing "
    "a technical expert who would deeply engage with this content."
)
custom_prompt.examples = [
    (
        StringIO(text="A guide to Kubernetes orchestration and container management."),
        Persona(
            name="Cloud Infrastructure Architect",
            role_description="Designs and manages cloud-native infrastructure at scale.",
        ),
    )
]

personas = generate_personas_from_kg(
    kg=kg,
    llm=llm,
    persona_generation_prompt=custom_prompt,
    num_personas=3,
)
