Implementation:Infiniflow Ragflow Metadata Utils

From Leeroopedia
Revision as of 11:22, 16 February 2026 by Admin (Auto-imported from implementations/Infiniflow_Ragflow_Metadata_Utils.md)
Knowledge Sources
Domains Data_Processing, Search
Last Updated 2026-02-12 06:00 GMT

Overview

A concrete tool for metadata filtering, transformation, and JSON schema generation in document retrieval, provided by the RAGFlow common library.

Description

The metadata_utils module provides functions for filtering document metadata against compound conditions (contains, in, starts with, equality, and comparison operators) and for applying metadata filters in auto, semi-auto, or manual resolution mode. It also includes helpers for deduplicating lists, updating metadata dictionaries, and generating JSON schemas from metadata structures.
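To make the condition-matching idea concrete, here is a minimal standalone sketch, not the RAGFlow implementation. The function name `sketch_meta_filter`, the `OPERATORS` table, and the operator spellings are all assumptions made for illustration; only the field/operator/value condition shape and the "and"/"or" logic come from this page.

```python
from typing import Any, Callable

# Hypothetical operator table; the real module's operator names may differ.
OPERATORS: dict[str, Callable[[Any, Any], bool]] = {
    "=": lambda actual, expected: actual == expected,
    "!=": lambda actual, expected: actual != expected,
    ">": lambda actual, expected: actual > expected,
    "<": lambda actual, expected: actual < expected,
    ">=": lambda actual, expected: actual >= expected,
    "<=": lambda actual, expected: actual <= expected,
    "contains": lambda actual, expected: expected in actual,
    "in": lambda actual, expected: actual in expected,
    "start with": lambda actual, expected: str(actual).startswith(str(expected)),
}


def sketch_meta_filter(meta: dict, conditions: list[dict], logic: str = "and") -> bool:
    """Illustrative only: test one metadata dict against a list of conditions."""

    def check(cond: dict) -> bool:
        field = cond["field"]
        if field not in meta:
            return False  # a missing field never matches
        compare = OPERATORS[cond["operator"]]  # unknown operator raises KeyError
        try:
            return bool(compare(meta[field], cond["value"]))
        except TypeError:
            return False  # incomparable types count as a non-match

    results = (check(cond) for cond in conditions)
    # "and" requires every condition to hold; anything else is treated as "or".
    return all(results) if logic == "and" else any(results)
```

In this sketch an empty condition list passes under "and" and fails under "or", which follows from the behavior of Python's `all` and `any`; whether the real module makes the same choice is not stated here.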

Usage

Import these utilities when implementing metadata-based document filtering in retrieval pipelines, when building search queries that incorporate metadata constraints, or when converting metadata definitions to JSON schema for API validation.

Code Reference

Source Location

Signature

def meta_filter(metas: dict, filters: list, logic: str = "and") -> bool:
    """Filter metadata dict against a list of conditions with AND/OR logic."""

async def apply_meta_data_filter(
    meta_data_filter: dict,
    metas: dict,
    question: str,
    chat_mdl=None,
    base_doc_ids: list = None,
    manual_value_resolver=None,
) -> list:
    """Apply metadata filters with auto/semi_auto/manual resolution modes."""

def metadata_schema(metadata: dict) -> dict:
    """Generate JSON schema from metadata structure."""

def turn2jsonschema(obj) -> dict:
    """Convert metadata object to JSON schema format."""

Import

from common.metadata_utils import meta_filter, apply_meta_data_filter, metadata_schema

I/O Contract

Inputs

Name      Type    Required  Description
metas     dict    Yes       Document metadata dictionary to filter
filters   list    Yes       List of filter conditions with field, operator, value
logic     str     No        Logical combinator: "and" or "or" (default: "and")
question  str     Yes       User query for auto-resolution mode
chat_mdl  object  No        Chat model for auto-resolving filter values

Outputs

Name                              Type  Description
meta_filter() returns             bool  Whether the metadata satisfies the filter conditions under the chosen logic ("and": all, "or": at least one)
apply_meta_data_filter() returns  list  List of matching document IDs
metadata_schema() returns         dict  JSON schema representation
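The contract above pairs a per-document boolean check with a list of matching document IDs. The following sketch shows one way those two shapes can relate, assuming a hypothetical mapping from document ID to metadata and a hypothetical helper `select_doc_ids`; it does not reproduce how apply_meta_data_filter itself gathers IDs.

```python
from typing import Callable


def select_doc_ids(docs: dict[str, dict], predicate: Callable[[dict], bool]) -> list[str]:
    """Illustrative only: keep the IDs of documents whose metadata passes."""
    return [doc_id for doc_id, meta in docs.items() if predicate(meta)]


# Hypothetical corpus keyed by document ID.
docs = {
    "doc-1": {"author": "John", "year": 2019},
    "doc-2": {"author": "John", "year": 2024},
    "doc-3": {"author": "Jane"},
}

# Any boolean check on a metadata dict can serve as the predicate.
recent = select_doc_ids(docs, lambda meta: meta.get("year", 0) > 2020)
```

Passing the check as a callable keeps the ID-collection step independent of how conditions are expressed or resolved.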

Usage Examples

from common.metadata_utils import meta_filter, metadata_schema

# Filter documents by metadata
doc_meta = {"author": "John", "year": 2024, "tags": ["AI", "RAG"]}
filters = [
    {"field": "author", "operator": "=", "value": "John"},
    {"field": "year", "operator": ">", "value": 2020},
]
matches = meta_filter(doc_meta, filters, logic="and")
# Returns True

# Generate JSON schema
schema = metadata_schema({"title": "string", "pages": "integer"})
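This page does not show what the generated schema looks like. As a rough illustration of the idea only, the sketch below maps field names to JSON Schema type names and wraps them in an object schema. The function `sketch_schema` and its output layout are assumptions, and the real metadata_schema() output may be structured differently.

```python
# Primitive type names defined by JSON Schema.
JSON_TYPES = {"string", "integer", "number", "boolean", "array", "object", "null"}


def sketch_schema(fields: dict[str, str]) -> dict:
    """Illustrative only: build an object schema from {field name: type name}."""
    properties = {}
    for name, type_name in fields.items():
        if type_name not in JSON_TYPES:
            raise ValueError(f"unsupported type for {name!r}: {type_name!r}")
        properties[name] = {"type": type_name}
    return {"type": "object", "properties": properties}


example = sketch_schema({"title": "string", "pages": "integer"})
```

A schema of this shape can be handed to any JSON Schema validator to check incoming metadata payloads, which is the API-validation use case mentioned in the Usage section.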
