Implementation:Marker Inc Korea AutoRAG Load Yaml Config

From Leeroopedia
Knowledge Sources
Domains Configuration Management, RAG Pipeline Optimization
Last Updated 2026-02-12 00:00 GMT

Overview

A concrete tool, provided by the AutoRAG framework, for loading and preprocessing YAML pipeline configuration files.

Description

The load_yaml_config function reads a YAML configuration file from disk and returns a fully processed Python dictionary. It performs three operations in sequence: safe YAML parsing to prevent code injection, recursive conversion of string-encoded tuples into native Python tuples, and recursive substitution of ${VAR} environment variable patterns with their runtime values. The function validates that the file exists before attempting to read it, raising a ValueError with a descriptive message if the path is invalid or the YAML content is malformed.
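The two post-parsing steps described above can be illustrated with a minimal, standard-library-only sketch. This is not AutoRAG's actual implementation: the helper names convert_tuples and substitute_env, the regular expression, and the choice to leave unset variables untouched are all assumptions made for illustration. The input dictionary stands in for what a safe YAML parse would return.

```python
import ast
import os
import re
from typing import Any

# Assumed pattern for ${VAR} references (hypothetical, not taken from AutoRAG).
ENV_PATTERN = re.compile(r"\$\{(\w+)\}")


def convert_tuples(value: Any) -> Any:
    """Recursively turn strings such as "(0.3, 0.7)" into native tuples."""
    if isinstance(value, dict):
        return {k: convert_tuples(v) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_tuples(v) for v in value]
    if isinstance(value, str):
        text = value.strip()
        if text.startswith("(") and text.endswith(")"):
            try:
                parsed = ast.literal_eval(text)  # evaluates literals only
            except (ValueError, SyntaxError):
                return value  # not a literal tuple, keep the string
            if isinstance(parsed, tuple):
                return parsed
    return value


def substitute_env(value: Any) -> Any:
    """Recursively replace ${VAR} in strings with values from os.environ."""
    if isinstance(value, dict):
        return {k: substitute_env(v) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute_env(v) for v in value]
    if isinstance(value, tuple):
        return tuple(substitute_env(v) for v in value)
    if isinstance(value, str):
        # Assumption: an unset variable leaves the ${VAR} text unchanged.
        return ENV_PATTERN.sub(
            lambda m: os.environ.get(m.group(1), m.group(0)), value
        )
    return value


# Stand-in for the dictionary produced by safe YAML parsing.
parsed = {
    "node_lines": [
        {"modules": [{"weights": "(0.3, 0.7)", "api_key": "${DEMO_API_KEY}"}]}
    ]
}
os.environ["DEMO_API_KEY"] = "demo-key"  # hypothetical variable name
processed = substitute_env(convert_tuples(parsed))
print(processed)
```

Running tuple conversion before substitution mirrors the order given in the description. Using ast.literal_eval rather than eval keeps the conversion limited to Python literals.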

Usage

Import and call load_yaml_config whenever you need to load an AutoRAG pipeline configuration from a YAML file. This is typically done at the beginning of an optimization trial (inside Evaluator.start_trial), during validation (inside Validator.validate), or when programmatically inspecting a configuration file. The returned dictionary contains the full pipeline definition including node_lines, vectordb settings, and any custom parameters.

Code Reference

Source Location

  • Repository: AutoRAG
  • File: autorag/utils/util.py (lines 689-707)

Signature

def load_yaml_config(yaml_path: str) -> Dict:
    """
    Load a YAML configuration file for AutoRAG.
    It contains safe loading, converting string to tuple, and insert environment variables.

    :param yaml_path: The path of the YAML configuration file.
    :return: The loaded configuration dictionary.
    """

Import

from autorag.utils.util import load_yaml_config

I/O Contract

Inputs

Name | Type | Required | Description
yaml_path | str | yes | Absolute or relative path to the YAML configuration file. Must point to an existing file with valid YAML content.

Outputs

Name | Type | Description
config_dict | Dict | Fully processed configuration dictionary with node_lines, vectordb, and other top-level keys. String tuples are converted to native tuples and environment variables are resolved.
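Because an invalid path or malformed YAML surfaces as a ValueError, callers may want to handle that case explicitly. The sketch below shows one way to do so. The wrapper try_load_config and the stand_in_loader function are hypothetical names introduced here so the example runs without AutoRAG installed; in real use, load_yaml_config would be passed as the loader.

```python
import os
from typing import Callable, Dict, Optional


def try_load_config(
    loader: Callable[[str], Dict], yaml_path: str
) -> Optional[Dict]:
    """Call the loader and return None (with a message) on ValueError."""
    try:
        return loader(yaml_path)
    except ValueError as exc:
        print(f"Could not load {yaml_path!r}: {exc}")
        return None


def stand_in_loader(yaml_path: str) -> Dict:
    """Hypothetical stand-in that mimics only the documented error contract."""
    if not os.path.exists(yaml_path):
        raise ValueError(f"YAML file {yaml_path} does not exist.")
    return {"node_lines": []}


# A missing file yields None instead of an unhandled exception.
missing_result = try_load_config(stand_in_loader, "does/not/exist.yaml")
```

With AutoRAG available, the call would presumably read try_load_config(load_yaml_config, "config/my_pipeline.yaml").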

Usage Examples

Basic Usage

from autorag.utils.util import load_yaml_config

# Load a pipeline configuration
config = load_yaml_config("config/my_pipeline.yaml")

# Access node lines
node_lines = config["node_lines"]
for node_line in node_lines:
    print(f"Node line: {node_line['node_line_name']}")
    for node in node_line["nodes"]:
        print(f"  Node type: {node['node_type']}")
        print(f"  Modules: {[m['module_type'] for m in node['modules']]}")

With Environment Variables

import os
from autorag.utils.util import load_yaml_config

# Set environment variables that the YAML references via ${VAR} syntax
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["EMBEDDING_MODEL"] = "text-embedding-3-small"

# Environment variables in the YAML are automatically substituted
config = load_yaml_config("config/pipeline_with_env.yaml")
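This page does not state how load_yaml_config treats a ${VAR} reference whose variable is unset, so a pre-flight check before loading can be useful. The following sketch scans YAML text for ${VAR} references and reports the names missing from the environment. The helper find_missing_env_vars, its regular expression, and the sample YAML layout are assumptions for illustration and are not part of AutoRAG.

```python
import os
import re
from typing import List, Mapping, Optional

# Assumed pattern for ${VAR} references (hypothetical).
ENV_REF = re.compile(r"\$\{(\w+)\}")


def find_missing_env_vars(
    yaml_text: str, environ: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the sorted names referenced as ${NAME} that are not set."""
    env = os.environ if environ is None else environ
    referenced = sorted(set(ENV_REF.findall(yaml_text)))
    return [name for name in referenced if name not in env]


# Illustrative YAML text; the keys shown are not a verified AutoRAG schema.
sample_yaml = """
vectordb:
  - name: default
    embedding_model: ${EMBEDDING_MODEL}
    api_key: ${OPENAI_API_KEY}
"""

# Pass an explicit mapping so the check does not depend on the real environment.
missing = find_missing_env_vars(sample_yaml, environ={"OPENAI_API_KEY": "sk-..."})
print(missing)  # ['EMBEDDING_MODEL']
```

In practice the text would come from reading the configuration file, and the default environ=None would check against os.environ.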

Related Pages

Implements Principle

Requires Environment
