Implementation:Marker Inc Korea AutoRAG Load Yaml Config
| Knowledge Sources | |
|---|---|
| Domains | Configuration Management, RAG Pipeline Optimization |
| Last Updated | 2026-02-12 00:00 GMT |
Overview
A concrete tool, provided by the AutoRAG framework, for loading and preprocessing YAML pipeline configuration files.
Description
The load_yaml_config function reads a YAML configuration file from disk and returns a fully processed Python dictionary. It performs three operations in sequence: safe YAML parsing to prevent code injection, recursive conversion of string-encoded tuples into native Python tuples, and recursive substitution of ${VAR} environment variable patterns with their runtime values. The function validates that the file exists before attempting to read it, raising a ValueError with a descriptive message if the path is invalid or the YAML content is malformed.
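The tuple-conversion step described above can be illustrated with a minimal sketch. This is not AutoRAG's code: the helper name strings_to_tuples, the sample keys, and the choice to leave unparseable strings untouched are all assumptions made for illustration. It only shows the general idea of walking a parsed dictionary and turning strings such as "(0.3, 0.7)" into native tuples.

```python
import ast
from typing import Any


def strings_to_tuples(value: Any) -> Any:
    """Hypothetical helper: recursively turn "(a, b)"-style strings into tuples."""
    if isinstance(value, dict):
        return {k: strings_to_tuples(v) for k, v in value.items()}
    if isinstance(value, list):
        return [strings_to_tuples(v) for v in value]
    if isinstance(value, str) and value.startswith("(") and value.endswith(")"):
        try:
            # literal_eval only accepts Python literals, so no code is executed
            parsed = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            return value  # not a valid literal; keep the original string
        return parsed if isinstance(parsed, tuple) else value
    return value


# Sample data with made-up keys, shaped like a parsed YAML fragment
raw = {"modules": [{"module_type": "example", "weights": "(0.3, 0.7)"}], "note": "(draft)"}
converted = strings_to_tuples(raw)
print(converted["modules"][0]["weights"])
```

Using ast.literal_eval rather than eval keeps the conversion in line with the safe-parsing goal, since only literal values are accepted.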
Usage
Import and call load_yaml_config whenever you need to load an AutoRAG pipeline configuration from a YAML file. This is typically done at the beginning of an optimization trial (inside Evaluator.start_trial), during validation (inside Validator.validate), or when programmatically inspecting a configuration file. The returned dictionary contains the full pipeline definition including node_lines, vectordb settings, and any custom parameters.
Code Reference
Source Location
- Repository: AutoRAG
- File: autorag/utils/util.py (lines 689-707)
Signature
def load_yaml_config(yaml_path: str) -> Dict:
    """
    Load a YAML configuration file for AutoRAG.
    It contains safe loading, converting string to tuple, and insert environment variables.
    :param yaml_path: The path of the YAML configuration file.
    :return: The loaded configuration dictionary.
    """
Import
from autorag.utils.util import load_yaml_config
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| yaml_path | str | yes | Absolute or relative path to the YAML configuration file. Must point to an existing file with valid YAML content. |
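Because an invalid path or malformed YAML surfaces as a ValueError, callers may want to handle that case explicitly. The sketch below shows one possible pattern. The wrapper try_load_config and the stand-in fake_loader are hypothetical names introduced here so the example runs without AutoRAG installed; in real use, load_yaml_config would be passed as the loader.

```python
from typing import Callable, Dict, Optional


def try_load_config(loader: Callable[[str], Dict], yaml_path: str) -> Optional[Dict]:
    """Hypothetical wrapper: return the config, or None if loading raises ValueError."""
    try:
        return loader(yaml_path)
    except ValueError as exc:
        print(f"Could not load {yaml_path}: {exc}")
        return None


def fake_loader(path: str) -> Dict:
    # Stand-in for load_yaml_config; the error text here is invented for the sketch
    if not path.endswith(".yaml"):
        raise ValueError(f"{path} is not a usable YAML file")
    return {"node_lines": []}


good = try_load_config(fake_loader, "config/my_pipeline.yaml")
bad = try_load_config(fake_loader, "config/missing.txt")
```

With AutoRAG installed, the call would presumably look like try_load_config(load_yaml_config, "config/my_pipeline.yaml").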
Outputs
| Name | Type | Description |
|---|---|---|
| config_dict | Dict | Fully processed configuration dictionary with node_lines, vectordb, and other top-level keys. String tuples are converted to native tuples and environment variables are resolved. |
Usage Examples
Basic Usage
from autorag.utils.util import load_yaml_config
# Load a pipeline configuration
config = load_yaml_config("config/my_pipeline.yaml")
# Access node lines
node_lines = config["node_lines"]
for node_line in node_lines:
    print(f"Node line: {node_line['node_line_name']}")
    for node in node_line["nodes"]:
        print(f" Node type: {node['node_type']}")
        print(f" Modules: {[m['module_type'] for m in node['modules']]}")
With Environment Variables
import os
from autorag.utils.util import load_yaml_config
# Set environment variables that the YAML references via ${VAR} syntax
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["EMBEDDING_MODEL"] = "text-embedding-3-small"
# Environment variables in the YAML are automatically substituted
config = load_yaml_config("config/pipeline_with_env.yaml")
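To make the effect of the substitution concrete, here is a minimal sketch of ${VAR} replacement applied to an already-parsed dictionary. It is an illustration under assumptions, not AutoRAG's implementation: substitute_env, the regular expression, the sample keys, and the decision to leave unset variables unchanged are all choices of this sketch, and the library may treat unset variables differently.

```python
import os
import re
from typing import Any

# Matches ${NAME} where NAME consists of letters, digits, and underscores
ENV_PATTERN = re.compile(r"\$\{(\w+)\}")


def substitute_env(value: Any) -> Any:
    """Hypothetical helper: recursively replace ${VAR} in strings with os.environ values."""
    if isinstance(value, dict):
        return {k: substitute_env(v) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute_env(v) for v in value]
    if isinstance(value, str):
        # Unset variables keep their ${VAR} text; this is an assumption of the sketch
        return ENV_PATTERN.sub(lambda m: os.environ.get(m.group(1), m.group(0)), value)
    return value


os.environ["EMBEDDING_MODEL"] = "text-embedding-3-small"
# Sample data with made-up keys, shaped like a parsed YAML fragment
parsed = {"vectordb": [{"name": "default", "embedding_model": "${EMBEDDING_MODEL}"}]}
resolved = substitute_env(parsed)
print(resolved["vectordb"][0]["embedding_model"])
```

Keeping secrets such as API keys in environment variables means the YAML file itself can be shared or committed without exposing them.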