Principle:Marker Inc Korea AutoRAG YAML Configuration Authoring
| Knowledge Sources | |
|---|---|
| Domains | Configuration, Pipeline_Design |
| Last Updated | 2026-02-08 06:00 GMT |
Overview
A configuration-driven design pattern that defines RAG pipeline architectures as declarative YAML specifications.
Description
AutoRAG uses YAML files to declare the entire RAG pipeline configuration. A YAML config file defines node_lines (sequential stages), each containing nodes (processing steps like retrieval, reranking, generation), each containing modules (specific implementations to evaluate). Each node also includes a strategy that defines how to select the best module based on evaluation metrics. This declarative approach separates pipeline design from implementation, enabling automated combinatorial evaluation without writing code.
Usage
Use this principle to define the RAG pipeline configurations to evaluate. Author YAML configs before running the optimization trial. The YAML structure determines which module combinations will be tested and how the best modules will be selected.
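Because the config is authored before the trial runs, it can help to check its shape first. The following is a minimal sketch of such a pre-flight check, not part of AutoRAG's API: `validate_config` is a hypothetical helper, the input is assumed to be the dict a YAML loader (for example PyYAML's `yaml.safe_load`) would return, and the metric name is only an illustrative placeholder.

```python
# Hypothetical pre-flight check for an authored config; not an AutoRAG function.
def validate_config(config):
    """Return a list of problems; an empty list means the hierarchy looks complete."""
    node_lines = config.get("node_lines")
    if not isinstance(node_lines, list) or not node_lines:
        return ["config must contain a non-empty 'node_lines' list"]

    problems = []
    for i, line in enumerate(node_lines):
        where = f"node_lines[{i}]"
        if "node_line_name" not in line:
            problems.append(f"{where}: missing 'node_line_name'")
        nodes = line.get("nodes") or []
        if not nodes:
            problems.append(f"{where}: needs at least one node")
        for j, node in enumerate(nodes):
            node_where = f"{where}.nodes[{j}]"
            # Every node declares what it does, how to pick a winner, and what to try.
            for key in ("node_type", "strategy", "modules"):
                if key not in node:
                    problems.append(f"{node_where}: missing '{key}'")
            for k, module in enumerate(node.get("modules") or []):
                if "module_type" not in module:
                    problems.append(f"{node_where}.modules[{k}]: missing 'module_type'")
    return problems


good = {
    "node_lines": [
        {
            "node_line_name": "retrieve_and_generate",
            "nodes": [
                {
                    "node_type": "retrieval",
                    "strategy": {"metrics": ["retrieval_f1"]},  # placeholder metric name
                    "modules": [{"module_type": "bm25"}, {"module_type": "vectordb"}],
                }
            ],
        }
    ]
}
# Deliberately incomplete: no node_line_name, no strategy, empty module entry.
bad = {"node_lines": [{"nodes": [{"node_type": "retrieval", "modules": [{}]}]}]}

print(validate_config(good))
for problem in validate_config(bad):
    print(problem)
```

A check like this only confirms that the three levels (node_lines, nodes, modules) and the per-node strategy are present; it says nothing about whether a given module_type or metric is actually supported.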
Theoretical Basis
The YAML configuration follows a hierarchical structure, sketched here as the nested mapping that the YAML parses to:
    # Abstract YAML hierarchy
    config = {
        "node_lines": [
            {
                "node_line_name": "retrieve_and_generate",
                "nodes": [
                    {
                        "node_type": "retrieval",        # What kind of processing
                        "strategy": {"metrics": [...]},  # How to select best
                        "modules": [                     # What to evaluate
                            {"module_type": "bm25"},
                            {"module_type": "vectordb"},
                        ]
                    }
                ]
            }
        ]
    }
For each node, the system evaluates every module combination listed under it and keeps the one that scores best on the metrics named in that node's strategy.
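The per-node selection step can be illustrated with a small sketch. Everything here is an assumption for illustration: `select_best_module` is a hypothetical helper, the scores are made-up numbers, the metric names are placeholders, and averaging the strategy's metrics is just one plausible selection rule, not necessarily the one AutoRAG applies.

```python
from statistics import mean


def select_best_module(node, scores):
    """Return the module with the highest mean score over the node's strategy metrics.

    `scores` maps module_type -> {metric_name: value}. Averaging the metrics
    is an assumption of this sketch, not a documented AutoRAG rule.
    """
    metrics = node["strategy"]["metrics"]

    def mean_score(module):
        return mean(scores[module["module_type"]][m] for m in metrics)

    return max(node["modules"], key=mean_score)


node = {
    "node_type": "retrieval",
    "strategy": {"metrics": ["retrieval_f1", "retrieval_recall"]},  # placeholder names
    "modules": [{"module_type": "bm25"}, {"module_type": "vectordb"}],
}

# Made-up evaluation results, one entry per module declared in the node.
scores = {
    "bm25": {"retrieval_f1": 0.5, "retrieval_recall": 0.75},
    "vectordb": {"retrieval_f1": 0.625, "retrieval_recall": 0.875},
}

best = select_best_module(node, scores)
print(best["module_type"])  # vectordb
```

The point of the sketch is the division of labor: the modules list says what to try, the strategy says how to rank the results, and neither requires changing pipeline code.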