Principle:Marker Inc Korea AutoRAG Deployment Mode Selection

Knowledge Sources	AutoRAG Docs
Domains	Deployment, API_Design
Last Updated	2026-02-08 06:00 GMT

Overview

A deployment pattern that provides multiple execution modes for running an optimized RAG pipeline: code, API server, or web interface.

Description

After initialization, AutoRAG pipelines can be deployed in three modes: Code Runner (programmatic access via Runner.run), API Server (REST endpoints via ApiRunner), and Web Interface (interactive chat via GradioRunner or Streamlit). Code mode is for integration into Python applications. API mode exposes /v1/run, /v1/retrieve, and /v1/stream endpoints. Web mode provides a chat-like interface for end users. Each mode uses the same underlying module chain but with different input/output interfaces.
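The idea that all three modes wrap one module chain can be illustrated with a minimal, library-free sketch. Everything below is an assumption made for illustration: `run_pipeline`, `code_mode`, `api_mode`, and `web_mode` are hypothetical names, the JSON request and response shapes are invented, and none of this is AutoRAG's actual implementation. Only the endpoint paths `/v1/run` and `/v1/retrieve` come from the description above; streaming is left out for brevity.

```python
import json
from typing import Dict, List, Tuple


def run_pipeline(query: str) -> Dict[str, object]:
    """Hypothetical stand-in for the shared module chain."""
    passages = [f"passage about {query}"]
    return {"answer": f"Answer to: {query}", "retrieved": passages}


def code_mode(query: str) -> str:
    """Code mode: call the pipeline directly from Python."""
    return str(run_pipeline(query)["answer"])


def api_mode(path: str, body: str) -> str:
    """API mode: map a request path and JSON body to a JSON response.

    The body and response shapes here are assumptions, not AutoRAG's schema.
    """
    query = json.loads(body)["query"]
    result = run_pipeline(query)
    if path == "/v1/run":
        payload = {"result": result["answer"]}
    elif path == "/v1/retrieve":
        payload = {"passages": result["retrieved"]}
    else:
        payload = {"error": f"unknown path {path}"}
    return json.dumps(payload)


def web_mode(history: List[Tuple[str, str]], message: str) -> List[Tuple[str, str]]:
    """Web mode: append a (user, assistant) turn to a chat history."""
    return history + [(message, code_mode(message))]
```

Each wrapper only translates its own input and output format; the call to `run_pipeline` is identical in all three, which is the property the description relies on.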

Usage

Choose the deployment mode based on the use case: code mode for batch processing or embedding in applications, API mode for microservice architectures, or web mode for demos and user-facing applications.

Theoretical Basis

All deployment modes share the same execution pattern:

1. Create a pseudo QA DataFrame from the user query.
2. Sequentially run each module instance on the previous result.
3. Merge module outputs into the growing result DataFrame.
4. Extract the final output from the specified result column.

The difference lies only in the input/output interface, not the pipeline execution.
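The execution pattern above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the library's code: a plain dict stands in for the one-row DataFrame, the `retrieve` and `generate` modules are hypothetical, and the column names `retrieved_contents` and `generated_texts` are chosen for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]          # stand-in for a one-row DataFrame
Module = Callable[[Row], Row]    # a module maps the current row to new columns


def retrieve(row: Row) -> Row:
    # Hypothetical retrieval module: adds a column of retrieved passages.
    return {"retrieved_contents": [f"doc for {row['query']}"]}


def generate(row: Row) -> Row:
    # Hypothetical generator module: reads earlier columns, adds the answer.
    context = " ".join(row["retrieved_contents"])
    return {"generated_texts": f"{row['query']} -> {context}"}


def run(query: str, modules: List[Module],
        result_column: str = "generated_texts") -> object:
    # Step 1: build a one-row pseudo QA record from the user query.
    result: Row = {"qid": "pseudo", "query": query}
    for module in modules:
        # Step 2: run the module on everything produced so far.
        output = module(result)
        # Step 3: merge the new columns into the growing result.
        result = {**result, **output}
    # Step 4: return only the requested column.
    return result[result_column]
```

Calling `run("what is RAG?", [retrieve, generate])` returns the generated text, while passing `result_column="retrieved_contents"` with only the `retrieve` module returns the passages instead. Selecting a different result column is how one chain can serve both an answer-style and a retrieval-style interface.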

Related Pages

Implemented By

Implementation:Marker_Inc_Korea_AutoRAG_Runner_Run
