Implementation:Marker Inc Korea AutoRAG Api Runner Run Api Server

Knowledge Sources	AutoRAG
Domains	Deployment, API_Design
Last Updated	2026-02-08 06:00 GMT

Overview

A concrete tool, provided by AutoRAG's deploy modules, for serving an optimized RAG pipeline as a REST API or a web interface.

Description

ApiRunner extends BaseRunner with a Quart HTTP application. Its run_api_server method starts the server, which exposes four endpoints: POST /v1/run (full pipeline), POST /v1/retrieve (retrieval only), POST /v1/stream (SSE streaming), and GET /version. Request and response bodies are defined as Pydantic models: QueryRequest, RunResponse, RetrievalResponse, StreamResponse, and VersionResponse. For a browser-based alternative, GradioRunner.run_web launches a Gradio chat interface.

Usage

Initialize a runner with ApiRunner.from_yaml or ApiRunner.from_trial_folder, then call run_api_server to start the server. GradioRunner is initialized the same way; call run_web to launch the web chat interface.
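
The two initialization paths named above can be wrapped in a small launcher. The following is a minimal sketch, not AutoRAG's own code: choose_constructor and build_runner are hypothetical helpers, and the sketch assumes that both class methods accept the path as their first positional argument and that a pipeline config file can be recognized by a .yaml or .yml suffix.

```python
from pathlib import Path


def choose_constructor(path: str) -> str:
    """Return the name of the ApiRunner class method to use for `path`.

    Assumption: a .yaml/.yml suffix means a pipeline config file;
    anything else is treated as a trial folder.
    """
    suffix = Path(path).suffix.lower()
    return "from_yaml" if suffix in {".yaml", ".yml"} else "from_trial_folder"


def build_runner(path: str):
    """Create an ApiRunner for `path` (hypothetical helper)."""
    # Imported lazily so this module loads even where AutoRAG is not installed.
    from autorag.deploy.api import ApiRunner

    # Assumption: both class methods take the path as first positional argument.
    constructor = getattr(ApiRunner, choose_constructor(path))
    return constructor(path)
```

With such a helper, build_runner("./my_project/0").run_api_server(remote=False) would start a local server without opening an ngrok tunnel.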

Code Reference

Source Location

Repository: AutoRAG
File: autorag/deploy/api.py (ApiRunner), autorag/deploy/gradio.py (GradioRunner)
Lines: api.py L25-248, gradio.py L15-41

Signature

class ApiRunner(BaseRunner):
    def __init__(self, config: Dict, project_dir: Optional[str] = None):
        """Initialize with config and set up Quart routes."""

    def run_api_server(
        self,
        host: str = "0.0.0.0",
        port: int = 8000,
        remote: bool = True,
        **kwargs
    ) -> None:
        """
        Start the REST API server.

        Args:
            host: Server bind address.
            port: Server port (default 8000).
            remote: Expose via ngrok tunnel (default True).
        """

class GradioRunner(BaseRunner):
    def run_web(
        self,
        server_name: str = "0.0.0.0",
        server_port: int = 7680,
        share: bool = False,
        **kwargs
    ) -> None:
        """
        Launch a Gradio chat web interface.

        Args:
            server_name: Server bind address.
            server_port: Server port (default 7680).
            share: Create public Gradio link.
        """

Import

from autorag.deploy.api import ApiRunner
from autorag.deploy.gradio import GradioRunner

I/O Contract

Inputs

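Name	Type	Required	Description
config	Dict	Yes	Pipeline config (one module per node)
project_dir	Optional[str]	No	Project directory (default: None)
host	str	No	API server bind address (default: 0.0.0.0)
port	int	No	API server port (default: 8000)
remote	bool	No	Expose the API via an ngrok tunnel (default: True)
server_name	str	No	Gradio server bind address (default: 0.0.0.0)
server_port	int	No	Gradio server port (default: 7680)
share	bool	No	Create a public Gradio link (default: False)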

Outputs

Name	Type	Description
API endpoints	HTTP	POST /v1/run, POST /v1/retrieve, POST /v1/stream, GET /version
Gradio UI	Web	Interactive chat interface

Usage Examples

Start API Server

from autorag.deploy.api import ApiRunner

# Initialize from trial folder
api_runner = ApiRunner.from_trial_folder("./my_project/0")

# Start REST API server
api_runner.run_api_server(host="0.0.0.0", port=8000, remote=False)

# API endpoints:
# POST /v1/run    {"query": "What is RAG?"}
# POST /v1/retrieve {"query": "What is RAG?"}
# POST /v1/stream {"query": "What is RAG?"}
# GET  /version
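
A client can exercise the endpoints listed above with nothing beyond the standard library. The following is a sketch under stated assumptions: BASE_URL, build_query_request, and post_query are hypothetical names, the server is assumed to be running locally on port 8000, and the response is assumed to be a single JSON document (so post_query suits /v1/run and /v1/retrieve, not the streaming endpoint). The response fields are not specified here, so the sketch returns the decoded body as-is.

```python
import json
import urllib.request

# Assumption: the server was started locally on the default API port.
BASE_URL = "http://localhost:8000"


def build_query_request(
    endpoint: str, query: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build a JSON POST request carrying {"query": ...} for an endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_query(endpoint: str, query: str, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_query_request(endpoint, query, base_url)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, post_query("/v1/retrieve", "What is RAG?") would send the retrieval-only request shown in the comments above.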

Start Gradio Web Interface

from autorag.deploy.gradio import GradioRunner

gradio_runner = GradioRunner.from_trial_folder("./my_project/0")
gradio_runner.run_web(server_port=7680)

Related Pages

Implements Principle

Principle:Marker_Inc_Korea_AutoRAG_Serve_And_Monitor

Requires Environment
