Principle:Marker Inc Korea AutoRAG Web Interface Deployment

Knowledge Sources	AutoRAG Docs
Domains	RAG Pipeline Deployment, User Interface
Last Updated	2026-02-12 00:00 GMT

Overview

Web interface deployment provides a browser-based chat UI for the RAG pipeline, enabling non-technical users to interact with the system through a conversational interface without writing code or sending HTTP requests.

Description

While REST API deployment targets system-to-system integration, web interface deployment targets human users who need direct, interactive access to the pipeline. This is especially valuable during development, stakeholder demos, and user acceptance testing, where the overhead of building a custom frontend is not justified.

The web interface wraps the same pipeline execution logic used by the synchronous Runner into a chat interface. Users type a question, the pipeline processes it through all modules (retrieval, reranking, prompt construction, generation), and the answer is displayed in a conversational format. The chat metaphor is a natural fit for RAG systems, where users ask questions and receive knowledge-grounded answers.

The interface is built on Gradio, a Python library that generates web UIs from Python functions. Gradio's ChatInterface component provides a message input, conversation history display, and basic controls. The runner wraps the pipeline execution in a simple callback function that takes a message string and returns the generated answer. This minimal adapter pattern means the web interface adds almost no additional complexity on top of the core pipeline.

Usage

Use web interface deployment for interactive exploration, demos, internal tools, or any situation where users need to query the pipeline without writing code. The share parameter enables creating a temporary public URL via Gradio's sharing infrastructure, which is useful for sharing demos with remote stakeholders. For production-facing web applications, the REST API deployment is generally preferred as it provides more control over the frontend.

Theoretical Basis

The web interface follows a thin adapter pattern between the Gradio chat framework and the pipeline runner:

User Input (browser) -> Gradio ChatInterface
    -> callback(message, history) -> Runner.run(message)
    -> Generated Answer -> Gradio ChatInterface
User Output (browser) <- Displayed in chat bubble

Key design properties:

Stateless execution: Each message is processed independently through the pipeline. The chat history parameter is accepted by the Gradio callback signature but is not used by the runner, as the RAG pipeline processes each query in isolation.
Synchronous blocking: Unlike the async ApiRunner, the GradioRunner uses synchronous execution. Gradio handles its own threading internally to keep the UI responsive.
Minimal UI controls: The retry and undo buttons are explicitly disabled, as re-running the same query through a RAG pipeline is deterministic (given the same retrieval state) and undoing a read-only operation is not meaningful.
Zero-config sharing: Gradio's built-in share mode tunnels the local server to a public URL, enabling instant demos without any infrastructure.

Related Pages

Implemented By

Implementation:Marker_Inc_Korea_AutoRAG_GradioRunner_Run_Web

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment