Workflow:Infiniflow Ragflow Docker Deployment

Knowledge Sources	RAGFlow, RAGFlow Docs, Docker README
Domains	DevOps, Docker, Deployment
Last Updated	2026-02-12 06:00 GMT

Overview

End-to-end process for deploying RAGFlow using Docker Compose, including infrastructure services setup, environment configuration, server startup, and verification of all system components.

Description

This workflow deploys RAGFlow with Docker Compose. RAGFlow is a multi-service application that depends on several infrastructure components: MySQL for metadata storage, Elasticsearch (or Infinity/OpenSearch) for document search and vector storage, Redis for caching and task queuing, and MinIO for object storage. Docker Compose orchestrates all services, including their networking, volume management, and health checks. The steps run from prerequisite verification through environment configuration, service startup, and LLM provider configuration to deployment verification. Optional configurations include GPU acceleration, document engine switching, and Kubernetes deployment via Helm charts.

Usage

Execute this workflow when setting up a new RAGFlow instance for development, testing, or production use. This is the first workflow any new RAGFlow user or operator must complete before using any other RAGFlow features. It is also needed when upgrading to a new version or changing deployment configuration.

Execution Steps

Step 1: Verify Prerequisites

Ensure the host system meets the minimum requirements for running RAGFlow. This includes hardware requirements (CPU, RAM, disk), software dependencies (Docker, Docker Compose), and kernel parameter configuration for Elasticsearch.

Requirements:

CPU >= 4 cores
RAM >= 16 GB
Disk >= 50 GB
Docker >= 24.0.0 and Docker Compose >= v2.26.1
vm.max_map_count >= 262144 (for Elasticsearch)
Optional: gVisor for code execution sandbox

Key considerations:

The vm.max_map_count kernel parameter must be set persistently: a value applied with sysctl -w is lost on reboot, so also add vm.max_map_count=262144 to /etc/sysctl.conf
Docker images are built for x86 platforms only; ARM64 requires building from source
gVisor (runsc) is only needed if you plan to use the code executor agent feature
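
The requirements above can be checked with a short script. The following is a minimal sketch, not part of RAGFlow: the function names are illustrative, it assumes a Linux host where vm.max_map_count is readable from /proc/sys/vm/max_map_count, it interprets the disk requirement as free space on the chosen path, and it omits the RAM check for brevity.

```python
import os
import re
import shutil
import subprocess

# Thresholds taken from the requirements list above.
MIN_CPU_CORES = 4
MIN_DISK_GB = 50
MIN_DOCKER = (24, 0, 0)
MIN_COMPOSE = (2, 26, 1)
MIN_MAX_MAP_COUNT = 262144


def parse_version(text):
    """Extract the first X.Y.Z version found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def command_version(args):
    """Run a version command; return its parsed version, or None if unavailable."""
    try:
        result = subprocess.run(args, capture_output=True, text=True, check=True)
        return parse_version(result.stdout)
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Return the current vm.max_map_count, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def check_prerequisites(disk_path="."):
    """Map each check to True (ok), False (too low), or None (could not determine)."""
    docker = command_version(["docker", "--version"])
    compose = command_version(["docker", "compose", "version"])
    max_map = read_max_map_count()
    # Assumption: the 50 GB requirement is treated as free space on disk_path.
    free_gb = shutil.disk_usage(disk_path).free / 1024**3
    return {
        "cpu": (os.cpu_count() or 0) >= MIN_CPU_CORES,
        "disk": free_gb >= MIN_DISK_GB,
        "docker": None if docker is None else docker >= MIN_DOCKER,
        "compose": None if compose is None else compose >= MIN_COMPOSE,
        "vm.max_map_count": None if max_map is None else max_map >= MIN_MAX_MAP_COUNT,
    }
```

Calling check_prerequisites() returns one entry per requirement; a None value means the check could not be performed (for example, Docker is not installed), which is worth treating as a failure before continuing.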

Step 2: Configure Environment

Clone the repository and configure environment variables. The primary configuration file is docker/.env which controls ports, passwords, image versions, and feature flags. The service_conf.yaml.template defines backend service connections and is auto-populated with environment variables at container startup.

Key configuration files:

docker/.env - System-level settings (HTTP port, MySQL/MinIO passwords, image tag, DOC_ENGINE, DEVICE)
docker/service_conf.yaml.template - Backend service configuration template
docker/docker-compose.yml - Main Docker Compose service definitions
docker/docker-compose-base.yml - Infrastructure-only services for development

Key considerations:

Set RAGFLOW_IMAGE to the desired version tag
Change MYSQL_PASSWORD and MINIO_PASSWORD from their default values before exposing the deployment
Set DOC_ENGINE to elasticsearch (default) or infinity to use an alternative document store
Set DEVICE to gpu to enable GPU-accelerated document processing
Configure user_default_llm in service_conf.yaml.template with your preferred LLM provider
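
Editing docker/.env by hand works, but scripted deployments may prefer to set the variables programmatically. The sketch below is one way to do that under simple assumptions: the file consists of plain KEY=value lines and # comments, and the helper name set_env_values is illustrative rather than anything shipped with RAGFlow.

```python
def set_env_values(text, updates):
    """Return .env text with the given keys set, leaving all other lines untouched."""
    seen = set()
    lines = []
    for line in text.splitlines():
        stripped = line.strip()
        is_assignment = "=" in stripped and not stripped.startswith("#")
        key = stripped.split("=", 1)[0].strip() if is_assignment else None
        if key in updates:
            # Overwrite every active assignment of this key.
            lines.append(f"{key}={updates[key]}")
            seen.add(key)
        else:
            lines.append(line)
    # Keys that were absent (or only present in comments) are appended at the end.
    lines.extend(f"{key}={value}" for key, value in updates.items() if key not in seen)
    return "\n".join(lines) + "\n"
```

A typical use would read docker/.env, call set_env_values with a dictionary such as {"DOC_ENGINE": "infinity", "DEVICE": "gpu"}, and write the result back. Commented-out lines are deliberately left alone so the file's documentation of alternative values survives.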

Step 3: Start Infrastructure Services

Launch the infrastructure services using Docker Compose. This starts MySQL, Elasticsearch (or the configured document engine), Redis, and MinIO. Each service has health checks configured to ensure proper startup before dependent services begin.

What happens:

MySQL initializes the database schema on first run
Elasticsearch creates indexes with configured mappings
Redis starts the caching and message queue service
MinIO initializes the object storage buckets
Health checks verify each service is responding before proceeding
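
Compose already gates dependent services on these health checks, but an external script (for example in CI) may also want to wait for them. The following sketch polls the health status that docker inspect reports for each container. The function names are illustrative, and the container names you pass in depend on your Compose project, so treat any specific names as assumptions to verify with docker ps.

```python
import subprocess
import time


def container_health(name):
    """Return the Docker health status of a container, or None if it cannot be read."""
    try:
        result = subprocess.run(
            ["docker", "inspect", "--format", "{{.State.Health.Status}}", name],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


def wait_until_healthy(names, probe=container_health, timeout=300.0,
                       interval=5.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until every container is 'healthy'; return the set still pending at timeout."""
    pending = set(names)
    deadline = clock() + timeout
    while True:
        # Keep only the containers that have not yet reported healthy.
        pending = {name for name in pending if probe(name) != "healthy"}
        if not pending or clock() >= deadline:
            return pending
        sleep(interval)
```

An empty return value means all services reported healthy within the timeout. The probe, sleep, and clock parameters exist so the loop can be exercised without Docker.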

Step 4: Start RAGFlow Server

Launch the RAGFlow application container, which runs the entrypoint.sh script to start all application components. The entrypoint orchestrates the startup of the web server (Flask API behind Nginx), task executor workers, and three optional components: the data sync service, the MCP server, and the admin server.

What happens:

service_conf.yaml is generated from the template with environment variable substitution
Nginx starts as the reverse proxy on the configured HTTP port
The Flask API server starts and initializes database schema, seed data, and plugins
Task executor workers start and begin polling for document processing tasks
Background threads start for progress monitoring and maintenance
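
To illustrate the first item, here is a minimal sketch of environment variable substitution into a template. It is not the entrypoint's actual implementation: it assumes shell-style ${VAR} and ${VAR:-default} placeholders, and the YAML keys in the usage note are hypothetical.

```python
import re

# Matches ${NAME} and ${NAME:-default}.
_PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def render_template(template, env):
    """Replace ${VAR} and ${VAR:-default} placeholders using values from env."""
    def substitute(match):
        name = match.group("name")
        default = match.group("default")
        value = env.get(name)
        if value:
            # A set, non-empty variable always wins.
            return value
        if default is not None:
            # ':-' falls back to the default when the variable is unset or empty.
            return default
        raise KeyError(f"environment variable {name} is not set and has no default")
    return _PLACEHOLDER.sub(substitute, template)
```

For example, rendering the line password: '${MYSQL_PASSWORD:-changeme}' with an empty environment yields password: 'changeme', while passing dict(os.environ) picks up whatever docker/.env exported. Unlike a shell, this sketch raises on an unset variable without a default, which surfaces configuration mistakes early.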

Step 5: Configure LLM Provider

Set up at least one LLM provider with a valid API key. This can be done through the web UI user settings page or by pre-configuring the service_conf.yaml.template. RAGFlow supports 66+ LLM providers including OpenAI, Anthropic, Google, Azure, local models via Ollama, and many more. Both chat models and embedding models need to be configured.

Key considerations:

At minimum, configure a chat model and an embedding model
API keys can be set per-user through the settings UI or system-wide via configuration
Different providers offer different model capabilities (embedding dimensions, context windows, etc.)
Local model providers like Ollama can be used for air-gapped deployments
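
The first consideration lends itself to a simple pre-flight check. The sketch below assumes the chosen model names have been collected into a plain dictionary; the key names chat_model and embedding_model are hypothetical labels for this example and are not claimed to match RAGFlow's configuration schema.

```python
# Hypothetical role names used only for this illustration.
REQUIRED_ROLES = ("chat_model", "embedding_model")


def missing_model_roles(settings):
    """Return the required roles that have no non-empty model name in settings."""
    return [
        role for role in REQUIRED_ROLES
        if not str(settings.get(role) or "").strip()
    ]
```

An empty list means both roles are filled; anything else names what still needs to be configured before documents can be embedded or chats answered.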

Step 6: Verify Deployment

Confirm the deployment is working correctly by checking service health, accessing the web UI, creating a test knowledge base, and running a test chat. Monitor logs for any errors during startup.

Verification steps:

Check container logs for the RAGFlow ASCII banner and successful startup message
Access the web UI at http://HOST:PORT, where PORT is the HTTP port configured in docker/.env
Register a user account or log in
Verify LLM provider configuration in user settings
Create a test knowledge base and upload a sample document
Verify document processing completes successfully
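
The first two verification steps can be partly automated. The following sketch uses only the Python standard library; the function names and the list of log markers are assumptions chosen for illustration, and the log text is expected to come from a command such as docker logs.

```python
import urllib.error
import urllib.request


def web_ui_reachable(url, timeout=5.0):
    """Return True if an HTTP GET to url succeeds with a status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers refused connections, timeouts, HTTP errors, and malformed URLs.
        return False


def find_error_lines(log_text, markers=("ERROR", "CRITICAL", "Traceback")):
    """Return the log lines that contain any of the given markers."""
    return [
        line for line in log_text.splitlines()
        if any(marker in line for marker in markers)
    ]
```

A reachable UI combined with an empty result from find_error_lines is a reasonable signal to proceed to the manual steps (registering a user and uploading a test document), which this sketch does not attempt to cover.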

Execution Diagram
