Workflow:Infiniflow Ragflow Docker Deployment

From Leeroopedia
Knowledge Sources
Domains: DevOps, Docker, Deployment
Last Updated: 2026-02-12 06:00 GMT

Overview

End-to-end process for deploying RAGFlow using Docker Compose, including infrastructure services setup, environment configuration, server startup, and verification of all system components.

Description

This workflow covers the complete deployment of RAGFlow using Docker. RAGFlow is a multi-service application that requires several infrastructure components: MySQL for metadata storage, Elasticsearch (or Infinity/OpenSearch) for document search and vector storage, Redis for caching and task queuing, and MinIO for object storage. Docker Compose orchestrates all of these services, handling networking, volume management, and health checks. The steps are prerequisite verification, environment configuration, service startup, LLM provider configuration, and deployment verification. Optional configurations such as GPU acceleration, document engine switching, and Kubernetes deployment via Helm charts are also in scope.

Usage

Execute this workflow when setting up a new RAGFlow instance for development, testing, or production use. This is the first workflow any new RAGFlow user or operator must complete before using any other RAGFlow features. It is also needed when upgrading to a new version or changing deployment configuration.

Execution Steps

Step 1: Verify Prerequisites

Ensure the host system meets the minimum requirements for running RAGFlow. This includes hardware requirements (CPU, RAM, disk), software dependencies (Docker, Docker Compose), and kernel parameter configuration for Elasticsearch.

Requirements:

  • CPU >= 4 cores
  • RAM >= 16 GB
  • Disk >= 50 GB
  • Docker >= 24.0.0 and Docker Compose >= v2.26.1
  • vm.max_map_count >= 262144 (for Elasticsearch)
  • Optional: gVisor for code execution sandbox

Key considerations:

  • The vm.max_map_count kernel parameter must be set persistently in /etc/sysctl.conf
  • Docker images are built for x86 platforms only; ARM64 requires building from source
  • gVisor (runsc) is only needed if you plan to use the code executor agent feature
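
The software and kernel checks above can be scripted. The following is a minimal sketch, not an official RAGFlow tool: the function names are invented for illustration, and only the three thresholds come from the requirements list.

```python
import re

# Minimums taken from the requirements list above.
MIN_DOCKER = (24, 0, 0)
MIN_COMPOSE = (2, 26, 1)
MIN_MAP_COUNT = 262144


def parse_version(text):
    """Extract the first X.Y.Z version found in a string as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def check_prerequisites(docker_version, compose_version, max_map_count):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if parse_version(docker_version) < MIN_DOCKER:
        problems.append("Docker is older than 24.0.0")
    if parse_version(compose_version) < MIN_COMPOSE:
        problems.append("Docker Compose is older than v2.26.1")
    if int(max_map_count) < MIN_MAP_COUNT:
        problems.append("vm.max_map_count is below 262144")
    return problems
```

On a Linux host the three inputs would typically be the output of `docker --version`, the output of `docker compose version`, and the contents of /proc/sys/vm/max_map_count. To make the kernel setting persistent, the line `vm.max_map_count=262144` goes into /etc/sysctl.conf.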

Step 2: Configure Environment

Clone the repository and configure environment variables. The primary configuration file is docker/.env which controls ports, passwords, image versions, and feature flags. The service_conf.yaml.template defines backend service connections and is auto-populated with environment variables at container startup.

Key configuration files:

  • docker/.env - System-level settings (HTTP port, MySQL/MinIO passwords, image tag, DOC_ENGINE, DEVICE)
  • docker/service_conf.yaml.template - Backend service configuration template
  • docker/docker-compose.yml - Main Docker Compose service definitions
  • docker/docker-compose-base.yml - Infrastructure-only services for development

Key considerations:

  • Set RAGFLOW_IMAGE to the desired version tag
  • Set MYSQL_PASSWORD and MINIO_PASSWORD to strong, non-default values
  • Set DOC_ENGINE to elasticsearch (default), or to infinity to use Infinity as the document store
  • Set DEVICE to gpu for GPU-accelerated document processing
  • Configure user_default_llm in service_conf.yaml.template with your preferred LLM provider
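
Editing docker/.env by hand works, but for repeatable setups the changes can be applied programmatically. The sketch below assumes the file uses plain KEY=value lines; the helper name `set_env_values` is hypothetical and the values in any call are placeholders.

```python
def set_env_values(env_text, updates):
    """Return env_text with each KEY=value line replaced or appended.

    Comments, blank lines, and unrelated keys are kept unchanged.
    """
    remaining = dict(updates)
    out_lines = []
    for line in env_text.splitlines():
        stripped = line.strip()
        key = stripped.split("=", 1)[0].strip() if "=" in stripped else None
        if key in remaining and not stripped.startswith("#"):
            # Overwrite the existing assignment in place.
            out_lines.append(f"{key}={remaining.pop(key)}")
        else:
            out_lines.append(line)
    # Keys that were not present in the file are appended at the end.
    for key, value in remaining.items():
        out_lines.append(f"{key}={value}")
    return "\n".join(out_lines) + "\n"
```

A call such as `set_env_values(text, {"DOC_ENGINE": "infinity", "DEVICE": "gpu"})` returns the updated text, which can then be written back to docker/.env. Working on strings rather than files keeps the helper easy to test and leaves the decision to overwrite the file to the caller.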

Step 3: Start Infrastructure Services

Launch the infrastructure services using Docker Compose. This starts MySQL, Elasticsearch (or the configured document engine), Redis, and MinIO. Each service has health checks configured to ensure proper startup before dependent services begin.

What happens:

  • MySQL initializes the database schema on first run
  • Elasticsearch creates indexes with configured mappings
  • Redis starts the caching and message queue service
  • MinIO initializes the object storage buckets
  • Health checks verify each service is responding before proceeding
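
Docker Compose performs the health checks itself, but an external script sometimes needs to wait for the same services, for example in CI. The following sketch polls TCP ports until each service accepts connections. It is a weaker test than the Compose health checks, and the service names, hosts, and ports passed in are whatever the local configuration uses; none are taken from the RAGFlow repository.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_services(services, probe=port_is_open, attempts=30, delay=2.0):
    """Poll each (name, host, port) until it responds; return names that never did."""
    pending = list(services)
    for attempt in range(attempts):
        # Keep only the services that still fail the probe.
        pending = [svc for svc in pending if not probe(svc[1], svc[2])]
        if not pending:
            break
        if attempt < attempts - 1:
            time.sleep(delay)
    return [name for name, _host, _port in pending]
```

The `probe` parameter is injectable so that a stricter check (an HTTP request, a database ping) can replace the plain port test without changing the polling loop.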

Step 4: Start RAGFlow Server

Launch the RAGFlow application container which runs the entrypoint.sh script to start all application components. The entrypoint orchestrates the startup of the web server (Flask API behind Nginx), task executor workers, optional data sync service, optional MCP server, and optional admin server.

What happens:

  • service_conf.yaml is generated from the template with environment variable substitution
  • Nginx starts as the reverse proxy on the configured HTTP port
  • The Flask API server starts and initializes database schema, seed data, and plugins
  • Task executor workers start and begin polling for document processing tasks
  • Background threads start for progress monitoring and maintenance
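
The first item, generating service_conf.yaml from the template, amounts to placeholder substitution. The sketch below illustrates the idea under the assumption that placeholders use the shell-style `${NAME}` and `${NAME:-default}` forms; it is not the code that entrypoint.sh runs.

```python
import re

# Matches ${NAME} and ${NAME:-default}. The placeholder syntax is an assumption.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def render_template(template, env):
    """Replace each placeholder with env[NAME], falling back to the inline default."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if value:
            return value
        if default is not None:
            # Like the shell ":-" form, the default also covers empty values.
            return default
        if value is None:
            raise KeyError(f"no value or default for {name}")
        return value  # explicitly set to an empty string
    return _PLACEHOLDER.sub(substitute, template)
```

Passing `os.environ` as `env` reproduces the container behavior of reading values from the environment. Raising on an unset variable with no default is a design choice of this sketch so that missing settings surface early instead of producing an empty field.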

Step 5: Configure LLM Provider

Set up at least one LLM provider with a valid API key. This can be done through the web UI user settings page or by pre-configuring the service_conf.yaml.template. RAGFlow supports 66+ LLM providers including OpenAI, Anthropic, Google, Azure, local models via Ollama, and many more. Both chat models and embedding models need to be configured.

Key considerations:

  • At minimum, configure a chat model and an embedding model
  • API keys can be set per-user through the settings UI or system-wide via configuration
  • Different providers offer different model capabilities (embedding dimensions, context windows, etc.)
  • Local model providers like Ollama can be used for air-gapped deployments

Step 6: Verify Deployment

Confirm the deployment is working correctly by checking service health, accessing the web UI, creating a test knowledge base, and running a test chat. Monitor logs for any errors during startup.

Verification steps:

  • Check container logs for the RAGFlow ASCII banner and successful startup message
  • Access the web UI at http://HOST:PORT, where PORT is the HTTP port configured in docker/.env
  • Register a user account or log in
  • Verify LLM provider configuration in user settings
  • Create a test knowledge base and upload a sample document
  • Verify document processing completes successfully
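
The first two verification steps can be partly automated. The sketch below checks that the web UI answers over HTTP and scans captured log text for suspicious lines. The marker strings are illustrative defaults, not messages known to be emitted by RAGFlow, and the remaining steps (account registration, knowledge base creation) still need to be done in the UI.

```python
import urllib.error
import urllib.request


def web_ui_responds(url, timeout=5.0):
    """Return True if an HTTP GET to url returns a status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers refused connections, timeouts, HTTP errors, and malformed URLs.
        return False


def find_error_lines(log_text, markers=("ERROR", "Traceback")):
    """Return the log lines that contain any of the given markers."""
    return [line for line in log_text.splitlines()
            if any(marker in line for marker in markers)]
```

Here `url` would be the same http://HOST:PORT address used in the browser, and `log_text` would be the captured output of the RAGFlow container's logs.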
