
Principle:Nautechsystems Nautilus trader Config Driven Backtesting

From Leeroopedia


Field Value
sources https://github.com/nautechsystems/nautilus_trader, https://nautilustrader.io/docs/
domains backtesting, orchestration, configuration-management, scenario-management
last_updated 2026-02-10 12:00 GMT

Overview

Config-Driven Backtesting is the principle of orchestrating one or more complete backtest simulations entirely from declarative configuration objects: a single orchestrator node consumes a list of run configurations and produces a corresponding list of results, with no imperative engine setup code required.

Description

Traditional backtesting workflows require users to write imperative code that manually creates an engine, adds venues, loads data, registers strategies, and calls run(). This approach is error-prone, difficult to parameterize, and does not scale to multi-scenario experiments.

Config-Driven Backtesting solves this by establishing a layered configuration hierarchy:

  • Run configuration -- the top-level object that combines venue configurations, data configurations, and an engine configuration into a single, self-contained backtest specification.
  • Engine configuration -- the system kernel settings (trader ID, log level, sub-engine configs) and the list of importable strategy/actor configurations.
  • Venue configurations -- one or more simulated exchange definitions (see Venue Configuration Schema).
  • Data configurations -- one or more data source definitions (see Data Configuration Schema).
  • Strategy configurations -- importable strategy specifications (see Importable Strategy Configuration).

An orchestration node accepts a list of run configurations and, for each one:

  1. Validates the configuration (checks venue/data consistency, time range validity).
  2. Builds a backtest engine with the specified kernel settings.
  3. Adds venues based on each venue configuration.
  4. Loads instruments from the data catalog.
  5. Loads data (either all-at-once or in streaming chunks).
  6. Runs the simulation.
  7. Collects results and optionally disposes the engine.

This approach yields several benefits:

  • Multi-scenario orchestration -- A single node can execute dozens of backtest runs with different parameter sets, strategies, or data windows.
  • Reproducibility -- The entire experiment is defined by serializable configuration objects.
  • Error isolation -- Each run configuration can specify whether exceptions should propagate or be logged and skipped.
  • Streaming support -- Large datasets can be processed in chunks via a configurable chunk_size, preventing memory exhaustion.
  • Post-run analysis -- The node retains access to engines and results for inspection after the run completes.

Usage

Use Config-Driven Backtesting whenever you need to:

  • Run a single backtest without writing engine setup boilerplate.
  • Execute a parameter sweep across multiple strategies, instruments, or time windows.
  • Orchestrate batch backtesting from configuration files (JSON, YAML, etc.).
  • Process large datasets via streaming mode with configurable chunk sizes.
  • Build automated backtesting pipelines in CI/CD or research environments.
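For the batch-from-configuration-files case, the mapping from a file to a list of run configurations is mechanical. The sketch below parses a hypothetical JSON experiment file describing two runs of the same strategy over different time windows; the field names and the strategy path are illustrative, and a real pipeline would build the library's run-config objects instead of plain dicts.

```python
import json

# Hypothetical experiment file: two scenarios, same strategy,
# different time windows (all field names are illustrative).
experiment = json.loads("""
[
  {"strategy": "mypkg.strategies:EMACross",
   "params": {"fast": 10, "slow": 20},
   "start": "2024-01-01", "end": "2024-06-30"},
  {"strategy": "mypkg.strategies:EMACross",
   "params": {"fast": 10, "slow": 20},
   "start": "2024-07-01", "end": "2024-12-31"}
]
""")

def to_run_config(spec: dict) -> dict:
    # Normalise one scenario dict into a run-config shape;
    # a real pipeline would construct the library's config object here.
    return {
        "strategy_path": spec["strategy"],
        "strategy_params": spec["params"],
        "start": spec["start"],
        "end": spec["end"],
    }

configs = [to_run_config(s) for s in experiment]
print(len(configs))  # → 2 (one run config per scenario in the file)
```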

Theoretical Basis

Config-Driven Backtesting follows the Orchestrator pattern, where a central coordinator consumes declarative job descriptions and manages their execution lifecycle.

Pseudocode for config-driven backtest orchestration:

BacktestRunConfig:
    venues  : list[VenueConfig]
    data    : list[DataConfig]
    engine  : EngineConfig   # Includes strategies, actors, kernel settings
    chunk_size : int | None  # None = load all at once; int = streaming
    start   : datetime | None
    end     : datetime | None
    raise_exception : bool
    dispose_on_completion : bool

BacktestNode:
    configs : list[BacktestRunConfig]
    engines : map[config_id -> BacktestEngine]

    FUNCTION run() -> list[BacktestResult]:
        build()  # Create engines for all configs
        results = []

        FOR EACH config IN configs:
            TRY:
                engine = engines[config.id]
                load_data(engine, config.data, config.chunk_size)
                engine.run(start=config.start, end=config.end)
                results.append(engine.get_result())

                IF config.dispose_on_completion:
                    engine.dispose()
                ELSE:
                    engine.clear_data()

            CATCH exception:
                IF config.raise_exception:
                    RAISE exception
                ELSE:
                    LOG exception

        RETURN results
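The run loop above, including the per-run error isolation, can be sketched in Python. The engine here is a stub so that only the control flow is shown; the method names follow the pseudocode, not any real library API.

```python
# Stub engine: only the interface used by the orchestration loop.
class FakeEngine:
    def __init__(self, config_id):
        self.config_id = config_id
        self.disposed = False

    def run(self, start=None, end=None):
        if self.config_id == "bad":
            raise RuntimeError("simulated failure")

    def get_result(self):
        return {"id": self.config_id, "pnl": 0.0}

    def dispose(self):
        self.disposed = True

    def clear_data(self):
        pass

def run_node(configs):
    # Build phase: one engine per run configuration.
    engines = {c["id"]: FakeEngine(c["id"]) for c in configs}
    results = []
    for config in configs:
        try:
            engine = engines[config["id"]]
            engine.run(config.get("start"), config.get("end"))
            results.append(engine.get_result())
            if config.get("dispose_on_completion", True):
                engine.dispose()
            else:
                engine.clear_data()
        except Exception as exc:
            if config.get("raise_exception", False):
                raise
            print(f"run {config['id']} failed: {exc}")  # log and skip
    return results

results = run_node([{"id": "a"}, {"id": "bad"}, {"id": "c"}])
# The failing run is logged and skipped; the other two still produce results.
```

Note how `raise_exception` selects between fail-fast behaviour (useful in CI) and log-and-continue behaviour (useful in large parameter sweeps where one bad scenario should not abort the batch).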

Build phase pseudocode:

FUNCTION build():
    FOR EACH config IN configs:
        engine = BacktestEngine(config.engine)

        FOR EACH venue_config IN config.venues:
            engine.add_venue(venue_config)

        FOR EACH data_config IN config.data:
            instruments = catalog.load_instruments(data_config)
            FOR EACH instrument IN instruments:
                engine.add_instrument(instrument)

        engines[config.id] = engine

Data loading modes:

IF chunk_size IS None:
    # One-shot mode: load all data, then run
    FOR EACH data_config IN config.data:
        data = catalog.query(data_config)
        engine.add_data(data)
    engine.run()

ELSE:
    # Streaming mode: load and process in chunks
    session = DataBackendSession(chunk_size)
    FOR EACH data_config IN config.data:
        session.add_query(data_config)
    FOR EACH chunk IN session:
        engine.add_data(chunk)
        engine.run(streaming=True)
        engine.clear_data()
    engine.end()
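The difference between the two modes is easiest to see with the catalog replaced by an in-memory list, so only the chunking logic remains. This is a minimal sketch of the idea, not the library's data backend.

```python
def query_all(data):
    # One-shot mode: materialise everything at once.
    return list(data)

def query_chunks(data, chunk_size):
    # Streaming mode: yield bounded-size slices, keeping memory flat.
    for i in range(0, len(data), chunk_size):
        yield data[i:i + chunk_size]

ticks = list(range(10))  # stand-in for a large market-data query result

# One-shot mode: a single add_data / run cycle over all ten items.
loaded = query_all(ticks)

# Streaming mode: one add_data / run / clear_data cycle per chunk.
chunks = list(query_chunks(ticks, chunk_size=4))
print(len(loaded), len(chunks))  # → 10 3
```

With `chunk_size=4`, ten items produce chunks of 4, 4, and 2: peak memory is bounded by the chunk size rather than by the full dataset, which is the point of streaming mode.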

The key architectural insight is that the node does not embed any strategy-specific or venue-specific logic -- it is a generic executor that derives all behavior from the configuration objects it receives.
