Principle:Nautechsystems Nautilus trader Execution Reconciliation
| Field | Value |
|---|---|
| sources | https://github.com/nautechsystems/nautilus_trader, https://nautilustrader.io/docs/ |
| domains | algorithmic trading, execution management, state reconciliation, fault tolerance |
| last_updated | 2026-02-10 12:00 GMT |
Overview
Execution reconciliation is the process of comparing the local execution state (orders, fills, positions) maintained by a trading system with the authoritative state reported by the exchange, detecting and resolving any discrepancies caused by network failures, process restarts, or missed events.
Description
In live trading, the local system's view of order and position state can diverge from the exchange's truth for many reasons:
- The trading process crashed or was restarted, missing fill or status-change events.
- A WebSocket connection dropped during a critical event window.
- Network latency caused an order acknowledgement to arrive after a timeout.
- Another system or manual intervention modified orders on the same account.
The Execution Reconciliation principle defines a systematic approach to detecting and correcting these divergences. It operates at two distinct time scales:
- Startup reconciliation -- When the trading system starts (or reconnects), it requests a comprehensive execution mass status from each connected exchange client, covering open orders, recent fills, and position snapshots. It then compares this external state against the locally cached state and generates synthetic events (accepted, filled, canceled, expired) to bring the local state into alignment.
- Continuous reconciliation -- While running, the system periodically polls the venue for open order status and compares position quantities. When an order is open locally but missing at the venue, the status query is repeated a configurable number of times before the order is resolved as canceled. Position quantity mismatches trigger lookups for missing fill reports. In-flight orders that exceed a time threshold are queried individually for their current status.
Supporting the reconciliation process is a retry mechanism with exponential backoff and jitter, so that transient API failures do not abort reconciliation prematurely and repeated polling does not exhaust venue rate limits or trigger account bans.
Usage
This principle is critical for any trading system that:
- Operates in production where network or process failures are expected.
- Must maintain accurate position tracking for P&L reporting and risk management.
- Runs multiple instances or shares an exchange account with other systems.
- Deploys strategies that depend on knowing the precise state of in-flight orders.
Theoretical Basis
Eventual Consistency Model
A live trading system operates under an eventual consistency model: the exchange is the single source of truth, and the local system's state will eventually converge with it. Reconciliation is the mechanism that drives convergence. The key insight is that the system must be able to reconstruct its entire execution state from exchange reports alone, using the local cache as an optimisation rather than a requirement.
Startup Reconciliation Flow
System starts
    |
    v
Request ExecutionMassStatus from each client
    |
    v
Receive: OrderStatusReports, FillReports, PositionStatusReports
    |
    +-- For each OrderStatusReport:
    |       IF order exists locally AND states match  --> no action
    |       IF order exists locally AND states differ --> generate bridging events
    |       IF order NOT in local cache               --> generate OrderInitialized + bridging events
    |
    +-- For each FillReport:
    |       IF fill already applied locally --> skip
    |       IF fill is new                  --> generate OrderFilled event
    |
    +-- For each PositionStatusReport:
            IF position matches local --> no action
            IF quantity differs       --> generate inferred fills to close the gap
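The three branches of the flow above can be sketched in plain Python. This is a minimal illustration under stated assumptions: `OrderReport`, `plan_order_events`, `new_fill_ids`, and `inferred_fill` are hypothetical names invented for this example and are not the nautilus_trader API, and the `bridge_to:` marker stands in for whatever sequence of bridging events a real engine would emit.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class OrderReport:
    """Hypothetical stand-in for a venue order status report."""
    order_id: str
    status: str  # e.g. "ACCEPTED", "FILLED", "CANCELED"


def plan_order_events(local_status: Optional[str], report: OrderReport) -> list[str]:
    """Name the synthetic events needed to align one order with its report."""
    if local_status == report.status:
        return []  # states match: no action
    events = []
    if local_status is None:
        events.append("OrderInitialized")  # order is not in the local cache
    events.append(f"bridge_to:{report.status}")  # placeholder for bridging events
    return events


def new_fill_ids(applied: set[str], reported: list[str]) -> list[str]:
    """Return reported trade IDs not yet applied locally, skipping duplicates."""
    seen = set(applied)
    fresh = []
    for trade_id in reported:
        if trade_id not in seen:
            seen.add(trade_id)
            fresh.append(trade_id)
    return fresh


def inferred_fill(local_qty: Decimal, venue_qty: Decimal) -> Optional[tuple[str, Decimal]]:
    """Side and size of the fill that closes a signed position-quantity gap."""
    gap = venue_qty - local_qty
    if gap == 0:
        return None  # position matches: no action
    return ("BUY" if gap > 0 else "SELL", abs(gap))
```

Quantities are signed here (long positive, short negative), which is an assumption of the sketch; a gap of +0.5 therefore implies a missing buy of 0.5.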
Continuous Reconciliation Checks
Periodic tasks (configurable intervals):
1. In-flight order check (every inflight_check_interval_ms):
   - For in-flight orders (SUBMITTED, PENDING_UPDATE, or PENDING_CANCEL) older than inflight_check_threshold_ms
   - Request individual order status report from exchange
   - Reconcile state if divergent
2. Open order check (every open_check_interval_secs):
   - Request all open orders from exchange
   - Compare with locally cached open orders
   - Missing at exchange: repeat the query N times, then resolve as canceled
   - Present at exchange but closed locally: generate bridging events
3. Position check (every position_check_interval_secs):
   - Compare local position quantities with exchange position reports
   - If discrepancy: query recent fill reports and apply missing fills
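The open order check can be illustrated with a small counter that tracks consecutive polls in which an order was missing at the venue. This is a sketch, not the library's implementation: the class `OpenOrderChecker` and the parameter `max_missing_checks` (standing in for "N") are hypothetical, and the choice to resolve on the N-th consecutive miss is an assumption.

```python
from collections import defaultdict


class OpenOrderChecker:
    """Counts consecutive polls in which a locally open order was missing."""

    def __init__(self, max_missing_checks: int = 3) -> None:
        self.max_missing_checks = max_missing_checks  # hypothetical "N"
        self._missing: dict[str, int] = defaultdict(int)

    def check(self, local_open: set[str], venue_open: set[str]) -> dict[str, list[str]]:
        """Compare one poll result against the local view of open orders."""
        resolve_canceled = []
        for order_id in sorted(local_open - venue_open):
            self._missing[order_id] += 1
            if self._missing[order_id] >= self.max_missing_checks:
                resolve_canceled.append(order_id)
                del self._missing[order_id]
        # An order seen again at the venue resets its counter.
        for order_id in local_open & venue_open:
            self._missing.pop(order_id, None)
        # Open at the venue but not open locally: bridging events are needed.
        needs_bridging = sorted(venue_open - local_open)
        return {"resolve_canceled": resolve_canceled, "needs_bridging": needs_bridging}
```

Requiring several consecutive misses guards against resolving an order as canceled on the basis of a single stale or partial venue response.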
Retry with Exponential Backoff
Network operations during reconciliation (status queries, fill lookups) use a retry manager that implements exponential backoff with jitter:
delay = min(delay_max, delay_initial * backoff_factor ^ (attempt - 1))
if jitter:
    delay = random(delay_initial, delay)
Retry loop:
    attempt 1: execute function
        on failure: wait delay, increment attempt
    attempt 2: execute function
        on failure: wait delay, increment attempt
    ...
    attempt N (max_retries): execute function
        on failure: log error, return None
Backing off lowers the request rate, which reduces the risk of exhausting API rate limits through aggressive polling, and jitter spreads retries over time, which mitigates thundering-herd load on reconnection.
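The delay formula and retry loop above translate into Python roughly as follows. The function names `backoff_delay` and `run_with_retries` and the injectable `sleep` parameter are assumptions made for this sketch; the library's own retry manager may differ in signature, units, and behavior.

```python
import logging
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
log = logging.getLogger(__name__)


def backoff_delay(attempt: int, delay_initial: float, delay_max: float,
                  backoff_factor: float, jitter: bool = False) -> float:
    """Delay before the next attempt: capped exponential, optionally jittered."""
    delay = min(delay_max, delay_initial * backoff_factor ** (attempt - 1))
    if jitter:
        # Draw uniformly between the initial delay and the capped delay.
        delay = random.uniform(delay_initial, delay)
    return delay


def run_with_retries(func: Callable[[], T], max_retries: int, delay_initial: float,
                     delay_max: float, backoff_factor: float, jitter: bool = False,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Call func up to max_retries times; return None if every attempt fails."""
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except Exception as exc:  # a real system would catch narrower errors
            if attempt == max_retries:
                log.error("giving up after %d attempts: %r", attempt, exc)
                return None
            sleep(backoff_delay(attempt, delay_initial, delay_max, backoff_factor, jitter))
    return None
```

With `delay_initial=1.0`, `delay_max=10.0`, and `backoff_factor=2.0`, the unjittered delays for attempts 1 through 5 are 1, 2, 4, 8, and 10, the last one being capped by `delay_max`.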
Event Generation for State Bridging
When reconciliation discovers a discrepancy, it does not directly mutate the cache. Instead, it generates the appropriate domain events (e.g., OrderAccepted, OrderFilled, OrderCanceled) and publishes them through the normal event pipeline. This ensures that:
- All downstream components (strategies, portfolio, risk engine) receive consistent notifications.
- The event audit log remains complete and chronologically ordered.
- The cache update logic is exercised through a single, well-tested path.
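A toy event bus makes the design choice concrete: reconciliation publishes an event, and the cache changes only because it subscribes to the bus like every other component. All names here (`Event`, `EventBus`, `OrderCache`, `reconcile_order`, and the status-to-event mapping) are hypothetical and greatly simplified relative to a real order state machine.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed mapping from a venue-reported status to the event that produces it.
EVENT_FOR_STATUS = {
    "ACCEPTED": "OrderAccepted",
    "FILLED": "OrderFilled",
    "CANCELED": "OrderCanceled",
}
STATUS_FOR_EVENT = {name: status for status, name in EVENT_FOR_STATUS.items()}


@dataclass(frozen=True)
class Event:
    name: str
    order_id: str
    reconciliation: bool = False  # marks events synthesized by reconciliation


class EventBus:
    def __init__(self) -> None:
        self.log: list[Event] = []  # audit log of everything published
        self._handlers: list[Callable[[Event], None]] = []

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self._handlers.append(handler)

    def publish(self, event: Event) -> None:
        self.log.append(event)
        for handler in self._handlers:
            handler(event)


class OrderCache:
    def __init__(self) -> None:
        self.status: dict[str, str] = {}

    def on_event(self, event: Event) -> None:
        # The only code path that changes cached order state.
        self.status[event.order_id] = STATUS_FOR_EVENT[event.name]


def reconcile_order(bus: EventBus, cache: OrderCache, order_id: str, venue_status: str) -> None:
    """Publish a synthetic event on divergence; never write to the cache directly."""
    if cache.status.get(order_id) == venue_status:
        return
    bus.publish(Event(EVENT_FOR_STATUS[venue_status], order_id, reconciliation=True))
```

Because a strategy handler and the cache are both ordinary subscribers, a reconciled cancel reaches them through the same call as a live cancel would, and a second reconciliation pass with the same venue status publishes nothing.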
Cache Purging for Long-Running Systems
For high-frequency trading systems that generate large numbers of orders, the execution engine supports periodic purging of closed orders and positions from the in-memory cache (with configurable buffer periods), preventing unbounded memory growth while optionally preserving database records for post-trade analysis.
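A buffer-based purge can be sketched as below. The `CachedOrder` record, the `purge_closed_orders` function, and the `buffer_secs` parameter are hypothetical names for illustration; the engine's actual configuration options and cache interface are not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedOrder:
    order_id: str
    closed_at: Optional[float]  # epoch seconds; None while the order is open


def purge_closed_orders(cache: dict[str, CachedOrder], now: float, buffer_secs: float) -> list[str]:
    """Remove orders closed at least buffer_secs ago; return the purged IDs."""
    cutoff = now - buffer_secs
    purged = [
        order_id
        for order_id, order in cache.items()
        if order.closed_at is not None and order.closed_at <= cutoff
    ]
    for order_id in purged:
        del cache[order_id]  # in-memory only; persisted records are untouched
    return purged
```

The buffer keeps recently closed orders available to reconciliation and to late-arriving events, while open orders are never eligible for purging.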