Principle:Mage ai Mage ai Record Validation and Batching

From Leeroopedia


Knowledge Sources
Domains Data_Integration, Data_Quality, Batch_Processing
Last Updated 2026-02-09 00:00 GMT

Overview

A record validation and batch accumulation mechanism that validates individual records against registered schemas, prepares data types, and accumulates records for batch export.

Description

Record Validation and Batching bridges the gap between raw Singer RECORD messages and the export layer. Each record is validated against the JSON Schema registered for its stream. As part of that processing, array and object column values that arrive as strings are parsed into native types, override columns are injected, and the record is reduced to the properties defined in the schema. In batch mode, validated records accumulate per stream until a flush is triggered; in immediate mode, each record is exported individually.
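
The difference between batch mode and immediate mode can be illustrated with a minimal sketch. The RecordSink class, its handle and flush methods, and the export callback are hypothetical names chosen for illustration; only batches_by_stream is a name taken from this page, and the real destination may structure this differently.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


class RecordSink:
    """Hypothetical sink: batches records per stream, or exports each one at once."""

    def __init__(
        self,
        export: Callable[[str, List[Record]], None],
        batch_mode: bool = True,
    ) -> None:
        self.export = export
        self.batch_mode = batch_mode
        # One list of pending records per stream name.
        self.batches_by_stream: Dict[str, List[Record]] = defaultdict(list)

    def handle(self, stream: str, record: Record) -> None:
        if self.batch_mode:
            # Batch mode: hold the record until flush() is called.
            self.batches_by_stream[stream].append(record)
        else:
            # Immediate mode: export a single-record batch right away.
            self.export(stream, [record])

    def flush(self) -> None:
        # Export every non-empty batch, then forget the pending records.
        for stream, records in self.batches_by_stream.items():
            if records:
                self.export(stream, records)
        self.batches_by_stream.clear()


exported = []
sink = RecordSink(lambda stream, rows: exported.append((stream, rows)))
sink.handle("users", {"id": 1})
sink.handle("users", {"id": 2})
sink.handle("orders", {"id": 10})
sink.flush()  # exports one batch per stream
```

In this sketch nothing reaches the export callback in batch mode until flush() runs, which is the trade-off the two modes express: fewer, larger writes versus lower latency per record.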

Usage

This step is applied automatically to every RECORD message that arrives at the destination. It requires that process_schema has already been called for the record's stream, so that a schema is registered to validate against.
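
The ordering requirement can be sketched as a small message dispatcher. This is an illustration under assumptions, not the destination's actual code: MiniDestination, process_record, and process_line are hypothetical names, and the process_schema signature shown here is simplified. The SCHEMA and RECORD message shapes follow the Singer message format.

```python
import json
from typing import Any, Dict, List, Tuple


class MiniDestination:
    """Hypothetical destination that refuses records for unregistered streams."""

    def __init__(self) -> None:
        self.schemas: Dict[str, Dict[str, Any]] = {}
        self.records: List[Tuple[str, Dict[str, Any]]] = []

    def process_schema(self, stream: str, schema: Dict[str, Any]) -> None:
        # Register the schema so later RECORD messages can be validated.
        self.schemas[stream] = schema

    def process_record(self, stream: str, record: Dict[str, Any]) -> None:
        if stream not in self.schemas:
            raise RuntimeError(
                f"RECORD for stream '{stream}' arrived before its SCHEMA"
            )
        self.records.append((stream, record))

    def process_line(self, line: str) -> None:
        # Dispatch one Singer message (a single JSON object per line).
        message = json.loads(line)
        if message["type"] == "SCHEMA":
            self.process_schema(message["stream"], message["schema"])
        elif message["type"] == "RECORD":
            self.process_record(message["stream"], message["record"])


destination = MiniDestination()
destination.process_line(
    '{"type": "SCHEMA", "stream": "users", "key_properties": ["id"], '
    '"schema": {"properties": {"id": {"type": "integer"}}}}'
)
destination.process_line('{"type": "RECORD", "stream": "users", "record": {"id": 1}}')
```

Sending the RECORD line first would raise RuntimeError in this sketch, which makes the precondition explicit instead of failing later during validation.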

Theoretical Basis

Record processing pipeline:

  1. Extract: Filter the record down to schema properties only (applies only when a catalog is provided)
  2. Parse arrays: Convert string-encoded arrays/objects to native Python types
  3. Inject overrides: Add the STREAM_OVERRIDE_SETTINGS_COLUMNS values
  4. Validate: Run Draft4Validator on each column value (skipped when disable_column_type_check is set)
  5. Accumulate or export: Batch mode appends the record to batches_by_stream; immediate mode calls export_data directly
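
Steps 1 to 4 of the pipeline can be sketched with the standard library alone. Several things here are assumptions made for illustration: prepare_record and its helpers are hypothetical names, the override column load_ts is invented, and a plain type lookup stands in for Draft4Validator, so it covers only the JSON Schema "type" keyword. Step 5 is left out; the returned record would then be appended to a per-stream batch or exported immediately.

```python
import json
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]

# Simplified stand-in for Draft4Validator: JSON Schema type name -> Python type.
JSON_TYPES = {
    "string": str, "integer": int, "number": (int, float), "boolean": bool,
    "array": list, "object": dict, "null": type(None),
}


def _types_of(prop_schema: Dict[str, Any]) -> List[str]:
    declared = prop_schema.get("type", [])
    return [declared] if isinstance(declared, str) else list(declared)


def _matches(value: Any, type_name: str) -> bool:
    # bool is a subclass of int in Python, so keep it out of numeric types.
    if type_name in ("integer", "number") and isinstance(value, bool):
        return False
    py_type = JSON_TYPES.get(type_name)
    return True if py_type is None else isinstance(value, py_type)


def prepare_record(
    record: Record,
    schema: Dict[str, Any],
    overrides: Optional[Record] = None,
    disable_column_type_check: bool = False,
) -> Record:
    properties = schema.get("properties", {})
    # 1. Extract: keep only columns the schema declares.
    row = {k: v for k, v in record.items() if k in properties}
    # 2. Parse arrays: decode string-encoded arrays/objects.
    for col, value in list(row.items()):
        types = _types_of(properties[col])
        if isinstance(value, str) and ("array" in types or "object" in types):
            try:
                row[col] = json.loads(value)
            except json.JSONDecodeError:
                pass  # left as a string; the type check below will flag it
    # 3. Inject overrides: add columns supplied by settings.
    row.update(overrides or {})
    # 4. Validate: check each column value against its declared types.
    if not disable_column_type_check:
        for col, value in row.items():
            types = _types_of(properties.get(col, {}))
            if types and not any(_matches(value, t) for t in types):
                raise ValueError(f"Column '{col}' does not match types {types}")
    return row


schema = {"properties": {
    "id": {"type": "integer"},
    "tags": {"type": ["null", "array"]},
    "load_ts": {"type": "string"},  # hypothetical override column
}}
row = prepare_record(
    {"id": 1, "tags": '["a", "b"]', "extra": "dropped"},
    schema,
    overrides={"load_ts": "2024-01-01"},
)
```

Here the undeclared "extra" column is dropped, the "tags" string is decoded into a list, and the override column is added before the type check runs. A real implementation would delegate the check to a full JSON Schema validator.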

Related Pages

Implemented By
