Principle: Mage AI Record Validation and Batching
| Knowledge Sources | |
|---|---|
| Domains | Data_Integration, Data_Quality, Batch_Processing |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A record validation and batch accumulation mechanism that validates individual records against registered schemas, prepares data types, and accumulates records for batch export.
Description
Record Validation and Batching bridges the gap between raw Singer RECORD messages and the export layer. Each record is validated against the stream's registered JSON Schema; string-encoded array and object column values are parsed into native Python types, override columns are injected, and the record is trimmed to only the properties defined in the schema. In batch mode, validated records are accumulated per stream until a flush is triggered; in immediate mode, each record is exported individually.
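For context, a raw Singer RECORD message is one JSON object per line on the destination's stdin. The sketch below shows such a line being parsed; the stream name and column values are illustrative, not taken from the Mage AI codebase. Note the array column arriving string-encoded, which is why a parse step is needed before validation.

```python
import json

# One raw Singer RECORD line as it arrives at the destination.
# (Illustrative stream and values, not from Mage AI.)
line = '{"type": "RECORD", "stream": "users", "record": {"id": 1, "tags": "[\\"admin\\", \\"beta\\"]"}}'

msg = json.loads(line)
stream, record = msg["stream"], msg["record"]

# "tags" is a string-encoded array at this point; the parse step of the
# pipeline converts it to a native Python list before validation.
print(type(record["tags"]))
```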
Usage
Applied automatically for every RECORD message arriving at the destination. Requires that process_schema has already been called for the stream.
Theoretical Basis
Record processing pipeline:
- Extract: Filter record to schema properties only (if catalog provided)
- Parse arrays: Convert string-encoded arrays/objects to native Python types
- Inject overrides: Add STREAM_OVERRIDE_SETTINGS_COLUMNS values
- Validate: Run Draft4Validator on each column value (unless disable_column_type_check)
- Accumulate or export: Batch mode appends to batches_by_stream; immediate mode calls export_data directly
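The pipeline above can be sketched end to end. This is a minimal, self-contained approximation, not the Mage AI implementation: the schema, the `STREAM_OVERRIDE_SETTINGS_COLUMNS` values, and the `export_data` stub are hypothetical stand-ins, and validation here runs one `Draft4Validator` over the whole record rather than per column.

```python
import json
from collections import defaultdict

from jsonschema import Draft4Validator

# Hypothetical schema, as process_schema would have registered it.
SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
}

# Hypothetical override columns added to every record.
STREAM_OVERRIDE_SETTINGS_COLUMNS = {"_stream": "users"}

validator = Draft4Validator(SCHEMA)
batches_by_stream = defaultdict(list)


def export_data(stream, records):
    """Stand-in for the real export layer."""
    print(f"exported {len(records)} record(s) to {stream}")


def process_record(stream, record, batch_mode=True):
    # 1. Extract: keep only properties declared in the schema.
    record = {k: v for k, v in record.items() if k in SCHEMA["properties"]}
    # 2. Parse arrays/objects that arrived string-encoded.
    for col, spec in SCHEMA["properties"].items():
        if spec.get("type") in ("array", "object") and isinstance(record.get(col), str):
            record[col] = json.loads(record[col])
    # 3. Inject override columns.
    record.update(STREAM_OVERRIDE_SETTINGS_COLUMNS)
    # 4. Validate against the registered schema (skipped when a flag like
    #    disable_column_type_check is set in the real pipeline).
    validator.validate(record)
    # 5. Accumulate for a later batch flush, or export immediately.
    if batch_mode:
        batches_by_stream[stream].append(record)
    else:
        export_data(stream, [record])
    return record
```

A validation failure raises `jsonschema.ValidationError`, so a bad record stops before it can be accumulated or exported.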