Principle:Datajuicer Data juicer Operator Instantiation
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Design_Patterns |
| Last Updated | 2026-02-14 17:00 GMT |
Overview
A registry-based dynamic instantiation pattern that converts declarative operator specifications into executable processing objects.
Description
Operator Instantiation bridges the gap between a declarative configuration (a list of operator names and their parameters) and the actual Python objects that process data. It uses a Registry pattern where each operator class self-registers under a name, and the instantiation function looks up each name in the registry, passes the configured parameters, and returns a list of ready-to-execute operator instances. This eliminates the need for hard-coded import chains and enables extensibility through custom operators.
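The self-registration idea described above can be sketched as follows. This is a minimal illustration, not Data-Juicer's actual code: `Registry`, `OPERATORS`, `LengthFilter`, and `LowercaseMapper` are hypothetical names introduced only for this example.

```python
class Registry:
    """Maps operator names to operator classes (hypothetical helper)."""

    def __init__(self):
        self._classes = {}

    def register(self, name):
        """Return a class decorator that stores the class under `name`."""
        def decorator(cls):
            if name in self._classes:
                raise KeyError(f"operator {name!r} is already registered")
            self._classes[name] = cls
            return cls
        return decorator

    def get(self, name):
        return self._classes[name]


OPERATORS = Registry()  # assumed module-level registry


@OPERATORS.register("length_filter")
class LengthFilter:
    def __init__(self, min_len=0, max_len=10_000):
        self.min_len = min_len
        self.max_len = max_len

    def process(self, text):
        return self.min_len <= len(text) <= self.max_len


@OPERATORS.register("lowercase_mapper")
class LowercaseMapper:
    def process(self, text):
        return text.lower()


# Look the class up by name, then instantiate it with configured parameters.
op = OPERATORS.get("length_filter")(min_len=3, max_len=5)
print(op.process("abcd"))  # True
```

Because each class registers itself when its module is imported, the code that instantiates operators never has to import operator classes by name.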
Usage
Use this principle after configuration parsing to transform the process list from the config into live operator objects. This is required before any data processing step, as operators must be instantiated before they can be applied to datasets.
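The ordering (parse, then instantiate, then process) might look like the sketch below. It assumes each entry in the process list is a single-key mapping of operator name to parameters. The text above only says the list holds operator names and their parameters, so that shape is an assumption, as are `split_spec`, `StripMapper`, and `MinLengthFilter`.

```python
# Hypothetical operator classes standing in for real ones.
class StripMapper:
    def process(self, text):
        return text.strip()


class MinLengthFilter:
    def __init__(self, min_len=1):
        self.min_len = min_len

    def keep(self, text):
        return len(text) >= self.min_len


REGISTRY = {"strip_mapper": StripMapper, "min_length_filter": MinLengthFilter}


def split_spec(op_spec):
    """Split an assumed single-key mapping {name: params} into (name, params)."""
    if len(op_spec) != 1:
        raise ValueError(f"expected exactly one operator per entry, got {op_spec!r}")
    ((name, params),) = op_spec.items()
    return name, params or {}


# Step 1: the process list as it might look after configuration parsing.
process_list = [
    {"strip_mapper": None},
    {"min_length_filter": {"min_len": 3}},
]

# Step 2: instantiate every operator before touching any data.
operators = []
for spec in process_list:
    name, params = split_spec(spec)
    operators.append(REGISTRY[name](**params))

# Step 3: apply the live operators to the data in order.
samples = ["  hello ", " a ", "ok!"]
for op in operators:
    if hasattr(op, "keep"):
        samples = [s for s in samples if op.keep(s)]
    else:
        samples = [op.process(s) for s in samples]

print(samples)  # ['hello', 'ok!']
```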
Theoretical Basis
    # Abstract algorithm (NOT the real implementation)
    def instantiate_operators(config):
        operators = []
        for op_spec in config.process_list:
            op_name = get_name(op_spec)
            op_params = get_params(op_spec)
            # Look up the class in the registry (average-case O(1) dict lookup)
            op_class = REGISTRY[op_name]
            # Instantiate with the configured parameters
            op_instance = op_class(**op_params)
            operators.append(op_instance)
        return operators
The Registry pattern provides average-case O(1) lookup by name and decouples configuration from the class hierarchy: a new operator becomes available as soon as it is registered, with no change to the instantiation code.
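A minimal sketch of that extensibility, under the assumption that the registry is a plain dict and specs are already split into `(name, params)` pairs. `register`, `instantiate`, `UpperMapper`, and `SuffixMapper` are hypothetical names. The sketch also turns an unknown name into a readable error instead of a bare `KeyError`, which is a design choice of this example rather than something the source specifies.

```python
REGISTRY = {}


def register(name):
    """Class decorator that records the class under `name`."""
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator


def instantiate(process_list):
    """Turn (name, params) pairs into operator instances."""
    operators = []
    for name, params in process_list:
        try:
            op_class = REGISTRY[name]
        except KeyError:
            known = ", ".join(sorted(REGISTRY)) or "<none>"
            raise ValueError(
                f"unknown operator {name!r}; registered: {known}"
            ) from None
        operators.append(op_class(**params))
    return operators


@register("upper_mapper")
class UpperMapper:
    def process(self, text):
        return text.upper()


# A custom operator added later: instantiate() itself is not modified.
@register("suffix_mapper")
class SuffixMapper:
    def __init__(self, suffix=""):
        self.suffix = suffix

    def process(self, text):
        return text + self.suffix


ops = instantiate([("upper_mapper", {}), ("suffix_mapper", {"suffix": "!"})])
text = "hi"
for op in ops:
    text = op.process(text)
print(text)  # HI!
```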