Principle:Confident ai Deepeval Exception Hierarchy Design
| Knowledge Sources | |
|---|---|
| Domains | Error_Handling, Software_Architecture |
| Last Updated | 2026-02-14 09:30 GMT |
Overview
Principle of designing a structured exception hierarchy that distinguishes error origins (framework vs. user code) and error types (validation vs. runtime), enabling granular error handling in evaluation pipelines.
Description
Exception Hierarchy Design is the practice of creating a tree of custom exception classes that encode where an error originated and what kind of error occurred. In an evaluation framework context, this distinction is critical because:
- Framework errors (bugs, misconfiguration) should abort the operation and surface to the developer.
- User application errors (failures in the LLM app being tested) should be recorded as observations but not terminate the evaluation run.
- Validation errors (missing fields, mismatched inputs) should provide specific, actionable error messages.
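As an illustration of the last point, here is a minimal Python sketch of a validation error that names exactly what is missing. The class names follow the abstract hierarchy on this page; `validate_test_case` and the field names `input` and `actual_output` are hypothetical and are not taken from any real implementation.

```python
class FrameworkError(Exception):
    """Hypothetical base class for errors raised by the framework itself."""


class ValidationError(FrameworkError):
    """Hypothetical error for missing or invalid test case inputs."""


def validate_test_case(test_case: dict, required: tuple = ("input", "actual_output")) -> None:
    """Raise ValidationError naming every required field that is absent or None."""
    missing = [name for name in required if test_case.get(name) is None]
    if missing:
        raise ValidationError(
            f"Test case is missing required field(s): {', '.join(missing)}. "
            "Provide them before running the evaluation."
        )


try:
    # Assumed example input: only one of the two required fields is present.
    validate_test_case({"input": "What is 2 + 2?"})
except ValidationError as exc:
    message = str(exc)
```

Listing the offending fields in the message is what makes the error actionable: the caller can fix the test case without reading framework code.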
The hierarchy uses two independent base classes rather than a single custom root (both derive only from the language's built-in `Exception`), because framework errors and user errors require fundamentally different handling strategies. Framework errors propagate up the call stack, while user errors are caught at the evaluation boundary and attached to trace/span records.
This design enables a "resilient evaluation" mode in which a batch of test cases can be processed even when some user applications fail, with failures recorded alongside successful results.
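A resilient batch run might look like the following sketch. It is only an illustration under stated assumptions: `flaky_app`, `run_batch`, and the result dictionary layout are hypothetical, and the exception classes mirror the abstract hierarchy rather than a real API.

```python
class FrameworkError(Exception):
    """Hypothetical: raised for framework bugs or misconfiguration."""


class UserAppError(Exception):
    """Hypothetical: raised when the application under test fails."""


def flaky_app(prompt: str) -> str:
    """Stand-in for a user's LLM app; fails on one specific input."""
    if prompt == "bad":
        raise UserAppError("upstream model call timed out")
    return prompt.upper()


def run_batch(prompts):
    """Evaluate every prompt, recording user app failures instead of stopping."""
    results = []
    for prompt in prompts:
        try:
            results.append({"input": prompt, "output": flaky_app(prompt), "error": None})
        except UserAppError as exc:
            # Non-fatal: keep the failure next to the successful results.
            results.append({"input": prompt, "output": None, "error": str(exc)})
        # FrameworkError is deliberately not caught here, so it aborts the batch.
    return results


results = run_batch(["a", "bad", "b"])
```

The second prompt fails, yet the third is still evaluated, and the failure is stored in the same list as the successes.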
Usage
Apply this principle when building evaluation or testing frameworks that execute user-provided code alongside framework logic. The separation of error origins allows the framework to implement error boundaries that protect the evaluation pipeline from user code failures.
Theoretical Basis
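The exception hierarchy follows a discriminated error pattern, in which an exception's class identifies both where the error originated and what kind of error it is: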
# Abstract hierarchy design (NOT real implementation)
Exception
├── FrameworkError # Aborts operation, indicates framework bug or misconfiguration
│ ├── ValidationError # Specific: missing/invalid inputs
│ └── ConfigError # Specific: misconfiguration
└── UserAppError # Recorded but non-fatal to the pipeline
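In Python, the tree above could be written roughly as follows. This is a sketch of the abstract design, not the real implementation; the class names are the illustrative ones from the tree, and `branches_are_independent` is a hypothetical name used only to check the structure.

```python
class FrameworkError(Exception):
    """Aborts the operation: framework bug or misconfiguration."""


class ValidationError(FrameworkError):
    """Specific: missing or invalid inputs."""


class ConfigError(FrameworkError):
    """Specific: misconfiguration."""


class UserAppError(Exception):
    """Failure in the application under test; recorded but non-fatal."""


# The two branches share only the built-in Exception as an ancestor.
branches_are_independent = (
    not issubclass(UserAppError, FrameworkError)
    and not issubclass(FrameworkError, UserAppError)
)
```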
The key design decision is that `UserAppError` does not inherit from `FrameworkError`. This means an `except FrameworkError` block will never accidentally swallow user app failures, and vice versa. The evaluation engine's error boundary uses this separation:
# Abstract error boundary logic (NOT real implementation)
try:
    result = execute_user_app(test_case)
except UserAppError as e:
    record_on_trace(e)  # non-fatal: continue evaluation
except FrameworkError as e:
    propagate(e)        # fatal: abort this test case
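A runnable variant of this boundary is sketched below. It assumes that failures in user code are wrapped in `UserAppError` at the point where the framework calls that code; the source does not say how `UserAppError` is produced, so the wrapping step is an assumption. The helpers `evaluate` and `broken_app`, and the use of a plain list as the trace, are hypothetical.

```python
class FrameworkError(Exception):
    """Hypothetical framework-side error (fatal)."""


class UserAppError(Exception):
    """Hypothetical user-application error (recorded, non-fatal)."""


def execute_user_app(app, test_case):
    """Call user code and re-raise any of its failures as UserAppError."""
    try:
        return app(test_case)
    except Exception as exc:
        # Assumption: anything raised inside user code counts as a user app failure.
        raise UserAppError(f"{type(exc).__name__}: {exc}") from exc


def evaluate(app, test_case, trace):
    """Error boundary: record user failures on the trace, let framework errors escape."""
    try:
        return execute_user_app(app, test_case)
    except UserAppError as exc:
        trace.append(str(exc))  # non-fatal: the caller continues
        return None
    # No handler for FrameworkError: it propagates to the caller unchanged.


def broken_app(test_case):
    """Stand-in for a failing user application."""
    return 1 / test_case["divisor"]


trace = []
outcome = evaluate(broken_app, {"divisor": 0}, trace)
```

Chaining with `raise ... from exc` keeps the original exception available on `__cause__`, so the recorded failure can still be traced back to the user code that raised it.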