Principle:Treeverse LakeFS Action Hook Definition
| Knowledge Sources | |
|---|---|
| Domains | Data_Quality, Automation |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Action hooks are event-driven functions that execute automatically on repository events, enabling declarative data quality gates and automated validation as part of the data lifecycle.
Description
lakeFS action hooks provide a mechanism for defining event-driven data quality gates using declarative YAML configuration files. These hooks intercept repository operations at well-defined lifecycle points and execute custom validation logic before or after the operation completes.
There are two fundamental hook types:
- webhook -- HTTP callbacks that invoke external services. When a triggering event occurs, lakeFS sends an HTTP POST request to the configured URL. The external service processes the request and returns a response indicating success or failure.
- lua -- Server-side Lua scripts that execute within the lakeFS process. These scripts have access to the lakeFS API and can perform validation, transformation, and notification logic without requiring an external service.
Action hooks are configured through YAML files stored under the reserved _lakefs_actions/ path prefix within a repository. This means hook definitions are versioned alongside data -- they can be branched, committed, merged, and reviewed just like any other object in the repository.
The supported repository events are:
- pre-commit / post-commit -- Triggered before/after a commit operation
- pre-merge / post-merge -- Triggered before/after a merge operation
- pre-create-branch / post-create-branch -- Triggered before/after branch creation
- pre-delete-branch / post-delete-branch -- Triggered before/after branch deletion
- pre-create-tag / post-create-tag -- Triggered before/after tag creation
- pre-delete-tag / post-delete-tag -- Triggered before/after tag deletion
Pre-event hooks act as gates: if they fail (webhook returns non-200, Lua script errors), the triggering operation is rejected entirely. Post-event hooks act as notifications: they fire after the operation succeeds and do not affect the operation outcome.
Usage
Use action hook definitions when you need to:
- Enforce data quality -- Validate schema conformance, check for null values, verify data freshness before commits or merges are finalized
- Implement governance policies -- Require metadata tags, enforce naming conventions, validate access patterns
- Automate notifications -- Trigger downstream pipelines, send Slack alerts, update external catalogs after successful operations
- Create promotion workflows -- Gate merges to production branches on automated quality checks, creating a CI/CD-like pipeline for data
Theoretical Basis
The action hook model draws from two established software engineering patterns:
Event-driven architecture: Rather than requiring manual invocation of validation scripts, hooks subscribe to repository lifecycle events and execute automatically. This ensures validation is never bypassed -- every commit and merge passes through the configured gates regardless of which client or user initiated the operation.
Declarative configuration: Hook behavior is defined through YAML configuration rather than imperative code. This separates the what (which events to intercept, which validations to run) from the how (the validation logic itself). The declarative approach makes hook configurations auditable, reviewable, and version-controlled.
Git-like hooks, lake-scale: Similar to Git pre-commit hooks, lakeFS action hooks intercept operations before they finalize. However, unlike Git hooks (which are local and client-side), lakeFS hooks are server-side and repository-scoped -- they apply to all clients and cannot be bypassed by individual users.
The combination of pre-event gating and post-event notification creates a complete observe-validate-act loop around every significant repository operation.