Principle:Spotify Luigi Documentation Build
| Knowledge Sources | |
|---|---|
| Domains | Documentation, Build_System |
| Last Updated | 2026-02-10 08:00 GMT |
Overview
Automated generation of project documentation from source code and structured markup to keep documentation synchronized with the codebase.
Description
A documentation build uses automated tools to generate project documentation from a combination of source code annotations (docstrings, type hints, comments) and hand-written markup files (reStructuredText, Markdown). Rather than maintaining documentation as a separate artifact that easily falls out of sync with the code, the build system extracts API documentation directly from the source, combines it with narrative guides and tutorials, and produces unified output in a publishable format (HTML, PDF, ePub). Because the API reference is regenerated from the code on every build, it reflects the current state of the code, reduces duplicated effort, and establishes a single source of truth for both API references and conceptual guides.
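As a rough illustration of how such a build is configured, here is a minimal sketch that assumes Sphinx as the build tool (an assumption; this page does not name one). The project name and the relative path are placeholders, while the extension names and settings are standard Sphinx options.

```python
# conf.py -- hypothetical Sphinx configuration; project name and paths are placeholders.
import os
import sys

# Make the package importable so autodoc can read its docstrings.
sys.path.insert(0, os.path.abspath(".."))

project = "example-project"  # placeholder name

extensions = [
    "sphinx.ext.autodoc",      # pull API documentation from docstrings
    "sphinx.ext.viewcode",     # link documented objects to highlighted source
    "sphinx.ext.intersphinx",  # resolve references into other projects' docs
]

# Lets references to standard-library objects resolve against the Python docs.
intersphinx_mapping = {"python": ("https://docs.python.org/3", None)}

html_theme = "alabaster"       # Sphinx's default HTML theme
exclude_patterns = ["_build"]  # keep build output out of the source scan
```

The narrative pages live next to this file as markup sources, so a single configuration governs both the API reference and the guides.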
Usage
Use a documentation build when the project has a public API that needs reference documentation, when the codebase contains docstrings that should be surfaced as user-facing documentation, when narrative documentation (tutorials, guides, architecture explanations) needs to be published alongside API docs, or when documentation must be generated as part of a continuous integration pipeline so that it stays current.
Theoretical Basis
Documentation build systems operate on a source extraction and rendering pipeline:
1. Source Discovery -- The build system identifies documentation sources:
   * Code sources -- Python modules, classes, and functions with docstrings
   * Markup sources -- Hand-written documentation files in structured formats (reStructuredText, Markdown)
   * Configuration -- Build configuration specifying which sources to include, theme settings, and output options
2. Introspection -- For API documentation, the system introspects the source code:
   * Import modules and inspect their public members
   * Extract docstrings, function signatures, parameter types, and return types
   * Resolve cross-references between documented entities (a method referencing a class, a function referencing a parameter type)
3. Parsing -- Markup sources are parsed into an abstract document tree:

       document -> sections -> paragraphs, code blocks, lists, tables, references

   The parser handles the markup language syntax (reStructuredText directives, roles, cross-references) and produces a format-independent representation.
4. Cross-Reference Resolution -- References between documents and between code and documents are resolved:
   * A narrative guide can reference an API class: :class:`TaskClass`
   * An API docstring can reference a guide section
   * Inter-module references are resolved across the entire documentation set
5. Transformation -- The document tree is transformed through a series of processing steps:
   * Applying the documentation theme (layout, styling)
   * Generating navigation structures (table of contents, sidebar, index)
   * Processing special directives (code highlighting, mathematical notation, diagrams)
   * Building search indices for full-text search
6. Rendering -- The transformed document tree is rendered to the target output format:
   * HTML -- Individual pages with navigation, search, and responsive layout
   * PDF -- Paginated document with table of contents and page references
   * ePub -- Electronic book format for offline reading
7. Extension System -- The build system supports extensions that add custom functionality:
   * Custom directives for domain-specific markup
   * Custom builders for additional output formats
   * Hooks into the build process for preprocessing or postprocessing
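The introspection and cross-reference steps can be approximated with the standard library alone. The sketch below is a toy illustration, not how any particular documentation tool is implemented: `describe_public_members`, `unresolved_references`, the `demo` module, `TaskClass`, and `run_task` are all hypothetical names introduced for this example.

```python
import inspect
import re
import types

# Matches Sphinx-style roles such as :class:`TaskClass` or :func:`run_task`.
ROLE_PATTERN = re.compile(r":(?:class|func):`([^`]+)`")


def describe_public_members(module):
    """Introspection: map each public class or function to (signature, summary)."""
    described = {}
    for name, member in inspect.getmembers(module):
        if name.startswith("_"):
            continue  # skip private and dunder attributes
        if not (inspect.isclass(member) or inspect.isfunction(member)):
            continue
        doc = inspect.getdoc(member) or ""
        summary = doc.splitlines()[0] if doc else ""
        try:
            signature = str(inspect.signature(member))
        except (TypeError, ValueError):
            signature = "(...)"  # some objects expose no signature
        described[name] = (signature, summary)
    return described


def unresolved_references(text, inventory):
    """Cross-reference check: list role targets that are not in the inventory."""
    return [target for target in ROLE_PATTERN.findall(text) if target not in inventory]


class TaskClass:
    """Hypothetical unit of work, used only for this illustration."""

    def __init__(self, name: str):
        self.name = name


def run_task(name: str, retries: int = 3) -> bool:
    """Run a task and report whether it succeeded."""
    return True


# Stand-in for a real package module.
demo = types.ModuleType("demo")
demo.TaskClass = TaskClass
demo.run_task = run_task

inventory = describe_public_members(demo)
guide = "Subclass :class:`TaskClass`, then call :func:`run_task` or :func:`missing`."
broken = unresolved_references(guide, inventory)  # only "missing" is unresolved
```

A real build system keeps a much richer inventory (modules, methods, section labels) and rewrites each resolved role into a link, but the shape is the same: collect documented names first, then check every reference against them.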
The fundamental principle is documentation as code: documentation sources live alongside the code in version control, are built by automated tools, and are validated and published through the same CI/CD pipeline as the software itself.
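One way to wire the build into a CI job is a small wrapper that fails the job when the build emits warnings. This is a sketch assuming Sphinx is the build tool and that sources live in a `doc` directory; both are assumptions, and `build_command` and `run_build` are hypothetical helper names. The `-b` and `-W` flags are standard `sphinx-build` options.

```python
import subprocess
import sys


def build_command(source_dir, output_dir, builder="html"):
    """Return the argv for a strict Sphinx build (-W turns warnings into errors)."""
    return [sys.executable, "-m", "sphinx", "-b", builder, "-W", source_dir, output_dir]


def run_build(source_dir="doc", output_dir="doc/_build/html"):
    """Run the build and return its exit code so the CI job can fail on non-zero."""
    return subprocess.run(build_command(source_dir, output_dir)).returncode
```

A CI step would call `run_build()` and exit with its return value, so a broken cross-reference or malformed docstring blocks the merge in the same way a failing test would.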