
Implementation:Microsoft LoRA Style Doc

From Leeroopedia



Overview

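The style_doc.py utility enforces consistent line-length formatting and styling rules for RST documentation files and Python docstrings. It originates in the Hugging Face Transformers project; the copy documented here lives under examples/NLU in the Microsoft LoRA repository.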

Description

This script provides automated formatting for two types of documentation:

  1. RST Files: The CodeStyler class processes RST files, wrapping text paragraphs to a configurable maximum line length (default 119 characters). It respects special RST constructs:
    • Code blocks (::) are left untouched.
    • Directive blocks (.. something::) are not restyled internally.
    • Textual blocks (.. note::, .. warning::) have their content restyled with proper indentation.
    • Lists (bullet, numbered) are re-wrapped while preserving list structure.
    • Tables are detected and left as-is.
    • Title/section underlines are extended to max_len.
  2. Python Docstrings: The DocstringStyler subclass extends CodeStyler with additional awareness of:
    • Argument definition blocks (Args:, Parameters:, Attributes:) where parameter description lines are preserved while sub-descriptions are wrapped.
    • Return/Raises sections treated as comment-style blocks.
    • Example blocks (::) marked as no-style zones.
    • Special docstring words (Args, Returns, Examples, etc.) get blank lines inserted before them.
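The core RST rule above (wrap prose, leave `::` code blocks untouched) can be illustrated with a minimal sketch. This is not the CodeStyler implementation: `wrap_rst` is a hypothetical helper built on the standard `textwrap` module, and it ignores directives, lists, tables, and title underlines.

```python
import textwrap


def wrap_rst(text: str, max_len: int = 119) -> str:
    """Wrap plain paragraphs; copy literal blocks (opened by '::') verbatim.

    Hypothetical, simplified stand-in for the behaviour described above.
    """
    out, paragraph = [], []
    in_literal = False       # True while inside an indented literal block
    pending_literal = False  # True when the last paragraph line ended with '::'

    def flush():
        # Emit the accumulated paragraph as one wrapped block.
        if paragraph:
            out.append(textwrap.fill(" ".join(paragraph), width=max_len))
            paragraph.clear()

    for line in text.split("\n"):
        stripped = line.strip()
        if in_literal:
            if stripped and not line.startswith(" "):
                in_literal = False  # a dedented line closes the literal block
            else:
                out.append(line)    # literal content is never restyled
                continue
        if not stripped:
            flush()
            out.append("")
            if pending_literal:
                in_literal, pending_literal = True, False
            continue
        paragraph.append(stripped)
        pending_literal = stripped.endswith("::")
    flush()
    return "\n".join(out)
```

Calling `wrap_rst(text, max_len=40)` on a paragraph ending in `::` followed by an indented 60-character line should wrap the paragraph while leaving the indented line at its original length.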

The SpecialBlock enum tracks three states: NOT_SPECIAL, NO_STYLE, and ARG_LIST, enabling the styler to switch formatting modes as it traverses document structure.
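To make the mode switching concrete, here is a small sketch that labels each line of a docstring with the state a styler would be in when it reads that line. The enum values match the signature listed on this page; `classify_lines`, `ARG_HEADERS`, and the indentation-based exit rule are assumptions made for illustration, not the script's actual logic.

```python
from enum import Enum


class SpecialBlock(Enum):
    NOT_SPECIAL = 0
    NO_STYLE = 1
    ARG_LIST = 2


# Headers that open an argument list (taken from the description above).
ARG_HEADERS = ("Args:", "Parameters:", "Attributes:")


def classify_lines(lines):
    """Return the SpecialBlock mode in effect for each line (hypothetical helper)."""
    mode = SpecialBlock.NOT_SPECIAL
    block_indent = 0  # indent of the line that opened the current special block
    labels = []
    for line in lines:
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        # Assumed exit rule: a non-empty line that is not indented deeper than
        # the opening line ends the special block.
        if mode is not SpecialBlock.NOT_SPECIAL and stripped and indent <= block_indent:
            mode = SpecialBlock.NOT_SPECIAL
        labels.append(mode)
        if mode is SpecialBlock.NOT_SPECIAL:
            if stripped in ARG_HEADERS:
                mode, block_indent = SpecialBlock.ARG_LIST, indent
            elif stripped.endswith("::"):
                mode, block_indent = SpecialBlock.NO_STYLE, indent
    return labels
```

The header line itself is labelled NOT_SPECIAL; only the indented lines beneath it pick up ARG_LIST or NO_STYLE.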

Usage

Use this utility when:

  • Enforcing documentation style in CI (check-only mode).
  • Auto-formatting RST docs and Python docstrings to meet the 119-character line length convention.
  • Preparing documentation for Sphinx builds by ensuring consistent formatting.

Code Reference

Source Location

examples/NLU/utils/style_doc.py (523 lines)

Signature

# Core classes
class SpecialBlock(Enum):
    NOT_SPECIAL = 0
    NO_STYLE = 1
    ARG_LIST = 2

class CodeStyler:
    def style(self, text: str, max_len: int = 119, min_indent: str = None) -> str: ...
    def style_paragraph(self, paragraph: list, max_len: int, no_style: bool = False, min_indent: str = None) -> str: ...

class DocstringStyler(CodeStyler): ...

# Public API
def style_rst_file(doc_file: str, max_len: int = 119, check_only: bool = False) -> bool: ...
def style_docstring(docstring: str, max_len: int = 119) -> str: ...
def style_file_docstrings(code_file: str, max_len: int = 119, check_only: bool = False) -> bool: ...
def style_doc_files(*files, max_len: int = 119, check_only: bool = False) -> list: ...
def main(*files, max_len: int = 119, check_only: bool = False) -> None: ...

# Helpers
def split_text_in_lines(text: str, max_len: int, prefix: str = "", min_indent: str = None) -> str: ...
def get_indent(line: str) -> str: ...
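The two helpers are listed only by signature. As a rough illustration of what helpers with these roles might do, the sketch below uses the hypothetical names `sketch_get_indent` and `sketch_split_text`; the bodies are assumptions (greedy word wrapping with a hanging indent equal to the prefix width), and the `min_indent` parameter is deliberately left out.

```python
import re


def sketch_get_indent(line: str) -> str:
    """Return the leading spaces/tabs of a line ('' if there are none)."""
    return re.match(r"[ \t]*", line).group(0)


def sketch_split_text(text: str, max_len: int, prefix: str = "") -> str:
    """Greedily wrap text to max_len; continuation lines align under the prefix."""
    words = text.split()
    if not words:
        return prefix.rstrip()
    indent = " " * len(prefix)  # hanging indent for continuation lines
    lines, current = [], prefix + words[0]
    for word in words[1:]:
        candidate = f"{current} {word}"
        if len(candidate) > max_len:
            lines.append(current)    # line is full, start a new one
            current = indent + word
        else:
            current = candidate
    lines.append(current)
    return "\n".join(lines)


print(sketch_split_text("alpha beta gamma delta", max_len=14, prefix="- "))
# - alpha beta
#   gamma delta
```

A hanging indent of this kind is what keeps re-wrapped list items aligned under their bullet.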

Import / CLI Usage

# Paths are relative to examples/NLU (see Source Location)
# Style specific files
python utils/style_doc.py docs/source/model_doc/bert.rst

# Style a directory recursively
python utils/style_doc.py docs/source/

# Check-only mode (for CI)
python utils/style_doc.py --check_only docs/source/

# Custom max line length
python utils/style_doc.py --max_len 100 docs/source/model_doc/bert.rst

I/O Contract

Inputs

Input | Type | Description
files | positional args | One or more file paths or directory paths to process
--max_len | int (optional) | Maximum line length; defaults to 119
--check_only | flag | If set, raise an error on needed changes instead of applying them
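The inputs above map naturally onto an `argparse` parser. The following is a sketch of such a parser, assuming the flag names and default listed in the table; `build_parser` and the help strings are hypothetical and may differ from the script's own setup.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser matching the documented inputs (illustrative only)."""
    parser = argparse.ArgumentParser(description="Restyle RST files and Python docstrings.")
    parser.add_argument("files", nargs="+", help="File or directory paths to process.")
    parser.add_argument("--max_len", type=int, default=119, help="Maximum line length.")
    parser.add_argument(
        "--check_only",
        action="store_true",
        help="Report files that need restyling instead of rewriting them.",
    )
    return parser


args = build_parser().parse_args(["--check_only", "docs/source/"])
print(args.files, args.max_len, args.check_only)
# ['docs/source/'] 119 True
```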

Outputs

Output | Type | Description
Restyled files | Files | RST and Python files overwritten with formatted content (when not in check-only mode)
ValueError | Exception | Raised in check-only mode if files need restyling
Console output | stdout | Reports which files were cleaned or how many need restyling
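The difference between the two modes can be sketched as follows. `run_styler` and `restyle_or_check` are hypothetical names, the `restyle` callable stands in for whatever formatter is applied, and the message wording is illustrative rather than copied from the script.

```python
from pathlib import Path


def restyle_or_check(path, restyle, check_only=False) -> bool:
    """Return True if the file differs from its restyled form.

    The file is rewritten only when check_only is False.
    """
    original = Path(path).read_text(encoding="utf-8")
    styled = restyle(original)
    changed = styled != original
    if changed and not check_only:
        Path(path).write_text(styled, encoding="utf-8")
    return changed


def run_styler(paths, restyle, check_only=False):
    """Apply restyle to each path; in check-only mode, fail instead of writing."""
    changed = [p for p in paths if restyle_or_check(p, restyle, check_only)]
    if check_only and changed:
        # CI mode: surface the problem as an exception so the job fails.
        raise ValueError(f"{len(changed)} files should be restyled!")
    for p in changed:
        print(f"Cleaned {p}")
    return changed
```

Raising an exception in check-only mode makes the process exit with a non-zero status, which is what lets a CI job fail on unformatted files.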

Usage Examples

# Auto-format all documentation
python utils/style_doc.py docs/source/ src/transformers/

# Check documentation style in CI
python utils/style_doc.py --check_only docs/source/ src/transformers/
# Raises ValueError if files need formatting

# Programmatic usage for a single docstring (Python)
# Assumes the utils directory containing style_doc.py is on sys.path
from style_doc import style_docstring

raw = """
    Args:
        input_ids (torch.LongTensor): Indices of input sequence tokens in the vocabulary. These are very long descriptions that should be wrapped.
    Returns:
        torch.FloatTensor: The model output.
"""
styled = style_docstring(raw, max_len=119)
