Implementation:Microsoft LoRA Check Copies

From Leeroopedia



Overview

The check_copies.py utility validates that code blocks annotated with # Copied from comments in the Transformers source remain consistent with their original source, and ensures the model list in index.rst matches the README.

Description

This CI script enforces copy consistency across the Transformers codebase. It performs two main checks:

  1. Copy Consistency Check: Scans all Python files under src/transformers/ for lines matching the pattern # Copied from transformers.<module>.<object>. For each match, it locates the original source code, applies any replacement patterns specified (e.g., with ClassName->OtherClassName), and compares the copied code against the original. When --fix_and_overwrite is provided, it auto-corrects divergent copies.
  2. Model List Synchronization: Extracts the model list from README.md, converts it from Markdown to RST format, and verifies it matches the corresponding list in docs/source/index.rst. With --fix_and_overwrite, it rewrites the list in index.rst to match instead of raising an error.
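
The annotation handling in the first check can be sketched as follows. This is a minimal illustration, not the script's code: the helper names parse_copy_comment and apply_replacement are hypothetical, the two regular expressions are assumptions modeled on the annotation format described above, and the replacement is done as a plain string substitution.

```python
import re

# Assumed patterns, modeled on the annotation format described above.
# The regexes used by check_copies.py itself may differ.
COPY_RE = re.compile(r"^(\s*)#\s*Copied from\s+transformers\.(\S+\.\S+)\s*($|\S.*$)")
REPLACE_RE = re.compile(r"^\s*(\S+)->(\S+)(\s+.*|$)")


def parse_copy_comment(line):
    """Return (indent, object_name, trailing_spec), or None if `line` is not an annotation.

    Hypothetical helper for illustration only.
    """
    match = COPY_RE.search(line)
    if match is None:
        return None
    return match.groups()


def apply_replacement(code, trailing_spec):
    """Apply an optional `with Old->New` spec to the original code (plain substitution)."""
    spec = re.sub(r"^with\s+", "", trailing_spec.strip())
    match = REPLACE_RE.search(spec)
    if match is None:
        # No replacement pattern: the copy must match the original verbatim.
        return code
    old, new, _ = match.groups()
    return code.replace(old, new)


annotation = "    # Copied from transformers.models.bert.modeling_bert.BertSelfOutput with Bert->Roberta"
indent, object_name, spec = parse_copy_comment(annotation)
expected_copy = apply_replacement("class BertSelfOutput(nn.Module):\n", spec)
```

The expected copy is what the annotated block would then be compared against, after both sides are normalized.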

Key internal functions include:

  • find_code_in_transformers(object_name): Locates a class or function in the Transformers source by dotted path and returns its source code.
  • blackify(code): Formats code with Black (line length 119, target Python 3.5).
  • is_copy_consistent(filename, overwrite): Checks a single file for copy consistency.
  • convert_to_rst(model_list, max_per_line): Converts Markdown model list entries to RST format with proper link conversion and line wrapping.
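
To illustrate how a lookup such as find_code_in_transformers can locate a definition, the sketch below finds a class or function in a source string by name and uses indentation to decide where it ends. It is a simplified stand-in under stated assumptions: extract_definition is a hypothetical name, it works on an in-memory string rather than on files under src/transformers/, it assumes four-space indentation, and it returns the header line together with the body.

```python
import re


def extract_definition(source, dotted_name):
    """Return the source of the class/function named by e.g. 'Outer.method'.

    Hypothetical helper: assumes four-space indentation per nesting level.
    """
    lines = source.splitlines(keepends=True)
    indent = ""
    line_index = 0
    header_index = 0
    for name in dotted_name.split("."):
        pattern = re.compile(rf"^{indent}(class|def)\s+{re.escape(name)}(\(|:)")
        # Scan forward until the header for this part of the dotted name is found.
        while line_index < len(lines) and pattern.search(lines[line_index]) is None:
            line_index += 1
        if line_index >= len(lines):
            raise ValueError(f"{dotted_name} does not match any class or function")
        header_index = line_index
        indent += "    "
        line_index += 1

    # The definition continues while lines are blank or indented deeper than the header.
    while line_index < len(lines) and (
        lines[line_index].startswith(indent) or lines[line_index].strip() == ""
    ):
        line_index += 1
    # Drop trailing blank lines that belong to the gap before the next definition.
    while line_index > header_index + 1 and lines[line_index - 1].strip() == "":
        line_index -= 1
    return "".join(lines[header_index:line_index])


SAMPLE = (
    "class Outer:\n"
    "    def first(self):\n"
    "        return 1\n"
    "\n"
    "    def second(self):\n"
    "        return 2\n"
    "\n"
    "\n"
    "def helper():\n"
    "    return 3\n"
)

method_source = extract_definition(SAMPLE, "Outer.first")
```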

Usage

Use this utility when:

  • Running CI checks to ensure annotated code copies remain in sync with their originals.
  • Verifying that the model list in documentation matches the README after adding or updating models.
  • Auto-fixing copy drift using the --fix_and_overwrite flag or make fix-copies.

Code Reference

Source Location

examples/NLU/utils/check_copies.py (324 lines)

Signature

from typing import Optional

def find_code_in_transformers(object_name: str) -> str: ...
def blackify(code: str) -> str: ...
def get_indent(code: str) -> str: ...
def is_copy_consistent(filename: str, overwrite: bool = False) -> list: ...
def check_copies(overwrite: bool = False) -> None: ...
def get_model_list() -> str: ...
def split_long_line_with_indent(line: str, max_per_line: int, indent: int) -> str: ...
def convert_to_rst(model_list: str, max_per_line: Optional[int] = None) -> str: ...
def check_model_list_copy(overwrite: bool = False, max_per_line: int = 119) -> None: ...
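
The Markdown-to-RST step behind convert_to_rst and split_long_line_with_indent can be approximated as shown here. This is a sketch under assumptions, not the script's implementation: md_links_to_rst and wrap_entry are hypothetical names, model list entries are assumed to use bold links of the form **[Name](url)**, and wrapping is delegated to the standard textwrap module.

```python
import re
import textwrap

# Assumption: README entries use bold Markdown links, e.g. **[BERT](https://...)**.
BOLD_MD_LINK = re.compile(r"\*\*\[([^\]]*)\]\(([^)]*)\)\*\*")


def md_links_to_rst(text):
    """Rewrite **[title](url)** as an anonymous RST hyperlink `title <url>`__."""
    return BOLD_MD_LINK.sub(lambda m: f"`{m.group(1)} <{m.group(2)}>`__", text)


def wrap_entry(line, max_per_line, indent):
    """Wrap one entry, indenting continuation lines by `indent` spaces."""
    return textwrap.fill(
        line,
        width=max_per_line,
        subsequent_indent=" " * indent,
        break_long_words=False,   # never split a URL in the middle
        break_on_hyphens=False,   # hyphens inside URLs are not break points
    )


entry = "1. **[BERT](https://example.com/bert)** released with the paper by some authors"
rst_entry = md_links_to_rst(entry)
wrapped = wrap_entry(rst_entry, max_per_line=40, indent=3)
```

Because long words are never broken, a single token longer than max_per_line would still overflow the limit in this sketch.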

Import / CLI Usage

# Run from the root of the Transformers checkout (examples/NLU in the LoRA repository)
python utils/check_copies.py

# Auto-fix inconsistencies
python utils/check_copies.py --fix_and_overwrite

# Or via Makefile
make fix-copies

I/O Contract

Inputs

| Input | Type | Description |
|---|---|---|
| --fix_and_overwrite | CLI flag | When set, overwrites inconsistent copies instead of raising an error |
| src/transformers/**/*.py | Files | All Python files scanned for # Copied from annotations |
| README.md | File | Source of the canonical model list in Markdown format |
| docs/source/index.rst | File | RST documentation file that should mirror the README model list |

Outputs

| Output | Type | Description |
|---|---|---|
| Exception (check mode) | Exception | Raised with details of all copy inconsistencies found |
| Overwritten files (fix mode) | Files | Python files and index.rst updated to match their originals |
| Console output | stdout | Messages indicating which files were rewritten |
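
The contract in these two tables (raise in check mode, rewrite and report in fix mode) can be sketched with a toy function. Everything here is illustrative: sync_files is a hypothetical name, the expected contents are passed in directly rather than derived from # Copied from annotations, and the message wording is an assumption.

```python
from pathlib import Path


def sync_files(expected, overwrite=False):
    """Compare files against their expected contents.

    `expected` maps a file path to the text that file should contain.
    Check mode (overwrite=False): raise one Exception listing every stale file.
    Fix mode (overwrite=True): rewrite stale files and report each on stdout.
    Returns the list of stale paths that were rewritten.
    """
    stale = [
        path
        for path, text in expected.items()
        if Path(path).read_text(encoding="utf-8") != text
    ]
    if not stale:
        return []
    if not overwrite:
        listing = "\n".join(f"- {path}" for path in stale)
        # Hypothetical wording; the real script's message may differ.
        raise Exception("Found the following copy inconsistencies:\n" + listing)
    for path in stale:
        print(f"Rewriting {path}")
        Path(path).write_text(expected[path], encoding="utf-8")
    return stale
```

Collecting every inconsistency before raising means a single CI run reports all stale files at once instead of stopping at the first one.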

Usage Examples

# Check for copy consistency (CI mode, raises on failure)
python utils/check_copies.py

# Auto-fix all copy inconsistencies
python utils/check_copies.py --fix_and_overwrite

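# Programmatic usage (run from the same directory as the CLI commands above)
import sys

sys.path.insert(0, "utils")  # check_copies.py is a standalone script, so make utils/ importable

from check_copies import is_copy_consistent

# Each entry pairs the name of the original object with the line where its copy starts
diffs = is_copy_consistent("src/transformers/models/bert/modeling_bert.py")
for object_name, line_number in diffs:
    print(f"Mismatch: {object_name} at line {line_number}")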
