
Implementation:Mlflow Mlflow Clint Symbol Index

From Leeroopedia
Revision as of 13:17, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Mlflow_Mlflow_Clint_Symbol_Index.md)
Knowledge Sources
Domains: Static Analysis, Code Linting
Last Updated: 2026-02-13 20:00 GMT

Overview

Symbol indexing module for the Clint custom linter that builds and maintains an index of all MLflow Python functions and classes, enabling cross-module symbol resolution through import chain traversal.

Description

This module provides efficient indexing and lookup of Python symbols (functions, classes) across the entire MLflow codebase using AST parsing and parallel processing. It is a core component of the Clint custom linter, enabling lint rules that require cross-module knowledge of function signatures.

Key Components:

FunctionInfo is a lightweight dataclass that stores function signature information:

  • has_vararg - Whether the function accepts *args
  • has_kwarg - Whether the function accepts **kwargs
  • args - List of regular argument names
  • kwonlyargs - List of keyword-only argument names
  • posonlyargs - List of positional-only argument names
  • from_func_def() - Class method that constructs a FunctionInfo from an AST FunctionDef or AsyncFunctionDef node, with an option to skip the self parameter for methods
  • all_args - Property that returns all argument names combined
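
As a rough illustration of how these fields relate to Python's ast module, the sketch below rebuilds a simplified version of the dataclass. It is not the Clint source: the name FunctionInfoSketch, the way skip_self drops the first positional parameter, and the ordering used by all_args are assumptions made for this example.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class FunctionInfoSketch:
    """Simplified stand-in for FunctionInfo (hypothetical, not the Clint source)."""

    has_vararg: bool
    has_kwarg: bool
    args: list[str] = field(default_factory=list)
    kwonlyargs: list[str] = field(default_factory=list)
    posonlyargs: list[str] = field(default_factory=list)

    @classmethod
    def from_func_def(cls, node, skip_self=False):
        a = node.args
        posonly = [arg.arg for arg in a.posonlyargs]
        regular = [arg.arg for arg in a.args]
        if skip_self:
            # Assumption: drop the first positional parameter (self or cls).
            if posonly:
                posonly = posonly[1:]
            elif regular:
                regular = regular[1:]
        return cls(
            has_vararg=a.vararg is not None,
            has_kwarg=a.kwarg is not None,
            args=regular,
            kwonlyargs=[arg.arg for arg in a.kwonlyargs],
            posonlyargs=posonly,
        )

    @property
    def all_args(self):
        # Assumption: positional-only first, then regular, then keyword-only.
        return self.posonlyargs + self.args + self.kwonlyargs


# A made-up method signature that exercises every parameter kind.
source = "def log(self, key, /, value, *extra, step=None, **tags): ..."
func_node = ast.parse(source).body[0]
info = FunctionInfoSketch.from_func_def(func_node, skip_self=True)
print(info.all_args)
```

With skip_self=True the leading self is removed, so only the parameters a caller can actually supply remain in the lists.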

ModuleSymbolExtractor is an AST NodeVisitor that extracts two kinds of information from a Python module:

  • import_mapping - Maps re-exported names to their original fully-qualified names (e.g., mlflow.log_metric -> mlflow.tracking.fluent.log_metric)
  • func_mapping - Maps fully-qualified function/class names to their FunctionInfo signatures

For classes, the extractor records the __init__ signature and any @classmethod or @staticmethod methods. A class without an __init__ is recorded as accepting *args and **kwargs.
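
The two mappings can be pictured with a much smaller visitor. The sketch below is hypothetical and deliberately incomplete: ExtractorSketch is not the Clint class, it stores plain parameter-name lists instead of FunctionInfo objects, it handles only absolute from-imports, and it ignores async functions, @classmethod and @staticmethod.

```python
import ast


class ExtractorSketch(ast.NodeVisitor):
    """Hypothetical, simplified visitor; not the Clint ModuleSymbolExtractor."""

    def __init__(self, mod: str) -> None:
        self.mod = mod
        self.import_mapping: dict[str, str] = {}
        self.func_mapping: dict[str, list[str]] = {}

    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
        # Only absolute imports are handled in this sketch.
        if node.level == 0 and node.module:
            for alias in node.names:
                local = alias.asname or alias.name
                self.import_mapping[f"{self.mod}.{local}"] = f"{node.module}.{alias.name}"

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        # Record parameter names only; nested definitions are not visited.
        self.func_mapping[f"{self.mod}.{node.name}"] = [a.arg for a in node.args.args]

    def visit_ClassDef(self, node: ast.ClassDef) -> None:
        for stmt in node.body:
            if isinstance(stmt, ast.FunctionDef) and stmt.name == "__init__":
                # Skip `self` so the entry describes how the class is called.
                self.func_mapping[f"{self.mod}.{node.name}"] = [
                    a.arg for a in stmt.args.args[1:]
                ]
                return
        # No __init__ found: mark the class as accepting anything.
        self.func_mapping[f"{self.mod}.{node.name}"] = ["*args", "**kwargs"]


# Made-up module body, treated as if it were the source of the `mlflow` package.
source = '''
from mlflow.tracking.fluent import log_metric

def helper(a, b): ...

class WithInit:
    def __init__(self, x, y): ...

class WithoutInit:
    pass
'''
extractor = ExtractorSketch("mlflow")
extractor.visit(ast.parse(source))
print(extractor.import_mapping)
print(extractor.func_mapping)
```

Because visit_ClassDef does not recurse, methods are never added as top-level functions, which keeps the keys limited to names a caller could reference as module attributes.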

extract_symbols_from_file() is a standalone function that parses a single file and returns its import and function mappings. It converts file paths to module names (e.g., mlflow/tracking/fluent.py -> mlflow.tracking.fluent).
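
The path-to-module conversion might look something like the sketch below. The helper name path_to_module is hypothetical, and the treatment of __init__.py (mapping it to the package name) is an assumption rather than something the page states.

```python
from pathlib import PurePosixPath


def path_to_module(rel_path: str) -> str:
    """Hypothetical helper: turn a repo-relative .py path into a dotted module name."""
    parts = list(PurePosixPath(rel_path).with_suffix("").parts)
    # Assumption: a package's __init__.py maps to the package name itself.
    if parts and parts[-1] == "__init__":
        parts.pop()
    return ".".join(parts)


print(path_to_module("mlflow/tracking/fluent.py"))
```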

SymbolIndex is the main index class. Its methods are:

  • build() - Constructs the index by listing the tracked mlflow/*.py files with git ls-files and processing them in parallel with ProcessPoolExecutor
  • resolve() - Resolves a fully-qualified symbol name to its FunctionInfo by first checking the direct function mappings, then following import chains, with detection of circular imports
  • save() / load() - Pickle serialization, so the index can be shared efficiently between worker processes
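
The lookup order described for resolve() (direct mapping first, then the import chain, stopping on a cycle) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Clint implementation: resolve_sketch is a hypothetical function, it only follows exact-name re-exports, and the mapping contents are made up.

```python
def resolve_sketch(target, import_mapping, func_mapping):
    """Hypothetical resolver: follow re-exports until a known function is found."""
    seen = set()
    current = target
    while current not in seen:
        seen.add(current)
        if current in func_mapping:
            # Direct hit: the name is defined as a function or class.
            return func_mapping[current]
        if current not in import_mapping:
            # Neither defined nor re-exported: unknown symbol.
            return None
        # Follow one link of the import chain.
        current = import_mapping[current]
    # A name was visited twice, so the import chain is circular.
    return None


# Illustrative data only; the string stands in for a FunctionInfo object.
import_mapping = {
    "mlflow.log_metric": "mlflow.tracking.fluent.log_metric",
    "pkg.a": "pkg.b",
    "pkg.b": "pkg.a",
}
func_mapping = {"mlflow.tracking.fluent.log_metric": "<signature placeholder>"}

print(resolve_sketch("mlflow.log_metric", import_mapping, func_mapping))
print(resolve_sketch("pkg.a", import_mapping, func_mapping))
```

Tracking visited names in a set keeps the loop bounded even when two modules re-export each other.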

Usage

The SymbolIndex is used by Clint lint rules such as unknown-mlflow-function and unknown-mlflow-arguments that need to verify whether a function exists in the MLflow API and whether the correct arguments are being passed. It is built once and shared across all rule checks.
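
To make the argument check concrete, here is a minimal sketch of how a rule could compare the keywords in a call against a resolved signature. The helper unknown_keywords is hypothetical, and the allowed parameter names are illustrative values rather than ones read from MLflow; the actual Clint rules may work differently.

```python
import ast


def unknown_keywords(call: ast.Call, allowed: set[str], has_kwarg: bool) -> list[str]:
    """Hypothetical check: keyword names in `call` that the signature does not accept."""
    if has_kwarg:
        # A **kwargs parameter accepts any keyword, so nothing can be flagged.
        return []
    return [
        kw.arg
        for kw in call.keywords
        # kw.arg is None for a `**mapping` argument, which cannot be checked statically.
        if kw.arg is not None and kw.arg not in allowed
    ]


# `valu` is a deliberate typo; the allowed set is made up for this example.
call_node = ast.parse("mlflow.log_metric(key='loss', valu=0.1)").body[0].value
flagged = unknown_keywords(call_node, allowed={"key", "value", "step"}, has_kwarg=False)
print(flagged)
```

In a real rule, the allowed names and the has_kwarg flag would come from the FunctionInfo returned by resolve() rather than being hard-coded.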

Code Reference

Source Location

Signature

@dataclass
class FunctionInfo:
    has_vararg: bool
    has_kwarg: bool
    args: list[str] = field(default_factory=list)
    kwonlyargs: list[str] = field(default_factory=list)
    posonlyargs: list[str] = field(default_factory=list)

    @classmethod
    def from_func_def(
        cls, node: ast.FunctionDef | ast.AsyncFunctionDef, skip_self: bool = False
    ) -> Self: ...

    @property
    def all_args(self) -> list[str]: ...

class ModuleSymbolExtractor(ast.NodeVisitor):
    def __init__(self, mod: str) -> None: ...
    def visit_Import(self, node: ast.Import) -> None: ...
    def visit_ImportFrom(self, node: ast.ImportFrom) -> None: ...
    def visit_FunctionDef(self, node: ast.FunctionDef) -> None: ...
    def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> None: ...
    def visit_ClassDef(self, node: ast.ClassDef) -> None: ...

def extract_symbols_from_file(
    rel_path: str, content: str
) -> tuple[dict[str, str], dict[str, FunctionInfo]] | None: ...

class SymbolIndex:
    def __init__(self, import_mapping: dict[str, str], func_mapping: dict[str, FunctionInfo]) -> None: ...
    def save(self, path: Path) -> None: ...
    @classmethod
    def load(cls, path: Path) -> Self: ...
    @classmethod
    def build(cls) -> Self: ...
    def resolve(self, target: str) -> FunctionInfo | None: ...

Import

from clint.index import SymbolIndex, FunctionInfo
from clint.index import ModuleSymbolExtractor, extract_symbols_from_file

I/O Contract

Inputs

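Name | Type | Required | Description
rel_path | str | Yes | Relative file path from repository root (e.g., "mlflow/tracking/fluent.py"); argument of extract_symbols_from_file()
content | str | Yes | Python source code of the file to parse; argument of extract_symbols_from_file()
target | str | Yes | Fully-qualified symbol name to resolve (e.g., "mlflow.log_metric"); argument of SymbolIndex.resolve()
path | Path | Yes | File path of the pickled index; argument of SymbolIndex.save() and SymbolIndex.load()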

Outputs

Name | Type | Description
FunctionInfo | dataclass | Lightweight function signature with argument lists and vararg/kwarg flags
SymbolIndex | class | Complete index of all MLflow symbols with resolution capabilities
import_mapping | dict[str, str] | Mapping from re-exported names to their original fully-qualified module paths
func_mapping | dict[str, FunctionInfo] | Mapping from fully-qualified function names to their signature information

Usage Examples

Building and Using the Symbol Index

from clint.index import SymbolIndex

# Build an index of all MLflow symbols
index = SymbolIndex.build()

# Resolve a function's signature; resolve() returns None for unknown symbols
func_info = index.resolve("mlflow.log_metric")
if func_info is not None:
    print(f"Arguments: {func_info.args}")
    print(f"Has **kwargs: {func_info.has_kwarg}")

Saving and Loading the Index

from pathlib import Path
from clint.index import SymbolIndex

# Build and save for later use
index = SymbolIndex.build()
index.save(Path("/tmp/mlflow_symbol_index.pkl"))

# Load from cache
cached_index = SymbolIndex.load(Path("/tmp/mlflow_symbol_index.pkl"))

Extracting Symbols from a Single File

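from pathlib import Path

from clint.index import extract_symbols_from_file

# read_text() opens and closes the file, so no handle is left open
rel_path = "mlflow/tracking/fluent.py"
source_code = Path(rel_path).read_text(encoding="utf-8")
result = extract_symbols_from_file(rel_path, source_code)
# The signature allows a None result, so check before unpacking
if result is not None:
    imports, functions = result
    for name, info in functions.items():
        print(f"{name}: args={info.args}")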
