Implementation:Mlflow Mlflow GraphQL Schema Autogeneration

Knowledge Sources	Mlflow_Mlflow
Domains	Code Generation, GraphQL
Last Updated	2026-02-13 20:00 GMT

Overview

Core code generation engine that transforms collected protobuf type definitions into a Python Graphene GraphQL schema, with support for manual schema extensions via AST-based class inheritance detection.

Description

This module is the heart of the MLflow proto-to-GraphQL code generation pipeline. It takes a GenerateSchemaState object (containing collected protobuf types, enums, inputs, outputs, queries, and mutations) and produces a complete Python source file defining the Graphene GraphQL schema used by the MLflow server.

Type Mapping: The PROTO_TO_GRAPHENE_TYPE dictionary maps protobuf field descriptor types to their Graphene equivalents:

TYPE_BOOL -> graphene.Boolean
TYPE_FLOAT / TYPE_DOUBLE -> graphene.Float
TYPE_INT32 / TYPE_UINT32 / TYPE_SINT32 / TYPE_FIXED32 / TYPE_SFIXED32 -> graphene.Int
TYPE_INT64 / TYPE_UINT64 / TYPE_SINT64 / TYPE_FIXED64 / TYPE_SFIXED64 -> LongString (custom scalar)
TYPE_STRING / TYPE_BYTES -> graphene.String
TYPE_ENUM -> graphene.Enum

Manual Schema Extensions: The ClassInheritanceVisitor AST visitor parses graphql_schema_extensions.py to build an inheritance map (EXTENDED_TO_EXTENDING) from base class names to their extension class names. This allows manual overrides of autogenerated schema classes. The apply_schema_extension() function checks this map and returns the extension class reference (using dotted module path notation) when applicable.
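The inheritance-map idea can be sketched with the standard ast module. This is a minimal illustration under stated assumptions, not the module's actual code: the names InheritanceMapVisitor, build_inheritance_map, and resolve_reference, the sample source, the class names in it, and the module path example_pkg.extensions are all hypothetical.

```python
import ast

# Hypothetical stand-in for the contents of an extensions file.
SAMPLE_EXTENSIONS_SOURCE = '''
class ExtendedExperiment(MlflowExperiment):
    pass

class Unrelated:
    pass
'''


class InheritanceMapVisitor(ast.NodeVisitor):
    """Collects {base class name: extending class name} from class definitions."""

    def __init__(self):
        self.inheritance_map = {}

    def visit_ClassDef(self, node):
        for base in node.bases:
            # Only plain names (e.g. `Base`) are recorded; dotted bases such
            # as `module.Base` are ast.Attribute nodes and are skipped here.
            if isinstance(base, ast.Name):
                self.inheritance_map[base.id] = node.name
        self.generic_visit(node)


def build_inheritance_map(source):
    """Parse source text and return the base-to-extension map."""
    visitor = InheritanceMapVisitor()
    visitor.visit(ast.parse(source))
    return visitor.inheritance_map


def resolve_reference(class_name, inheritance_map, module_path="example_pkg.extensions"):
    """Return the dotted path of the extending class, or the name unchanged."""
    if class_name in inheritance_map:
        return f"{module_path}.{inheritance_map[class_name]}"
    return class_name


extension_map = build_inheritance_map(SAMPLE_EXTENSIONS_SOURCE)
print(extension_map)
print(resolve_reference("MlflowExperiment", extension_map))
print(resolve_reference("MlflowRun", extension_map))
```

Because the source is parsed rather than imported, a map like this can be built without executing the extensions file or requiring its dependencies to be importable at generation time.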

Schema Generation (generate_schema): The function builds the complete schema source code by performing the following steps:

Generating the file header with imports (graphene, mlflow, custom scalars, error types, proto JSON utils)
Generating graphene.Enum classes for each protobuf enum, with 1-indexed values
Generating graphene.ObjectType classes for each protobuf message type, with fields mapped to Graphene types; output types get an additional apiError field
Generating graphene.InputObjectType classes for input types (with "Input" suffix)
Generating QueryType(graphene.ObjectType) with query fields and resolver methods
Generating MutationType(graphene.ObjectType) with mutation fields and resolver methods
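As an illustration of the enum step, the following sketch renders a graphene.Enum class definition as source text with 1-indexed values. The helper render_enum_class, the INDENT constant, and the enum name and values are hypothetical and chosen only for this example; the real generator works from protobuf enum descriptors.

```python
INDENT = "    "  # four spaces, an assumed indentation unit


def render_enum_class(class_name, value_names):
    """Render a graphene.Enum class definition as source text (1-indexed values)."""
    lines = [f"class {class_name}(graphene.Enum):"]
    for index, value_name in enumerate(value_names, start=1):
        lines.append(f"{INDENT}{value_name} = {index}")
    return "\n".join(lines)


# Hypothetical enum name and values, for illustration only.
enum_source = render_enum_class("ExampleViewType", ["ACTIVE_ONLY", "DELETED_ONLY", "ALL"])
print(enum_source)
```

The output is plain text intended to be written into the generated schema file, so graphene itself is not needed to run the sketch.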

Field Type Resolution (get_graphene_type_for_field): Handles three categories of fields:

Enums - Wrapped in graphene.Field() for singular fields, or in graphene.List(graphene.NonNull()) for repeated fields
Messages/Groups - Wrapped in graphene.Field() (output) or graphene.InputField() (input) for singular fields, or in graphene.List() for repeated fields
Scalars - Looked up in PROTO_TO_GRAPHENE_TYPE and wrapped in graphene.List() for repeated fields
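The three categories can be sketched as a single dispatch function. This is a simplified illustration, not the actual get_graphene_type_for_field: FieldSketch, SCALAR_TYPES, and graphene_type_for are hypothetical names, the field is described by plain strings instead of a protobuf FieldDescriptor, and the exact wrapper strings (for example, whether repeated messages use NonNull) are assumptions.

```python
from dataclasses import dataclass

# A small subset of scalar mappings, keyed by a simplified type name
# rather than real protobuf descriptor constants.
SCALAR_TYPES = {
    "bool": "graphene.Boolean",
    "int32": "graphene.Int",
    "int64": "LongString",
    "string": "graphene.String",
}


@dataclass
class FieldSketch:
    """Hypothetical, simplified stand-in for a protobuf field descriptor."""
    kind: str            # "enum", "message", or a key of SCALAR_TYPES
    type_name: str = ""  # referenced class name for enums and messages
    repeated: bool = False


def graphene_type_for(field, is_input):
    """Return a Graphene type expression (as a string) for the given field."""
    if field.kind == "enum":
        if field.repeated:
            return f"graphene.List(graphene.NonNull({field.type_name}))"
        return f"graphene.Field({field.type_name})"
    if field.kind == "message":
        # Input types carry an "Input" suffix in the generated schema.
        name = field.type_name + "Input" if is_input else field.type_name
        if field.repeated:
            return f"graphene.List(graphene.NonNull({name}))"
        wrapper = "graphene.InputField" if is_input else "graphene.Field"
        return f"{wrapper}({name})"
    scalar = SCALAR_TYPES[field.kind]
    return f"graphene.List({scalar})" if field.repeated else f"{scalar}()"


print(graphene_type_for(FieldSketch("string"), is_input=False))
print(graphene_type_for(FieldSketch("message", "MlflowRun"), is_input=True))
```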

Resolver Generation (generate_resolver_function): Each query and mutation gets a resolver method that:

Extracts input arguments from the GraphQL input parameter
Creates a protobuf request message from the generated pb2 module
Calls parse_dict() to populate the message from the input dictionary
Delegates to the corresponding mlflow.server.handlers implementation function
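A resolver generator of this kind is essentially a string template. The sketch below renders one resolver method as source text following the four steps above; render_resolver and its parameters are hypothetical, and the real generate_resolver_function derives these names from a protobuf method descriptor rather than taking them as arguments.

```python
import ast
import textwrap

INDENT = "    "  # four spaces, an assumed indentation unit


def render_resolver(field_name, request_class_path, handler_name):
    """Render a resolver method as source text, indented for a class body."""
    body = [
        f"def resolve_{field_name}(self, info, input):",
        f"{INDENT}input_dict = vars(input)",
        f"{INDENT}request_message = {request_class_path}()",
        f"{INDENT}parse_dict(input_dict, request_message)",
        f"{INDENT}return mlflow.server.handlers.{handler_name}(request_message)",
    ]
    return "\n".join(INDENT + line for line in body)


resolver_source = render_resolver(
    "mlflowGetExperiment",
    "mlflow.protos.service_pb2.GetExperiment",
    "get_experiment_impl",
)
print(resolver_source)

# Sanity check: the rendered text parses as Python once dedented.
ast.parse(textwrap.dedent(resolver_source))
```

Parsing the rendered text with ast.parse is a cheap way to catch template mistakes before the generated file is written.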

Usage

This module is invoked by dev/proto_to_graphql/code_generator.py as part of the build process. The generated output is written to mlflow/server/graphql/autogenerated_graphql_schema.py. To regenerate the schema after protobuf changes, run: uv run ./dev/proto_to_graphql/code_generator.py

Code Reference

Source Location

Repository: Mlflow_Mlflow
File: dev/proto_to_graphql/schema_autogeneration.py
Lines: 1-225

Signature

PROTO_TO_GRAPHENE_TYPE: dict[int, str]

class ClassInheritanceVisitor(ast.NodeVisitor):
    def __init__(self): ...
    inheritance_map: dict[str, str]
    def visit_ClassDef(self, node): ...

def get_manual_extensions() -> dict[str, str]: ...

EXTENDED_TO_EXTENDING: dict[str, str]

def generate_schema(state) -> str: ...
def apply_schema_extension(referenced_class_name: str) -> str: ...
def get_graphene_type_for_field(field, is_input: bool) -> str: ...
def proto_method_to_graphql_operation(method) -> str: ...
def generate_resolver_function(method) -> str: ...

Import

# Used internally by the code generator pipeline
from schema_autogeneration import generate_schema
from schema_autogeneration import PROTO_TO_GRAPHENE_TYPE

I/O Contract

Inputs

Name	Type	Required	Description
state	GenerateSchemaState	Yes	Collected protobuf definitions containing types, enums, inputs, outputs, queries, and mutations
field	FieldDescriptor	Yes	Protobuf field descriptor (for get_graphene_type_for_field)
is_input	bool	Yes	Whether the field belongs to an input type (for get_graphene_type_for_field)
method	MethodDescriptor	Yes	Protobuf service method descriptor (for resolver/operation generation)
referenced_class_name	str	Yes	Class name to check for manual extensions (for apply_schema_extension)

Outputs

Name	Type	Description
schema source	str	Complete Python source code string defining the Graphene GraphQL schema
Graphene type string	str	A Graphene type expression string for a protobuf field (e.g., "graphene.String()", "graphene.List(graphene.NonNull(SomeType))")
Resolver function string	str	Python source code for a GraphQL resolver method

Usage Examples

Generating the Schema

from schema_autogeneration import generate_schema

# state is a GenerateSchemaState populated by the code generator
schema_source = generate_schema(state)

# Write to the autogenerated schema file
with open("mlflow/server/graphql/autogenerated_graphql_schema.py", "w") as f:
    f.write(schema_source)

Generated Resolver Example

# Example of what generate_resolver_function produces:
def resolve_mlflowGetExperiment(self, info, input):
    input_dict = vars(input)
    request_message = mlflow.protos.service_pb2.GetExperiment()
    parse_dict(input_dict, request_message)
    return mlflow.server.handlers.get_experiment_impl(request_message)

Type Mapping Example

from schema_autogeneration import PROTO_TO_GRAPHENE_TYPE
from google.protobuf.descriptor import FieldDescriptor

# Look up the Graphene type name for a protobuf field type
graphene_type = PROTO_TO_GRAPHENE_TYPE[FieldDescriptor.TYPE_STRING]
# graphene_type is now "graphene.String"

graphene_type = PROTO_TO_GRAPHENE_TYPE[FieldDescriptor.TYPE_INT64]
# graphene_type is now "LongString" (custom scalar for 64-bit integers)

Related Pages

Environment:Mlflow_Mlflow_Python_Runtime_Environment
