Implementation:Mlflow Mlflow GraphQL Schema Autogeneration
| Knowledge Sources | |
|---|---|
| Domains | Code Generation, GraphQL |
| Last Updated | 2026-02-13 20:00 GMT |
Overview
Core code generation engine that transforms collected protobuf type definitions into a Python Graphene GraphQL schema, with support for manual schema extensions via AST-based class inheritance detection.
Description
This module is the heart of the MLflow proto-to-GraphQL code generation pipeline. It takes a GenerateSchemaState object (containing collected protobuf types, enums, inputs, outputs, queries, and mutations) and produces a complete Python source file defining the Graphene GraphQL schema used by the MLflow server.
Type Mapping: The PROTO_TO_GRAPHENE_TYPE dictionary maps protobuf field descriptor types to their Graphene equivalents:
- TYPE_BOOL -> graphene.Boolean
- TYPE_FLOAT / TYPE_DOUBLE -> graphene.Float
- TYPE_INT32 / TYPE_UINT32 / TYPE_SINT32 / TYPE_FIXED32 / TYPE_SFIXED32 -> graphene.Int
- TYPE_INT64 / TYPE_UINT64 / TYPE_SINT64 / TYPE_FIXED64 / TYPE_SFIXED64 -> LongString (custom scalar)
- TYPE_STRING / TYPE_BYTES -> graphene.String
- TYPE_ENUM -> graphene.Enum
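The mapping above can be sketched as a plain lookup table. This is a minimal, dependency-free illustration rather than the module's actual definition: `ProtoType` is a hypothetical stand-in for the `FieldDescriptor.TYPE_*` constants (its integer values follow the protobuf `FieldDescriptorProto.Type` numbering), and `PROTO_TO_GRAPHENE_TYPE_SKETCH` and `_GROUPS` are names invented for this example.

```python
from enum import IntEnum


class ProtoType(IntEnum):
    """Hypothetical stand-in for google.protobuf FieldDescriptor.TYPE_* constants."""

    TYPE_DOUBLE = 1
    TYPE_FLOAT = 2
    TYPE_INT64 = 3
    TYPE_UINT64 = 4
    TYPE_INT32 = 5
    TYPE_FIXED64 = 6
    TYPE_FIXED32 = 7
    TYPE_BOOL = 8
    TYPE_STRING = 9
    TYPE_BYTES = 12
    TYPE_UINT32 = 13
    TYPE_ENUM = 14
    TYPE_SFIXED32 = 15
    TYPE_SFIXED64 = 16
    TYPE_SINT32 = 17
    TYPE_SINT64 = 18


# Group proto types by their Graphene target, mirroring the list above.
_GROUPS = {
    "graphene.Boolean": ["TYPE_BOOL"],
    "graphene.Float": ["TYPE_FLOAT", "TYPE_DOUBLE"],
    "graphene.Int": [
        "TYPE_INT32", "TYPE_UINT32", "TYPE_SINT32", "TYPE_FIXED32", "TYPE_SFIXED32",
    ],
    "LongString": [
        "TYPE_INT64", "TYPE_UINT64", "TYPE_SINT64", "TYPE_FIXED64", "TYPE_SFIXED64",
    ],
    "graphene.String": ["TYPE_STRING", "TYPE_BYTES"],
    "graphene.Enum": ["TYPE_ENUM"],
}

# Flatten the groups into a {proto type: Graphene type name} dictionary.
PROTO_TO_GRAPHENE_TYPE_SKETCH = {
    ProtoType[name]: target for target, names in _GROUPS.items() for name in names
}
```

Note that the values are strings, not Graphene classes, because they are spliced into generated source code. Routing 64-bit integers to a string-based scalar is consistent with GraphQL's built-in Int being limited to signed 32-bit values.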
Manual Schema Extensions: The ClassInheritanceVisitor AST visitor parses graphql_schema_extensions.py to build an inheritance map (EXTENDED_TO_EXTENDING) from base class names to their extension class names. This allows manual overrides of autogenerated schema classes. The apply_schema_extension() function checks this map and returns the extension class reference (using dotted module path notation) when applicable.
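The extension lookup described above can be illustrated with the standard-library `ast` module. The following is a sketch under stated assumptions, not the module's actual code: the sample source, the class names in it, the `EXTENSION_MODULE` path, and the `*_sketch` function names are all hypothetical.

```python
import ast

# Assumed dotted module path of the manual extensions file (illustrative only).
EXTENSION_MODULE = "mlflow.server.graphql.graphql_schema_extensions"

# Hypothetical contents of an extensions file: one class extends an
# autogenerated class, the other extends nothing.
SAMPLE_EXTENSIONS_SOURCE = '''
class MlflowExperimentExtension(MlflowExperiment):
    pass

class Unrelated:
    pass
'''


class InheritanceVisitorSketch(ast.NodeVisitor):
    """Records {base class name: extending class name} for every class definition."""

    def __init__(self):
        self.inheritance_map = {}

    def visit_ClassDef(self, node):
        for base in node.bases:
            # Only simple names are recorded; dotted bases such as
            # graphene.ObjectType are ast.Attribute nodes and are skipped.
            if isinstance(base, ast.Name):
                self.inheritance_map[base.id] = node.name
        self.generic_visit(node)


def build_extension_map(source):
    """Parse Python source text and return its inheritance map."""
    visitor = InheritanceVisitorSketch()
    visitor.visit(ast.parse(source))
    return visitor.inheritance_map


def apply_extension_sketch(class_name, extension_map):
    """Return the dotted extension reference if one exists, else the original name."""
    if class_name in extension_map:
        return f"{EXTENSION_MODULE}.{extension_map[class_name]}"
    return class_name
```

Because the file is parsed rather than imported, the map can be built without executing the extension module or importing the classes it extends.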
Schema Generation (generate_schema): The generate_schema() function builds the complete schema source code by:
- Generating file header with imports (graphene, mlflow, custom scalars, error types, proto JSON utils)
- Generating graphene.Enum classes for each protobuf enum, with 1-indexed values
- Generating graphene.ObjectType classes for each protobuf message type, with fields mapped to Graphene types; output types get an additional apiError field
- Generating graphene.InputObjectType classes for input types (with "Input" suffix)
- Generating QueryType(graphene.ObjectType) with query fields and resolver methods
- Generating MutationType(graphene.ObjectType) with mutation fields and resolver methods
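As an illustration of the enum step in the list above, the sketch below renders a `graphene.Enum` class definition with values numbered from 1. The function name `render_enum_class` and the example enum name are hypothetical; the real generator works from protobuf enum descriptors rather than plain lists of names.

```python
INDENT = " " * 4


def render_enum_class(class_name, value_names):
    """Return source text for a graphene.Enum whose members are numbered from 1."""
    lines = [f"class {class_name}(graphene.Enum):"]
    for index, name in enumerate(value_names, start=1):
        lines.append(f"{INDENT}{name} = {index}")
    return "\n".join(lines)


# Hypothetical enum name and members, used only to show the output shape.
print(render_enum_class("MlflowViewType", ["ACTIVE_ONLY", "DELETED_ONLY", "ALL"]))
```

The function only builds a string, so `graphene` does not need to be installed to run it; the import appears in the header of the generated file instead.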
Field Type Resolution (get_graphene_type_for_field): Handles three categories of fields:
- Enums - Wrapped in graphene.Field() for singular fields, or in graphene.List(graphene.NonNull(...)) for repeated fields
- Messages/Groups - Wrapped in graphene.Field() (output) or graphene.InputField() (input), with graphene.List() for repeated fields
- Scalars - Looked up from PROTO_TO_GRAPHENE_TYPE with graphene.List() for repeated fields
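The three categories above can be sketched as a small dispatch function. This is a simplified illustration under assumptions: `FieldSketch` is a hypothetical stand-in for a protobuf field descriptor, and the exact wrapping of repeated message fields (for example, whether `graphene.NonNull` is also applied) is not specified here, so the sketch uses a bare `graphene.List`.

```python
from dataclasses import dataclass


@dataclass
class FieldSketch:
    """Hypothetical stand-in for a protobuf field descriptor."""

    kind: str  # "enum", "message", or "scalar"
    type_name: str  # referenced class name, or a Graphene scalar such as "graphene.String"
    repeated: bool = False


def graphene_type_for_field_sketch(field, is_input):
    """Return a Graphene type expression (as source text) for the given field."""
    if field.kind == "enum":
        if field.repeated:
            return f"graphene.List(graphene.NonNull({field.type_name}))"
        return f"graphene.Field({field.type_name})"

    if field.kind == "message":
        # Input types carry the "Input" suffix used for InputObjectType classes.
        name = f"{field.type_name}Input" if is_input else field.type_name
        if field.repeated:
            return f"graphene.List({name})"  # assumption: no NonNull wrapper
        wrapper = "graphene.InputField" if is_input else "graphene.Field"
        return f"{wrapper}({name})"

    # Scalars: type_name is assumed to be already mapped to a Graphene type name.
    if field.repeated:
        return f"graphene.List({field.type_name})"
    return f"{field.type_name}()"
```

For instance, a singular string field yields `graphene.String()`, while a singular message field on an input type yields `graphene.InputField(<Name>Input)`.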
Resolver Generation (generate_resolver_function): Each query and mutation gets a resolver method that:
- Extracts input arguments from the GraphQL input parameter
- Creates a protobuf request message from the generated pb2 module
- Calls parse_dict() to populate the message from the input dictionary
- Delegates to the corresponding mlflow.server.handlers implementation function
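The resolver steps above lend themselves to simple string templating. The sketch below is an approximation, not the module's actual implementation: `render_resolver` and `camel_to_snake_sketch` are hypothetical names, and the operation name is passed in directly instead of being derived from a method descriptor.

```python
import re

INDENT = " " * 4
INDENT2 = INDENT * 2


def camel_to_snake_sketch(name):
    """Convert a PascalCase name such as 'GetExperiment' to 'get_experiment'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def render_resolver(method_name, operation_name):
    """Return source text for one resolver method, indented for a class body."""
    handler = f"{camel_to_snake_sketch(method_name)}_impl"
    return "\n".join(
        [
            f"{INDENT}def resolve_{operation_name}(self, info, input):",
            # Step 1: extract the input arguments as a dictionary.
            f"{INDENT2}input_dict = vars(input)",
            # Step 2: create the protobuf request message.
            f"{INDENT2}request_message = mlflow.protos.service_pb2.{method_name}()",
            # Step 3: populate the message from the dictionary.
            f"{INDENT2}parse_dict(input_dict, request_message)",
            # Step 4: delegate to the handler implementation.
            f"{INDENT2}return mlflow.server.handlers.{handler}(request_message)",
        ]
    )


print(render_resolver("GetExperiment", "mlflowGetExperiment"))
```

The emitted text refers to `mlflow` and `parse_dict` only by name; those names are resolved by the imports in the generated file's header, not by this generator.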
Usage
This module is invoked by dev/proto_to_graphql/code_generator.py as part of the build process. The generated output is written to mlflow/server/graphql/autogenerated_graphql_schema.py. To regenerate the schema after protobuf changes, run the command uv run ./dev/proto_to_graphql/code_generator.py.
Code Reference
Source Location
- Repository: Mlflow_Mlflow
- File: dev/proto_to_graphql/schema_autogeneration.py
- Lines: 1-225
Signature
PROTO_TO_GRAPHENE_TYPE: dict[int, str]
class ClassInheritanceVisitor(ast.NodeVisitor):
    def __init__(self): ...
    inheritance_map: dict[str, str]
    def visit_ClassDef(self, node): ...
def get_manual_extensions() -> dict[str, str]: ...
EXTENDED_TO_EXTENDING: dict[str, str]
def generate_schema(state) -> str: ...
def apply_schema_extension(referenced_class_name: str) -> str: ...
def get_graphene_type_for_field(field, is_input: bool) -> str: ...
def proto_method_to_graphql_operation(method) -> str: ...
def generate_resolver_function(method) -> str: ...
Import
# Used internally by the code generator pipeline
from schema_autogeneration import generate_schema
from schema_autogeneration import PROTO_TO_GRAPHENE_TYPE
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| state | GenerateSchemaState | Yes | Collected protobuf definitions containing types, enums, inputs, outputs, queries, and mutations |
| field | FieldDescriptor | Yes | Protobuf field descriptor (for get_graphene_type_for_field) |
| is_input | bool | Yes | Whether the field belongs to an input type (for get_graphene_type_for_field) |
| method | MethodDescriptor | Yes | Protobuf service method descriptor (for resolver/operation generation) |
| referenced_class_name | str | Yes | Class name to check for manual extensions (for apply_schema_extension) |
Outputs
| Name | Type | Description |
|---|---|---|
| schema source | str | Complete Python source code string defining the Graphene GraphQL schema |
| Graphene type string | str | A Graphene type expression string for a protobuf field (e.g., "graphene.String()", "graphene.List(graphene.NonNull(SomeType))") |
| Resolver function string | str | Python source code for a GraphQL resolver method |
Usage Examples
Generating the Schema
from schema_autogeneration import generate_schema
# state is a GenerateSchemaState populated by the code generator
schema_source = generate_schema(state)
# Write to the autogenerated schema file
with open("mlflow/server/graphql/autogenerated_graphql_schema.py", "w") as f:
    f.write(schema_source)
Generated Resolver Example
# Example of what generate_resolver_function produces:
def resolve_mlflowGetExperiment(self, info, input):
    input_dict = vars(input)
    request_message = mlflow.protos.service_pb2.GetExperiment()
    parse_dict(input_dict, request_message)
    return mlflow.server.handlers.get_experiment_impl(request_message)
Type Mapping Example
from schema_autogeneration import PROTO_TO_GRAPHENE_TYPE
from google.protobuf.descriptor import FieldDescriptor
# Lookup Graphene type for a protobuf field type
graphene_type = PROTO_TO_GRAPHENE_TYPE[FieldDescriptor.TYPE_STRING]
# Returns "graphene.String"
graphene_type = PROTO_TO_GRAPHENE_TYPE[FieldDescriptor.TYPE_INT64]
# Returns "LongString" (custom scalar for 64-bit integers)