Principle:Predibase Lorax Model Registry Selection

Knowledge Sources	HuggingFace AutoModel; Attention Is All You Need
Domains	Model_Architecture, Model_Serving
Last Updated	2026-02-08 02:00 GMT

Overview

A factory pattern that selects and instantiates the appropriate model architecture class based on configuration metadata, supporting 20+ transformer variants with quantization and sharding options.

Description

Model Registry Selection solves the problem of supporting diverse model architectures (Llama, Mistral, Gemma, Qwen, etc.) behind a single unified interface. Instead of requiring users to know which class to instantiate, a factory function reads the model_type field of the model's HuggingFace config and maps that value to the correct implementation class.

This decouples the model ID from the implementation, allowing new architectures to be added without changing the serving interface.
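The inspection step can be sketched as follows. This is a minimal illustration, assuming the model's config.json already sits in a local directory; the helper names read_model_type and demo are hypothetical and are not part of LoRAX.

```python
import json
import tempfile
from pathlib import Path


def read_model_type(model_dir: str) -> str:
    """Return the model_type declared in a HuggingFace-style config.json.

    Hypothetical helper for illustration only.
    """
    config_path = Path(model_dir) / "config.json"
    with config_path.open() as f:
        config = json.load(f)
    try:
        return config["model_type"]
    except KeyError:
        # Fail early with a readable message instead of a bare KeyError.
        raise ValueError(f"{config_path} has no 'model_type' field") from None


def demo() -> str:
    # Write a tiny stand-in config to a temporary directory and read it back.
    with tempfile.TemporaryDirectory() as tmp:
        config = {"model_type": "llama"}
        (Path(tmp) / "config.json").write_text(json.dumps(config))
        return read_model_type(tmp)
```

The returned string is the only thing the factory needs in order to pick an implementation, which is why the caller never has to name a model class.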

Usage

Use this principle when initializing a LoRAX inference server with a specific base model. The factory pattern is invoked once at server startup and determines which model class handles all subsequent inference requests.

Theoretical Basis

The factory pattern follows a type-dispatch approach:

Pseudo-code:

# Registry-based factory: dispatch on model_type
def get_model(model_id, quantize, user_dtype):
    config = load_config(model_id)             # Fetch HuggingFace config
    model_type = config["model_type"]          # e.g., "llama", "mistral"
    dtype = resolve_dtype(config, user_dtype)  # float16/bfloat16
    ModelClass = registry[model_type]          # Look up in architecture map
    model = ModelClass(model_id, quantize, dtype, ...)
    return model

The registry maps model_type strings to concrete Model subclasses. Each subclass knows how to load its specific architecture (attention patterns, MLP structure, embedding scheme).
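One way such a registry could be built is shown below. This is a sketch under stated assumptions, not the LoRAX implementation: the class names (Model, LlamaModel, MistralModel), the register decorator, and the dtype precedence (explicit user choice, then the config's torch_dtype, then float16) are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, Optional, Type


class Model:
    """Hypothetical base class; a real subclass would load weights here."""

    def __init__(self, model_id: str, quantize: Optional[str], dtype: str):
        self.model_id = model_id
        self.quantize = quantize
        self.dtype = dtype


# Maps a config model_type string to the class that implements it.
REGISTRY: Dict[str, Type[Model]] = {}


def register(model_type: str) -> Callable[[Type[Model]], Type[Model]]:
    """Class decorator that adds a Model subclass to the registry."""

    def decorator(cls: Type[Model]) -> Type[Model]:
        REGISTRY[model_type] = cls
        return cls

    return decorator


@register("llama")
class LlamaModel(Model):
    pass


@register("mistral")
class MistralModel(Model):
    pass


def resolve_dtype(config: dict, user_dtype: Optional[str]) -> str:
    # Assumed precedence: user override, then config value, then float16.
    return user_dtype or config.get("torch_dtype") or "float16"


def get_model(
    model_id: str,
    config: dict,
    quantize: Optional[str] = None,
    user_dtype: Optional[str] = None,
) -> Model:
    """Look up the class for config['model_type'] and instantiate it."""
    model_type = config["model_type"]
    try:
        model_cls = REGISTRY[model_type]
    except KeyError:
        supported = ", ".join(sorted(REGISTRY))
        raise ValueError(
            f"Unsupported model_type {model_type!r}; supported: {supported}"
        ) from None
    return model_cls(model_id, quantize, resolve_dtype(config, user_dtype))
```

With this layout, supporting a new architecture means adding one decorated subclass; get_model and its callers stay unchanged, and an unknown model_type fails with a message that lists the supported types.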

Related Pages

Implemented By

Implementation:Predibase_Lorax_Get_Model_Factory

Uses Heuristic

Heuristic:Predibase_Lorax_Quantization_Backend_Selection
