Principle:Predibase Lorax Model Registry Selection
| Knowledge Sources | |
|---|---|
| Domains | Model_Architecture, Model_Serving |
| Last Updated | 2026-02-08 02:00 GMT |
Overview
A factory pattern that selects and instantiates the appropriate model architecture class based on configuration metadata, supporting 20+ transformer variants with quantization and sharding options.
Description
Model Registry Selection solves the problem of supporting diverse model architectures (Llama, Mistral, Gemma, Qwen, etc.) behind a single unified interface. Instead of requiring users to know which class to instantiate, a factory function reads the model_type field of the model's HuggingFace config and maps it to the correct implementation class.
This decouples the model ID from the implementation, allowing new architectures to be added without changing the serving interface.
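A minimal sketch of this decoupling follows. The names `MODEL_REGISTRY`, `register`, `create_model`, and the model classes are hypothetical and chosen for illustration; they are not taken from the LoRAX codebase. The point is only that supporting a new architecture adds a registry entry while the entry point stays the same.

```python
from typing import Callable, Dict, Type


class Model:
    """Illustrative base class; stands in for a real serving model."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id


# Hypothetical registry: model_type string -> Model subclass.
MODEL_REGISTRY: Dict[str, Type[Model]] = {}


def register(model_type: str) -> Callable[[Type[Model]], Type[Model]]:
    """Class decorator that records a Model subclass under model_type."""

    def decorator(cls: Type[Model]) -> Type[Model]:
        MODEL_REGISTRY[model_type] = cls
        return cls

    return decorator


@register("llama")
class LlamaModel(Model):
    pass


def create_model(model_id: str, config: dict) -> Model:
    """Single entry point: dispatch on the config's model_type field."""
    model_type = config["model_type"]
    try:
        model_cls = MODEL_REGISTRY[model_type]
    except KeyError:
        raise ValueError(f"Unsupported model_type: {model_type!r}") from None
    return model_cls(model_id)


# Supporting a new architecture only adds a registry entry;
# create_model itself is untouched.
@register("mistral")
class MistralModel(Model):
    pass
```

Callers only ever pass a model ID and its config to `create_model`; which class comes back is decided entirely by the registry contents.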
Usage
Use this principle when initializing a LoRAX inference server with a specific base model. The factory pattern is invoked once at server startup and determines which model class handles all subsequent inference requests.
Theoretical Basis
The factory pattern follows a type-dispatch approach:
Pseudo-code:
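# Registry-based factory: dispatch on model_type (function name is illustrative)
def get_model(model_id, quantize=None, user_dtype=None):
    config = load_config(model_id)             # Fetch HuggingFace config
    model_type = config["model_type"]          # e.g., "llama", "mistral"
    dtype = resolve_dtype(config, user_dtype)  # float16/bfloat16
    if model_type not in registry:             # Fail early on unknown architectures
        raise ValueError(f"Unsupported model type: {model_type}")
    ModelClass = registry[model_type]          # Lookup in architecture map
    model = ModelClass(model_id, quantize, dtype, ...)
    return model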
The registry maps model_type strings to concrete Model subclasses. Each subclass knows how to load its specific architecture (attention patterns, MLP structure, embedding scheme).
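The dispatch described above can be sketched as runnable Python under a few stated assumptions: the config is passed in as a plain dict instead of being fetched, dtypes are represented as strings, and the dtype fallback order (user override, then the config's `torch_dtype`, then `float16`) is an illustrative policy, not a documented LoRAX behavior. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Type

SUPPORTED_DTYPES = {"float16", "bfloat16"}


@dataclass
class Model:
    """Illustrative base class holding the arguments the factory passes in."""

    model_id: str
    quantize: Optional[str]
    dtype: str

    def describe(self) -> str:
        raise NotImplementedError


class LlamaModel(Model):
    def describe(self) -> str:
        # Architecture-specific loading logic would live in the subclass.
        return f"llama implementation for {self.model_id}"


class MistralModel(Model):
    def describe(self) -> str:
        return f"mistral implementation for {self.model_id}"


# Registry: model_type string -> concrete Model subclass.
REGISTRY: Dict[str, Type[Model]] = {
    "llama": LlamaModel,
    "mistral": MistralModel,
}


def resolve_dtype(config: dict, user_dtype: Optional[str]) -> str:
    """Assumed policy: user override, then config value, then float16."""
    dtype = user_dtype or config.get("torch_dtype") or "float16"
    if dtype not in SUPPORTED_DTYPES:
        raise ValueError(f"Unsupported dtype: {dtype!r}")
    return dtype


def get_model(
    model_id: str,
    config: dict,
    quantize: Optional[str] = None,
    user_dtype: Optional[str] = None,
) -> Model:
    """Look up the class for config['model_type'] and instantiate it."""
    model_type = config["model_type"]
    dtype = resolve_dtype(config, user_dtype)
    if model_type not in REGISTRY:
        raise ValueError(f"Unsupported model type: {model_type!r}")
    return REGISTRY[model_type](model_id, quantize, dtype)


if __name__ == "__main__":
    cfg = {"model_type": "llama", "torch_dtype": "bfloat16"}
    model = get_model("example/llama-base", cfg)
    print(type(model).__name__, model.dtype, model.describe())
```

Keeping dtype resolution separate from the registry lookup means the same precision rules apply to every architecture, and an unknown `model_type` fails before any weights would be loaded.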