Implementation:Microsoft BIPIA AutoLLM
Overview
Concrete tool for loading LLM model classes provided by the BIPIA benchmark library.
Description
AutoLLM is a factory class that maps 21 model identifiers to their corresponding model wrapper classes. It supports three loading paths:
- Direct name lookup -- The provided string matches a key in the LLM_NAME_TO_CLASS dictionary (e.g., "gpt35"), and the corresponding class is returned immediately.
- YAML config file path -- The provided string ends with .yaml or .yml. The factory loads the YAML file, reads the model_name key, and resolves it through the same dictionary.
- ValueError -- If the name matches neither a known key nor a valid YAML path, a ValueError is raised with a message listing the supported model names.
The factory returns a class (not an instance). The caller must then construct the returned class with the appropriate arguments, which vary by backend type (see I/O Contract below).
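The three resolution paths can be sketched as follows. This is an illustrative reconstruction, not the actual BIPIA source: the mapping is a toy subset, the backend classes are empty stand-ins, and a trivial "key: value" parse stands in for a real YAML loader to keep the sketch dependency-free.

```python
from typing import Dict, Type


# Stand-in classes, not the real BIPIA wrappers
class GPTModel: ...
class LLMModel: ...
class vLLMModel: ...


# Toy subset of the 21-entry mapping
LLM_NAME_TO_CLASS: Dict[str, Type] = {
    "gpt35": GPTModel,
    "llama2_7b": LLMModel,
    "mistral": vLLMModel,
}


def from_name(name: str) -> Type:
    # Path 1: direct key lookup
    if name in LLM_NAME_TO_CLASS:
        return LLM_NAME_TO_CLASS[name]
    # Path 2: YAML config path -- read the model_name key, re-resolve.
    # (A minimal "key: value" parse substitutes for a YAML library here.)
    if name.endswith((".yaml", ".yml")):
        with open(name) as f:
            config = dict(line.split(":", 1) for line in f if ":" in line)
        return LLM_NAME_TO_CLASS[config["model_name"].strip()]
    # Path 3: unknown name
    raise ValueError(
        f"Unknown model {name!r}; supported: {sorted(LLM_NAME_TO_CLASS)}"
    )
```

Note that the function returns the class object itself, matching the factory's contract: instantiation is left to the caller.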
Usage
Import AutoLLM when you need to instantiate any supported LLM for benchmark inference within the BIPIA framework. The from_name classmethod accepts either a model name string or a YAML config file path and returns the appropriate model class for construction.
from bipia.model import AutoLLM
Code Reference
Source: BIPIA repo, File: bipia/model/__init__.py, Lines: L1-72
Signature:
@classmethod
def from_name(cls, name: str) -> Type[BaseModel]
The returned class constructors vary by backend:
- GPTModel(config=str|dict) -- For OpenAI API-based models.
- LLMModel(config=str|dict, accelerator=Accelerator) -- For HuggingFace Transformers models requiring a HuggingFace Accelerator instance.
- vLLMModel(config=str|dict, tensor_parallel_size=int) -- For vLLM-accelerated models with configurable tensor parallelism.
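Because the constructor arguments vary by backend, a caller that handles all three paths needs a small dispatch step. The sketch below is a hypothetical helper (not part of BIPIA); the class names mirror the documented wrappers, but the bodies are stand-ins that merely record their arguments.

```python
# Stand-in classes mirroring the documented constructor shapes
class GPTModel:
    def __init__(self, config):
        self.config = config


class LLMModel:
    def __init__(self, config, accelerator):
        self.config, self.accelerator = config, accelerator


class vLLMModel:
    def __init__(self, config, tensor_parallel_size):
        self.config, self.tensor_parallel_size = config, tensor_parallel_size


def build(model_cls, config, accelerator=None, tensor_parallel_size=1):
    """Construct a resolved class with backend-appropriate arguments."""
    if issubclass(model_cls, GPTModel):      # OpenAI API backend
        return model_cls(config=config)
    if issubclass(model_cls, LLMModel):      # HuggingFace Transformers backend
        return model_cls(config=config, accelerator=accelerator)
    return model_cls(config=config,          # vLLM backend
                     tensor_parallel_size=tensor_parallel_size)
```

This keeps the backend-specific keyword arguments in one place rather than scattered across call sites.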
Import:
from bipia.model import AutoLLM
LLM_NAME_TO_CLASS Mapping (21 entries):
LLM_NAME_TO_CLASS = {
"gpt35": GPTModel,
"gpt4": GPTModel,
"gpt35_0613": GPTModel,
"gpt4_0613": GPTModel,
"gpt4_1106": GPTModel,
"llama2_7b": LLMModel,
"llama2_13b": LLMModel,
"llama2_70b": LLMModel,
"vicuna_7b": LLMModel,
"vicuna_13b": LLMModel,
"vicuna_33b": LLMModel,
"falcon_7b": LLMModel,
"falcon_40b": LLMModel,
"mpt_7b": LLMModel,
"mpt_30b": LLMModel,
"mistral": vLLMModel,
"llama2_7b_vllm": vLLMModel,
"llama2_13b_vllm": vLLMModel,
"llama2_70b_vllm": vLLMModel,
"vicuna_7b_vllm": vLLMModel,
"vicuna_13b_vllm": vLLMModel,
}
I/O Contract
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | A model name key (e.g., "gpt35") or a path to a YAML config file (e.g., "config/vicuna_13b.yaml") |

| Return Type | Description |
|---|---|
| Type[BaseModel] | A model class (not an instance) that exposes process_fn() and generate() methods. Must be constructed by the caller with backend-specific arguments. |
Usage Examples
1. Basic usage with a direct model name:
llm_cls = AutoLLM.from_name("gpt35")
llm = llm_cls(config="config/gpt35.yaml")
output = llm.generate(llm.process_fn(prompt))
2. Loading from a YAML config file (HuggingFace backend):
llm_cls = AutoLLM.from_name("config/vicuna_13b.yaml")
llm = llm_cls(config="config/vicuna_13b.yaml", accelerator=accelerator)
output = llm.generate(llm.process_fn(prompt))
3. Using vLLM with tensor parallelism:
llm_cls = AutoLLM.from_name("mistral")
llm = llm_cls(config="config/mistral_7b.yaml", tensor_parallel_size=4)
output = llm.generate(llm.process_fn(prompt))
Related Pages
- Principle:Microsoft_BIPIA_Model_Loading
- Environment:Microsoft_BIPIA_Python_CUDA_GPU_Environment
- Heuristic:Microsoft_BIPIA_BF16_Compute_Capability_Check
- Heuristic:Microsoft_BIPIA_Torch_Compile_Platform_Guard
- Heuristic:Microsoft_BIPIA_LLAMA_Pad_Token_Workaround
- Heuristic:Microsoft_BIPIA_Delta_Weight_CPU_Loading