Implementation:FMInference FlexLLMGen Get Opt Config

From Leeroopedia


Field Value
Sources Repo: FlexLLMGen
Domains Model_Architecture, Configuration
Last Updated 2026-02-09 00:00 GMT

Overview

Concrete tool, provided by the FlexLLMGen library, for resolving OPT model names to architecture configurations.

Description

get_opt_config() takes a model name string, strips any organization prefix (e.g., "facebook/"), handles OPT-IML variants, and returns an OptConfig frozen dataclass holding the architectural parameters. It supports the OPT models from OPT-125M through OPT-175B, plus Galactica-30B.
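
The name handling described above can be sketched as a small standalone helper. This is an illustration, not the library's code: normalize_model_name is a hypothetical name, and the lowercasing step and the exact "-iml" / "-iml-max" suffixes are assumptions about how variants are mapped to a base architecture.

```python
from typing import Tuple


def normalize_model_name(name: str) -> Tuple[str, str]:
    """Return (normalized_name, arch_name) for a model identifier.

    Hypothetical helper: mirrors the behavior described in the text,
    not the actual FlexLLMGen implementation.
    """
    # Drop an organization prefix such as "facebook/".
    if "/" in name:
        name = name.split("/", 1)[1]
    name = name.lower()  # assumption: lookup is case-insensitive

    # Assumption: IML variants share the architecture of their base model.
    if "-iml-max" in name:
        arch_name = name.replace("-iml-max", "")
    elif "-iml" in name:
        arch_name = name.replace("-iml", "")
    else:
        arch_name = name
    return name, arch_name


print(normalize_model_name("facebook/opt-6.7b"))
print(normalize_model_name("facebook/opt-iml-max-30b"))
```

Returning both names keeps the reported model name separate from the architecture used for lookup, which is one plausible way to let a variant reuse its base model's dimensions.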

The OptConfig dataclass contains all parameters needed to define the model architecture:

  • Structural parameters -- num_hidden_layers, hidden_size, n_head, input_dim, ffn_embed_dim define the Transformer dimensions.
  • Sequence parameters -- max_seq_len defines the maximum sequence length (default 2048).
  • Vocabulary parameters -- vocab_size and pad_token_id define the tokenizer interface.
  • Numerical parameters -- dtype (default np.float16) sets the parameter data type, and layer_norm_eps is the epsilon added for numerical stability in layer normalization.
  • Utility methods -- model_bytes(), cache_bytes(), hidden_bytes() compute memory requirements.
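
As a rough illustration of what the memory utilities compute, the sketch below estimates KV cache and hidden-state sizes from the architectural parameters. It assumes 2-byte (float16) elements and the usual accounting of one key tensor and one value tensor per layer. The function names are hypothetical, and the formulas in opt_config.py may differ in detail.

```python
def kv_cache_bytes_estimate(batch_size, seq_len, num_layers, hidden_dim,
                            bytes_per_elem=2):
    """Estimate KV cache size (hypothetical helper, float16 by default)."""
    # One key and one value tensor per layer,
    # each of shape batch_size x seq_len x hidden_dim.
    return 2 * batch_size * seq_len * num_layers * hidden_dim * bytes_per_elem


def hidden_bytes_estimate(batch_size, seq_len, hidden_dim, bytes_per_elem=2):
    """Estimate the size of one layer's hidden-state activations."""
    return batch_size * seq_len * hidden_dim * bytes_per_elem


# OPT-6.7B-like dimensions: 32 layers, hidden size 4096.
kv = kv_cache_bytes_estimate(1, 2048, 32, 4096)
hid = hidden_bytes_estimate(1, 2048, 4096)
print(f"KV cache: {kv / 1024**3:.2f} GiB")      # 1.00 GiB
print(f"Hidden states: {hid / 1024**2:.2f} MiB")  # 16.00 MiB
```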

The function also accepts **kwargs that override any config field. The overrides are applied with dataclasses.replace on the resolved config, which enables custom configurations for testing or experimentation.
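
The override mechanism relies on standard dataclass behavior, which can be shown without FlexLLMGen installed. In the sketch below, TinyConfig is a hypothetical stand-in for OptConfig with only three fields; only dataclasses.replace and frozen=True come from the Python standard library.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class TinyConfig:  # hypothetical stand-in for OptConfig
    name: str = "tiny"
    max_seq_len: int = 2048
    hidden_size: int = 768


base = TinyConfig()
# replace() builds a new instance; the frozen original is left untouched.
custom = dataclasses.replace(base, max_seq_len=1024)

print(base.max_seq_len)    # 2048
print(custom.max_seq_len)  # 1024

# Field names that do not exist on the dataclass are rejected.
try:
    dataclasses.replace(base, not_a_field=1)
except TypeError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False
```

Because the dataclass is frozen, returning a modified copy is the only way to change a field, so a misspelled override fails early instead of being silently ignored.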

Usage

Call get_opt_config() before creating OptLM to get the model's architecture specification. The returned OptConfig is passed to OptLM's constructor along with the ExecutionEnv and Policy.

Code Reference

Field Value
Repository FlexLLMGen
File flexllmgen/opt_config.py
Lines 17-125

Signature:

import dataclasses

import numpy as np

@dataclasses.dataclass(frozen=True)
class OptConfig:
    name: str = "opt-125m"
    num_hidden_layers: int = 12
    max_seq_len: int = 2048
    hidden_size: int = 768
    n_head: int = 12
    input_dim: int = 768
    ffn_embed_dim: int = 3072
    pad: int = 1
    activation_fn: str = 'relu'
    vocab_size: int = 50272
    layer_norm_eps: float = 0.00001
    pad_token_id: int = 1
    dtype: type = np.float16

def get_opt_config(name, **kwargs):
    # Resolves name -> OptConfig
    ...
    return dataclasses.replace(config, **kwargs)

Import:

from flexllmgen.opt_config import OptConfig, get_opt_config

I/O Contract

Inputs

Parameter   Type   Required   Description
name        str    Yes        Model name (e.g., "facebook/opt-30b" or "opt-6.7b")
**kwargs    Any    No         Override config fields (e.g., max_seq_len=1024)

Outputs

Output               Type               Description
OptConfig            frozen dataclass   Complete architecture specification
.name                str                Normalized model name
.num_hidden_layers   int                Number of Transformer decoder layers
.max_seq_len         int                Maximum sequence length
.hidden_size         int                Hidden representation dimensionality
.n_head              int                Number of attention heads
.input_dim           int                Input embedding dimension
.ffn_embed_dim       int                Feed-forward network intermediate dimension
.pad                 int                Padding token index
.activation_fn       str                Activation function name
.vocab_size          int                Vocabulary size
.layer_norm_eps      float              Layer normalization epsilon
.pad_token_id        int                Padding token ID
.dtype               type               Data type for model parameters

Usage Examples

Example 1: Get configuration for OPT-6.7B

from flexllmgen.opt_config import get_opt_config

config = get_opt_config("facebook/opt-6.7b")

print(config.name)              # "opt-6.7b"
print(config.num_hidden_layers) # 32
print(config.hidden_size)       # 4096
print(config.n_head)            # 32
print(config.ffn_embed_dim)     # 16384

Example 2: Get configuration for OPT-175B

from flexllmgen.opt_config import get_opt_config

config_175b = get_opt_config("facebook/opt-175b")

print(config_175b.name)              # "opt-175b"
print(config_175b.num_hidden_layers) # 96
print(config_175b.hidden_size)       # 12288
print(config_175b.n_head)            # 96
print(config_175b.ffn_embed_dim)     # 49152
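
The values in Examples 1 and 2 follow two regularities that are easy to check by hand: the feed-forward dimension is four times the hidden size, and the per-head dimension is 128. The snippet below verifies this using the printed numbers as plain literals, so it runs without FlexLLMGen installed. The dictionary layout is only for illustration.

```python
# Values copied from Examples 1 and 2 (not read from the library).
configs = {
    "opt-6.7b": {"hidden_size": 4096, "n_head": 32, "ffn_embed_dim": 16384},
    "opt-175b": {"hidden_size": 12288, "n_head": 96, "ffn_embed_dim": 49152},
}

head_dims = {}
for name, cfg in configs.items():
    # Feed-forward width is 4x the hidden size in both configurations.
    assert cfg["ffn_embed_dim"] == 4 * cfg["hidden_size"]
    # The hidden size must split evenly across attention heads.
    assert cfg["hidden_size"] % cfg["n_head"] == 0
    head_dims[name] = cfg["hidden_size"] // cfg["n_head"]

print(head_dims)  # {'opt-6.7b': 128, 'opt-175b': 128}
```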

Example 3: Compute memory requirements with model_bytes()

from flexllmgen.opt_config import get_opt_config

config = get_opt_config("facebook/opt-30b")

# Compute total model weight size in bytes
total_weight_bytes = config.model_bytes()
print(f"Model weights: {total_weight_bytes / (1024**3):.1f} GiB")

# Compute KV cache size for a given batch and sequence length
batch_size = 8
seq_len = 2048
cache_bytes = config.cache_bytes(batch_size, seq_len)
print(f"KV cache: {cache_bytes / (1024**3):.1f} GiB")

# Compute hidden state activation size
hidden_bytes = config.hidden_bytes(batch_size, seq_len)
print(f"Hidden states: {hidden_bytes / (1024**3):.1f} GiB")
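
A back-of-the-envelope estimate can serve as a sanity check on the weight size reported for a model of this scale. The sketch below is an approximation under stated assumptions, not the formula used by model_bytes(): it counts roughly 12 * hidden_size^2 parameters per decoder layer (attention plus a 4x feed-forward block), adds the token embedding, ignores biases, layer norms, and positional embeddings, and assumes float16 weights. approx_weight_bytes is a hypothetical name, and the OPT-30B dimensions (48 layers, hidden size 7168) are taken from the published model architecture rather than from this page.

```python
def approx_weight_bytes(num_layers, hidden_size, vocab_size, bytes_per_param=2):
    """Rough weight-size estimate (hypothetical helper, float16 by default)."""
    # Attention: ~4 * h^2 (Q, K, V, output); feed-forward: ~8 * h^2 (two 4x projections).
    per_layer_params = 12 * hidden_size ** 2
    embedding_params = vocab_size * hidden_size
    total_params = num_layers * per_layer_params + embedding_params
    return total_params * bytes_per_param


# Assumed OPT-30B dimensions with the default vocabulary size of 50272.
est_bytes = approx_weight_bytes(48, 7168, 50272)
est_params = est_bytes // 2
print(f"Approximate parameters: {est_params / 1e9:.1f} B")
print(f"Approximate weights: {est_bytes / 1024**3:.1f} GiB")
```

If the value returned by model_bytes() differs greatly from such an estimate, that usually points to an overridden field or a different dtype rather than to an error in the library.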

Example 4: Override config fields

from flexllmgen.opt_config import get_opt_config

# Get OPT-6.7B config but with shorter max sequence length
config = get_opt_config("facebook/opt-6.7b", max_seq_len=1024)
print(config.max_seq_len)  # 1024
