Implementation:OpenGVLab InternVL Phi3Config
| Knowledge Sources | |
|---|---|
| Domains | Model Configuration, Language Model, Phi3 |
| Last Updated | 2026-02-07 14:00 GMT |
Overview
Defines the Phi3Config configuration class for the Phi-3 language model, storing all architectural hyperparameters needed to instantiate a Phi-3 model within InternVL.
Description
Phi3Config extends HuggingFace's PretrainedConfig with Phi-3-specific parameters:
- Core architecture -- vocab_size (32064), hidden_size (3072), intermediate_size (8192), num_hidden_layers (32), num_attention_heads (32).
- Grouped Query Attention -- num_key_value_heads defaults to num_attention_heads if not specified.
- Dropout rates -- resid_pdrop (0.0), embd_pdrop (0.0), attention_dropout (0.0).
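- Position embeddings -- max_position_embeddings (4096) and original_max_position_embeddings (4096); the latter records the context length the model was originally trained with and serves as the reference point when long-context RoPE scaling is applied.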
- RoPE configuration -- rope_theta (10000.0) and rope_scaling supporting su and yarn scaling types with short_factor and long_factor lists.
- Sliding window -- Optional sliding_window attention support.
- Normalization -- rms_norm_eps (1e-5).
- Activation -- hidden_act ("silu").
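The defaults listed above imply a per-head dimension and a key/value head count that the configuration itself does not spell out. The following is a minimal sketch of that arithmetic, assuming the fallback rule described in the Grouped Query Attention bullet; `resolve_attention_layout` is a hypothetical helper written for illustration and is not part of InternVL.

```python
def resolve_attention_layout(hidden_size=3072, num_attention_heads=32,
                             num_key_value_heads=None):
    """Hypothetical helper: derive head sizes from Phi3Config-style arguments."""
    # Fallback described above: with no explicit value, every query head
    # gets its own key/value head (plain multi-head attention).
    if num_key_value_heads is None:
        num_key_value_heads = num_attention_heads

    head_dim = hidden_size // num_attention_heads
    # Number of query heads that share one key/value head.
    queries_per_kv = num_attention_heads // num_key_value_heads

    return {
        "num_key_value_heads": num_key_value_heads,
        "head_dim": head_dim,
        "queries_per_kv": queries_per_kv,
    }


default_layout = resolve_attention_layout()
grouped_layout = resolve_attention_layout(num_key_value_heads=8)
```

With the default arguments the head dimension works out to 3072 // 32 = 96 and each query head has its own key/value head; passing `num_key_value_heads=8` would have four query heads share each key/value head.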
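The _rope_scaling_validation method runs only when rope_scaling is set. It enforces that the value is a dictionary with exactly three fields (type, short_factor, long_factor), that type is either su or yarn, and that short_factor and long_factor are lists of numbers whose length equals hidden_size // num_attention_heads // 2 (48 with the default sizes).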
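The checks described above can be sketched as a standalone function. This is an illustrative approximation that mirrors the description, not the method's actual source; the name `validate_rope_scaling` and the error messages are assumptions.

```python
def validate_rope_scaling(rope_scaling, hidden_size=3072, num_attention_heads=32):
    """Hypothetical standalone mirror of the rope_scaling checks described above."""
    if rope_scaling is None:
        return  # no scaling configured, nothing to validate

    # Exactly three fields are expected: type, short_factor, long_factor.
    if not isinstance(rope_scaling, dict) or len(rope_scaling) != 3:
        raise ValueError("rope_scaling must be a dict with exactly three fields")

    if rope_scaling.get("type") not in ("su", "yarn"):
        raise ValueError("rope_scaling type must be 'su' or 'yarn'")

    # One factor per rotary frequency, i.e. half of the per-head dimension.
    expected_len = hidden_size // num_attention_heads // 2

    for key in ("short_factor", "long_factor"):
        factor = rope_scaling.get(key)
        if not (isinstance(factor, list)
                and all(isinstance(x, (int, float)) for x in factor)):
            raise ValueError(f"rope_scaling {key} must be a list of numbers")
        if len(factor) != expected_len:
            raise ValueError(
                f"rope_scaling {key} must have length {expected_len}, got {len(factor)}"
            )


# A dictionary that passes for the default sizes (3072 // 32 // 2 == 48).
# The factor values are placeholders, not tuned scaling factors.
valid_scaling = {
    "type": "su",
    "short_factor": [1.0] * 48,
    "long_factor": [1.0] * 48,
}
validate_rope_scaling(valid_scaling)
```

Failing fast at construction time keeps a mismatched factor list from surfacing later as a shape error inside the rotary embedding code.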
A pretrained config archive map points to Phi-3-mini-4k-instruct and Phi-3-mini-128k-instruct on HuggingFace.
Usage
Use this configuration when InternVL uses Microsoft Phi-3 as its language model backbone. The config is loaded automatically via AutoConfig or directly instantiated for custom configurations.
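For the direct-instantiation path, a custom configuration might be assembled as in the sketch below. The specific values (a 131072-token context and factor lists filled with 1.0) are illustrative assumptions rather than settings taken from a released checkpoint, and the import is guarded so the snippet still runs where InternVL is not installed.

```python
hidden_size = 3072
num_attention_heads = 32

# Each factor list needs one entry per rotary frequency.
factor_len = hidden_size // num_attention_heads // 2

custom_kwargs = {
    "hidden_size": hidden_size,
    "num_attention_heads": num_attention_heads,
    "max_position_embeddings": 131072,          # illustrative long-context target
    "original_max_position_embeddings": 4096,   # documented default
    "rope_scaling": {
        "type": "su",
        "short_factor": [1.0] * factor_len,     # placeholder values, not tuned
        "long_factor": [1.0] * factor_len,      # placeholder values, not tuned
    },
}

try:
    from internvl.model.phi3.configuration_phi3 import Phi3Config
    config = Phi3Config(**custom_kwargs)
except ImportError:
    # InternVL (or its dependencies) is not available in this environment.
    config = None
```

Deriving `factor_len` from the same `hidden_size` and `num_attention_heads` that are passed to the constructor keeps the factor lists consistent with the length check in the validation step.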
Code Reference
Source Location
- Repository: OpenGVLab_InternVL
- File: internvl_chat/internvl/model/phi3/configuration_phi3.py
- Lines: 1-211
Signature
class Phi3Config(PretrainedConfig):
    model_type = 'phi3'
    keys_to_ignore_at_inference = ['past_key_values']

    def __init__(self, vocab_size=32064, hidden_size=3072,
                 intermediate_size=8192, num_hidden_layers=32,
                 num_attention_heads=32, num_key_value_heads=None,
                 resid_pdrop=0.0, embd_pdrop=0.0,
                 attention_dropout=0.0, hidden_act='silu',
                 max_position_embeddings=4096,
                 original_max_position_embeddings=4096,
                 rms_norm_eps=1e-5, rope_theta=10000.0,
                 rope_scaling=None, sliding_window=None,
                 **kwargs): ...
Import
from internvl.model.phi3.configuration_phi3 import Phi3Config
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| vocab_size | int | No | Vocabulary size (default: 32064) |
| hidden_size | int | No | Hidden dimension (default: 3072) |
| num_hidden_layers | int | No | Number of transformer layers (default: 32) |
| num_attention_heads | int | No | Number of attention heads (default: 32) |
| rope_scaling | dict | No | RoPE scaling config with type, short_factor, and long_factor fields |
| sliding_window | int | No | Sliding window attention size (default: None) |
Outputs
| Name | Type | Description |
|---|---|---|
| config | Phi3Config | Configuration object for Phi-3 model instantiation |
Usage Examples
Basic Usage
from internvl.model.phi3.configuration_phi3 import Phi3Config
# Create a default Phi-3 config
config = Phi3Config()
# Load from pretrained
config = Phi3Config.from_pretrained("microsoft/Phi-3-mini-4k-instruct")