Implementation:OpenGVLab InternVL InternLM2Config


Knowledge Sources
Domains Model Configuration, Language Model, InternLM2
Last Updated 2026-02-07 14:00 GMT

Overview

Defines the InternLM2Config configuration class that stores all architectural hyperparameters for the InternLM2 language model used as a backbone in InternVL.

Description

InternLM2Config extends HuggingFace's PretrainedConfig with InternLM2-specific parameters:

  • vocab_size (default 103168) -- Vocabulary size for the InternLM2 tokenizer.
  • hidden_size (default 4096) -- Dimension of the hidden representations.
  • intermediate_size (default 11008) -- Dimension of the MLP representations.
  • num_hidden_layers (default 32) -- Number of transformer decoder layers.
  • num_attention_heads (default 32) -- Number of attention heads, with support for Grouped Query Attention (GQA) via num_key_value_heads.
  • hidden_act (default "silu") -- SiLU activation function.
  • rope_theta (default 10000) -- Base period for Rotary Position Embeddings.
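  • rope_scaling (default None) -- Optional dictionary with a type ("linear" or "dynamic") and a factor that selects the RoPE scaling strategy: "linear" rescales position indices, while "dynamic" applies NTK-aware scaling. Validated by _rope_scaling_validation.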
  • attn_implementation (default "eager") -- Attention backend selection (eager vs flash_attention_2).
  • bias (default True) -- Whether to use bias in linear layers.

The class sets model_type to "internlm2" and _auto_class to "AutoConfig" for HuggingFace auto-class integration. If num_key_value_heads is not specified, it defaults to num_attention_heads (standard MHA).
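The fallback from num_key_value_heads to num_attention_heads can be illustrated with a small stand-in. This is a minimal sketch, not the InternVL class: ConfigSketch and its heads_per_kv_group property are hypothetical names, only three defaults from the list above are reproduced, and the per-head dimension assumes the usual convention of splitting hidden_size evenly across the query heads.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigSketch:
    """Hypothetical stand-in mirroring a few InternLM2Config defaults."""
    hidden_size: int = 4096
    num_attention_heads: int = 32
    num_key_value_heads: Optional[int] = None

    def __post_init__(self):
        # As described above: an unset KV-head count falls back to standard MHA.
        if self.num_key_value_heads is None:
            self.num_key_value_heads = self.num_attention_heads

    @property
    def head_dim(self) -> int:
        # Assumed convention: hidden size is split evenly across query heads.
        return self.hidden_size // self.num_attention_heads

    @property
    def heads_per_kv_group(self) -> int:
        # Number of query heads that share one key/value head under GQA.
        return self.num_attention_heads // self.num_key_value_heads


mha = ConfigSketch()                       # no KV-head count given
gqa = ConfigSketch(num_key_value_heads=8)  # 8 KV heads for 32 query heads

print(mha.num_key_value_heads, mha.heads_per_kv_group)  # 32 1
print(gqa.head_dim, gqa.heads_per_kv_group)             # 128 4
```

With the defaults, every query head has its own key/value head; with 8 KV heads, each key/value head serves a group of 4 query heads.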

Usage

Use this configuration class when instantiating InternLM2 models within InternVL. It is loaded automatically via AutoConfig.from_pretrained() for pretrained InternLM2-based InternVL models.

Code Reference

Source Location

Signature

class InternLM2Config(PretrainedConfig):
    model_type = 'internlm2'
    _auto_class = 'AutoConfig'

    def __init__(self, vocab_size=103168, hidden_size=4096,
                 intermediate_size=11008, num_hidden_layers=32,
                 num_attention_heads=32, num_key_value_heads=None,
                 hidden_act='silu', max_position_embeddings=2048,
                 initializer_range=0.02, rms_norm_eps=1e-6,
                 use_cache=True, bias=True, rope_theta=10000,
                 rope_scaling=None, attn_implementation='eager',
                 **kwargs): ...

Import

from internvl.model.internlm2.configuration_internlm2 import InternLM2Config

I/O Contract

Inputs

Name                 Type  Required  Description
vocab_size           int   No        Vocabulary size (default: 103168)
hidden_size          int   No        Hidden dimension (default: 4096)
num_hidden_layers    int   No        Number of transformer layers (default: 32)
num_attention_heads  int   No        Number of attention heads (default: 32)
num_key_value_heads  int   No        Number of KV heads for GQA (default: same as num_attention_heads)
rope_scaling         dict  No        RoPE scaling config with type and factor fields
attn_implementation  str   No        Attention implementation: "eager" or "flash_attention_2"

Outputs

Name    Type             Description
config  InternLM2Config  Configuration object for InternLM2 model instantiation
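The shape expected for the rope_scaling input can be sketched as a standalone check. This is an illustrative re-creation under stated assumptions, not the body of _rope_scaling_validation: the function name validate_rope_scaling is hypothetical, and both the "exactly two keys" rule and the "factor must be at least 1" bound are assumptions. Only the two field names and the two allowed type values come from the description above.

```python
def validate_rope_scaling(rope_scaling):
    """Hypothetical check for the rope_scaling dictionary described above."""
    if rope_scaling is None:
        return  # scaling is optional

    # Assumption: the dictionary carries exactly the two documented fields.
    if not isinstance(rope_scaling, dict) or set(rope_scaling) != {"type", "factor"}:
        raise ValueError("rope_scaling must be a dict with 'type' and 'factor'")

    # Documented values for the scaling type.
    if rope_scaling["type"] not in ("linear", "dynamic"):
        raise ValueError("rope_scaling['type'] must be 'linear' or 'dynamic'")

    # Assumption: the factor is a real number no smaller than 1.
    factor = rope_scaling["factor"]
    if isinstance(factor, bool) or not isinstance(factor, (int, float)) or factor < 1:
        raise ValueError("rope_scaling['factor'] must be a number >= 1")


candidates = [
    None,
    {"type": "dynamic", "factor": 2.0},
    {"type": "yarn", "factor": 2.0},   # unsupported type
    {"type": "linear"},                # missing factor
]
results = []
for candidate in candidates:
    try:
        validate_rope_scaling(candidate)
        results.append("ok")
    except ValueError as err:
        results.append("rejected")
        print(candidate, "->", err)

print(results)  # ['ok', 'ok', 'rejected', 'rejected']
```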

Usage Examples

Basic Usage

from internvl.model.internlm2.configuration_internlm2 import InternLM2Config

# Create a default config
config = InternLM2Config()

# Create with GQA (8 KV heads for 32 attention heads)
config = InternLM2Config(num_key_value_heads=8)

# Create with dynamic RoPE scaling
config = InternLM2Config(
    rope_scaling={"type": "dynamic", "factor": 2.0}
)
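To make the roles of rope_theta and the scaling factor more concrete, the sketch below computes rotary angles in plain Python. It uses the standard RoPE frequency formula and assumes that the "linear" type follows the common position-interpolation scheme (positions divided by the factor). The "dynamic" variant is not shown. The helper names rope_inv_freq and rotary_angles are hypothetical and do not come from the InternVL code.

```python
import math


def rope_inv_freq(head_dim, rope_theta=10000.0):
    # Standard RoPE: one frequency per channel pair, rope_theta^(-2i / head_dim).
    return [rope_theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def rotary_angles(position, head_dim, rope_theta=10000.0, linear_factor=1.0):
    # Assumed linear scaling: divide the position index by the factor
    # before multiplying by each frequency.
    scaled_position = position / linear_factor
    return [scaled_position * freq for freq in rope_inv_freq(head_dim, rope_theta)]


head_dim = 4096 // 32  # default hidden_size / default num_attention_heads

plain = rotary_angles(4096, head_dim)
scaled = rotary_angles(4096, head_dim, linear_factor=2.0)
reference = rotary_angles(2048, head_dim)

print(plain[0], scaled[0])  # 4096.0 2048.0
# With factor 2.0, position 4096 is rotated like position 2048 without scaling.
print(all(math.isclose(a, b) for a, b in zip(scaled, reference)))  # True
```

Under this assumption, a factor of 2.0 maps twice as many positions into the angle range the model saw with the default max_position_embeddings of 2048.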
