Implementation:OpenGVLab InternVL FCNHead Custom
| Knowledge Sources | |
|---|---|
| Domains | Segmentation, Decode Head, Model Architecture |
| Last Updated | 2026-02-07 14:00 GMT |
Overview
Custom FCN (Fully Convolutional Network) decode head for semantic segmentation that supports zero-convolution mode for linear probing and explicit FP32 casting for numerical stability with BFloat16 backbone features.
Description
FCNHead extends MMSeg's BaseDecodeHead and is registered in the HEADS registry with force=True, so it replaces any FCNHead already registered under that name. It follows the standard FCN (FCNNet) architecture, with these customizations:
- num_convs=0 mode: When zero convolutions are specified, the conv layers become an nn.Identity, making the head a simple linear classifier. This enables linear probing experiments where the backbone features are directly classified without additional convolutional processing. An assertion ensures in_channels equals channels in this mode.
- Optional SyncBatchNorm: The with_norm parameter adds an nn.SyncBatchNorm layer after feature extraction (or nn.Identity when disabled). SyncBatchNorm synchronizes batch statistics across GPUs, which keeps normalization consistent in multi-GPU training.
- FP32 forward pass: The forward() method explicitly calls self.to(torch.float32) before processing, which casts the head's parameters and buffers to FP32. This is intended to keep the classification head numerically stable when it receives BFloat16 features from the InternViT backbone.
- Concat input: When concat_input=True (default), the original input features and convolution outputs are concatenated before the final classification convolution (conv_cat).
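The layer-selection rules above can be summarized in a small, framework-free sketch. `plan_fcn_head` is a hypothetical helper, not part of the repository; it only reports which modules the head would hold for a given configuration. The channel count fed to `conv_cat` (`in_channels + channels`) is an assumption based on upstream MMSegmentation's FCNHead, and the sketch raises `ValueError` where the source uses an assertion.

```python
def plan_fcn_head(in_channels, channels, num_convs=2,
                  concat_input=True, with_norm=False):
    """Describe the modules an FCNHead-style decode head would build.

    Hypothetical illustration only: returns a dict mapping a module
    name to a short description of what it would contain.
    """
    if num_convs < 0:
        raise ValueError("num_convs must be non-negative")

    plan = {}
    if num_convs == 0:
        # No conv stack: features pass through unchanged, so the input
        # width must already equal the head width.
        if in_channels != channels:
            raise ValueError("num_convs=0 requires in_channels == channels")
        plan["convs"] = "Identity"
    else:
        # First conv maps in_channels -> channels, the rest keep channels.
        widths = [in_channels] + [channels] * num_convs
        plan["convs"] = " -> ".join(str(w) for w in widths)

    if concat_input:
        # Assumed from upstream MMSeg: conv_cat fuses [input, conv output].
        plan["conv_cat"] = f"{in_channels + channels} -> {channels}"

    plan["norm"] = "SyncBatchNorm" if with_norm else "Identity"
    return plan


if __name__ == "__main__":
    print(plan_fcn_head(3200, 3200, num_convs=0, with_norm=True))
    print(plan_fcn_head(256, 128))
```

Under these assumptions, leaving `concat_input` at its default of True with `num_convs=0` still yields a `conv_cat` block in the plan, so a strictly linear head would presumably also need `concat_input=False`.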
Usage
Use this decode head in segmentation configs for InternVL backbone models, especially for linear probing (num_convs=0) and few-shot experiments.
Code Reference
Source Location
- Repository: OpenGVLab_InternVL
- File: segmentation/mmseg_custom/models/decode_heads/fcn_head.py
- Lines: 1-102
Signature
@HEADS.register_module(force=True)
class FCNHead(BaseDecodeHead):
    def __init__(self, num_convs=2, kernel_size=3, concat_input=True,
                 dilation=1, with_norm=False, **kwargs): ...
    def _forward_feature(self, inputs): ...
    def forward(self, inputs): ...
Import
from mmseg_custom.models.decode_heads.fcn_head import FCNHead
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| inputs | list[Tensor] | Yes | Multi-level feature maps from the backbone |
| num_convs | int | No | Number of convolution layers (0 for linear probing, default: 2) |
| kernel_size | int | No | Convolution kernel size (default: 3) |
| concat_input | bool | No | Whether to concatenate input with conv output (default: True) |
| dilation | int | No | Dilation rate of the convolution layers (default: 1) |
| with_norm | bool | No | Whether to use SyncBatchNorm (default: False) |
Outputs
| Name | Type | Description |
|---|---|---|
| output | Tensor | Segmentation logits of shape (B, num_classes, H, W) |
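If H and W are read as the spatial size of the incoming feature map, the output keeps that size only when every convolution in the head is size-preserving. As a worked check, the sketch below applies the standard convolution output-size formula with stride 1. The padding rule `(kernel_size // 2) * dilation` is an assumption borrowed from upstream MMSegmentation's FCNHead, not something stated on this page, and `conv_out_size` is an illustrative helper name.

```python
def conv_out_size(size, kernel_size=3, dilation=1, stride=1, padding=None):
    """Output length along one spatial axis of a 2D convolution."""
    if padding is None:
        # Assumed padding rule (upstream MMSeg FCNHead style).
        padding = (kernel_size // 2) * dilation
    # Standard formula: floor((n + 2p - d*(k - 1) - 1) / s) + 1
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


if __name__ == "__main__":
    for k, d in [(1, 1), (3, 1), (3, 2), (5, 1)]:
        print(f"kernel_size={k}, dilation={d}: 32 -> {conv_out_size(32, k, d)}")
```

With this padding rule, odd kernel sizes leave the spatial size unchanged for any dilation, while an even kernel size would grow each axis by one.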
Usage Examples
Basic Usage
# In MMSegmentation config for linear probing:
decode_head=dict(
    type='FCNHead',
    in_channels=3200,
    channels=3200,
    num_convs=0,  # Linear probing mode
    num_classes=150,
    with_norm=True,
)