Implementation:OpenGVLab InternVL FCNHead Custom

Knowledge Sources	OpenGVLab_InternVL
Domains	Segmentation, Decode Head, Model Architecture
Last Updated	2026-02-07 14:00 GMT

Overview

Custom FCN (Fully Convolutional Network) decode head for semantic segmentation. It adds a zero-convolution mode for linear probing and casts itself to FP32 for numerical stability when the backbone produces BFloat16 features.

Description

FCNHead extends MMSeg's BaseDecodeHead and is registered in the HEADS registry with force=True, so it replaces any head already registered under the same name. It implements the standard FCN (FCNNet) head with the following customizations:

num_convs=0 mode: When zero convolutions are specified, the conv layers are replaced by nn.Identity, so backbone features reach the classifier without additional convolutional processing. This enables linear probing experiments. Because an identity cannot change the channel count, an assertion ensures in_channels equals channels in this mode. Note that with concat_input=True the conv_cat layer is still applied, so the head only reduces to a plain linear classifier when concat_input=False.

Optional SyncBatchNorm: with_norm=True adds an nn.SyncBatchNorm layer after feature extraction; otherwise nn.Identity is used. SyncBatchNorm synchronizes batch statistics across GPUs, which keeps normalization consistent in multi-GPU training.

FP32 forward pass: The forward() method calls self.to(torch.float32) before processing, which casts the head's parameters and buffers to FP32. This keeps the classification head numerically stable when the InternViT backbone runs in BFloat16 and avoids precision loss in the final logits.

Concat input: When concat_input=True (default), the original input features and convolution outputs are concatenated before the final classification convolution (conv_cat).
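
The four customizations above can be illustrated with a minimal plain-PyTorch sketch. This is not the repository code: TinyFCNHead is a hypothetical name, nn.Conv2d plus ReLU stands in for MMCV's ConvModule, nn.BatchNorm2d stands in for nn.SyncBatchNorm so the example runs on a single CPU, and casting the input tensor to FP32 is an assumption made here so the dtypes match.

```python
import torch
import torch.nn as nn


class TinyFCNHead(nn.Module):
    """Hypothetical stand-in for the customized FCNHead (illustration only)."""

    def __init__(self, in_channels, channels, num_classes, num_convs=2,
                 kernel_size=3, concat_input=True, with_norm=False):
        super().__init__()
        if num_convs == 0:
            # An identity cannot change the channel count.
            assert in_channels == channels
        padding = kernel_size // 2
        convs = []
        for i in range(num_convs):
            convs += [
                nn.Conv2d(in_channels if i == 0 else channels, channels,
                          kernel_size, padding=padding),
                nn.ReLU(inplace=True),
            ]
        # num_convs=0 leaves the list empty, so the conv stage is an identity.
        self.convs = nn.Sequential(*convs) if convs else nn.Identity()
        self.concat_input = concat_input
        if concat_input:
            self.conv_cat = nn.Conv2d(in_channels + channels, channels,
                                      kernel_size, padding=padding)
        # Assumption: BatchNorm2d replaces SyncBatchNorm for a CPU-only demo.
        self.norm = nn.BatchNorm2d(channels) if with_norm else nn.Identity()
        self.cls_seg = nn.Conv2d(channels, num_classes, kernel_size=1)

    def forward(self, x):
        self.to(torch.float32)   # cast the head's parameters to FP32
        x = x.to(torch.float32)  # assumption: inputs are cast as well
        feats = self.convs(x)
        if self.concat_input:
            feats = self.conv_cat(torch.cat([x, feats], dim=1))
        return self.cls_seg(self.norm(feats))


# Linear probing setup: no convs, no concatenation, BFloat16 features in.
head = TinyFCNHead(in_channels=8, channels=8, num_classes=3,
                   num_convs=0, concat_input=False).eval()
features = torch.randn(2, 8, 4, 4).to(torch.bfloat16)
logits = head(features)
```

In this configuration the output keeps the spatial size of the input feature map and is returned in FP32, even though the features arrived in BFloat16.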

Usage

Use this decode head in segmentation configs for InternVL backbone models, especially for linear probing (num_convs=0) and few-shot experiments.
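
To see why the num_convs=0 setup counts as linear probing, recall that the final classifier in MMSeg decode heads is a 1x1 convolution, which is the same operation as a linear layer applied independently at every pixel. The sketch below checks that equivalence on toy tensors; the channel and class counts are arbitrary illustration values, not settings from the repository.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A 1x1 convolution classifier and a linear layer sharing the same weights.
conv = nn.Conv2d(8, 3, kernel_size=1)
linear = nn.Linear(8, 3)
with torch.no_grad():
    linear.weight.copy_(conv.weight.view(3, 8))
    linear.bias.copy_(conv.bias)

x = torch.randn(2, 8, 4, 4)  # (B, C, H, W) toy feature map
conv_out = conv(x)
# Move channels last, classify each pixel, then restore (B, K, H, W).
linear_out = linear(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

same = torch.allclose(conv_out, linear_out, atol=1e-5)
```

Because no nonlinearity or spatial mixing is involved, the quality of the resulting segmentation depends only on the frozen backbone features, which is what a linear probe is meant to measure.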

Code Reference

Source Location

Repository: OpenGVLab_InternVL
File: segmentation/mmseg_custom/models/decode_heads/fcn_head.py
Lines: 1-102

Signature

@HEADS.register_module(force=True)
class FCNHead(BaseDecodeHead):
    def __init__(self, num_convs=2, kernel_size=3, concat_input=True,
                 dilation=1, with_norm=False, **kwargs): ...
    def _forward_feature(self, inputs): ...
    def forward(self, inputs): ...

Import

from mmseg_custom.models.decode_heads.fcn_head import FCNHead

I/O Contract

Inputs

Name	Type	Required	Description
inputs	list[Tensor]	Yes	Multi-level feature maps from the backbone, passed to forward()
num_convs	int	No	Number of convolution layers; 0 selects linear probing mode (default: 2)
kernel_size	int	No	Convolution kernel size (default: 3)
concat_input	bool	No	Whether to concatenate the input with the conv output before conv_cat (default: True)
dilation	int	No	Dilation rate of the convolution layers (default: 1)
with_norm	bool	No	Whether to apply SyncBatchNorm after feature extraction (default: False)

Outputs

Name	Type	Description
output	Tensor	Segmentation logits of shape (B, num_classes, H, W)

Usage Examples

Basic Usage

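# In an MMSegmentation config for linear probing:
decode_head=dict(
    type='FCNHead',
    in_channels=3200,
    channels=3200,       # must equal in_channels when num_convs=0
    num_convs=0,         # Linear probing mode: conv layers become nn.Identity
    concat_input=False,  # skip conv_cat so the head stays a plain linear classifier
    num_classes=150,
    with_norm=True,      # SyncBatchNorm before the classifier
)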

Related Pages

Principle:OpenGVLab_InternVL_Segmentation_Decode_Head
