
Implementation:OpenGVLab InternVL FCNHead Custom

From Leeroopedia


Knowledge Sources
Domains: Segmentation, Decode Head, Model Architecture
Last Updated: 2026-02-07 14:00 GMT

Overview

Custom FCN (Fully Convolutional Network) decode head for semantic segmentation that supports zero-convolution mode for linear probing and explicit FP32 casting for numerical stability with BFloat16 backbone features.

Description

FCNHead extends MMSeg's BaseDecodeHead and is registered with the HEADS registry using force=True, so it replaces any FCNHead already registered under that name. It follows the standard FCN architecture from FCNNet, with these customizations:

  • num_convs=0 mode: When zero convolutions are specified, the conv layers are replaced by an nn.Identity, so backbone features reach the classifier without additional convolutional processing. This enables linear probing experiments. An assertion ensures in_channels equals channels in this mode, because the features pass through unchanged. Note that conv_cat is still applied while concat_input is left at its default of True.
  • Optional SyncBatchNorm: The with_norm parameter adds an nn.SyncBatchNorm layer after feature extraction (nn.Identity when disabled). SyncBatchNorm computes batch statistics across all participating processes, which keeps normalization consistent in multi-GPU training.
  • FP32 forward pass: The forward() method explicitly calls self.to(torch.float32) before processing. This casts the head's own parameters and buffers to FP32 (nn.Module.to does not convert input tensors), so the classification head runs in full precision even when the InternViT backbone works in BFloat16.
  • Concat input: When concat_input=True (default), the original input features and the convolution outputs are concatenated along the channel dimension and passed through conv_cat before classification.
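The four behaviors above can be illustrated with a minimal plain-PyTorch sketch. This is not the InternVL source: the class name TinyFCNHead is hypothetical, plain nn.Conv2d plus ReLU stands in for MMCV's conv blocks, and nn.BatchNorm2d replaces nn.SyncBatchNorm so the example runs on a single CPU process.

```python
import torch
import torch.nn as nn


class TinyFCNHead(nn.Module):
    """Hypothetical stand-in for the head described above (not the InternVL code)."""

    def __init__(self, in_channels, channels, num_classes,
                 num_convs=2, kernel_size=3, concat_input=True, with_norm=False):
        super().__init__()
        if num_convs == 0:
            # No convs: features pass through unchanged, so the widths must match.
            assert in_channels == channels
        layers = []
        for i in range(num_convs):
            layers += [
                nn.Conv2d(in_channels if i == 0 else channels, channels,
                          kernel_size, padding=kernel_size // 2),
                nn.ReLU(inplace=True),
            ]
        self.convs = nn.Sequential(*layers) if layers else nn.Identity()
        self.concat_input = concat_input
        if concat_input:
            # Fuses the concatenated [input, conv output] back to `channels`.
            self.conv_cat = nn.Conv2d(in_channels + channels, channels,
                                      kernel_size, padding=kernel_size // 2)
        # Assumption: BatchNorm2d replaces SyncBatchNorm for single-process use.
        self.norm = nn.BatchNorm2d(channels) if with_norm else nn.Identity()
        # Per-pixel classifier (1x1 convolution).
        self.cls_seg = nn.Conv2d(channels, num_classes, kernel_size=1)

    def forward(self, x):
        # Casts parameters and buffers to FP32; the input tensor is not converted.
        self.to(torch.float32)
        feats = self.convs(x)
        if self.concat_input:
            feats = self.conv_cat(torch.cat([x, feats], dim=1))
        return self.cls_seg(self.norm(feats))


# Linear-probing style setup: no convs, no concat, normalization enabled.
head = TinyFCNHead(in_channels=8, channels=8, num_classes=3,
                   num_convs=0, concat_input=False, with_norm=True)
head = head.to(torch.bfloat16)          # simulate a model held in BFloat16
logits = head(torch.randn(2, 8, 4, 4))  # FP32 features in, FP32 logits out
print(logits.shape, logits.dtype)
```

The sketch feeds FP32 features on purpose: because self.to() only converts the module, a BFloat16 input would still have to be cast by the caller (or handled by autocast) before reaching an FP32 convolution.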

Usage

Use this decode head in segmentation configs for InternVL backbone models, especially for linear probing (num_convs=0) and few-shot experiments.
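Why a head without convolutions amounts to linear probing can be checked numerically. The sketch below assumes the classifier is a 1x1 convolution (the names conv and linear and the tensor sizes are illustrative only) and shows that it computes the same result as a linear layer applied to each pixel's feature vector.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

in_channels, num_classes = 8, 3
conv = nn.Conv2d(in_channels, num_classes, kernel_size=1)
linear = nn.Linear(in_channels, num_classes)

# Copy the 1x1 conv weights into the linear layer so both share parameters.
with torch.no_grad():
    linear.weight.copy_(conv.weight.view(num_classes, in_channels))
    linear.bias.copy_(conv.bias)

x = torch.randn(2, in_channels, 4, 4)  # (B, C, H, W) backbone features
out_conv = conv(x)
# Move channels last, classify every pixel, then restore (B, C, H, W) layout.
out_linear = linear(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

same = torch.allclose(out_conv, out_linear, atol=1e-5)
print(same)
```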

Code Reference

Source Location

Signature

@HEADS.register_module(force=True)
class FCNHead(BaseDecodeHead):
    def __init__(self, num_convs=2, kernel_size=3, concat_input=True,
                 dilation=1, with_norm=False, **kwargs): ...
    def _forward_feature(self, inputs): ...
    def forward(self, inputs): ...

Import

from mmseg_custom.models.decode_heads.fcn_head import FCNHead

I/O Contract

Inputs

Name          Type          Required  Description
inputs        list[Tensor]  Yes       Multi-level feature maps from the backbone (argument to forward())
num_convs     int           No        Number of convolution layers (0 for linear probing, default: 2)
kernel_size   int           No        Convolution kernel size (default: 3)
concat_input  bool          No        Whether to concatenate input with conv output (default: True)
dilation      int           No        Dilation rate of the convolution layers (default: 1)
with_norm     bool          No        Whether to use SyncBatchNorm (default: False)

Outputs

Name    Type    Description
output  Tensor  Segmentation logits of shape (B, num_classes, H, W)
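The shape contract can be sketched as follows. The helper select_level is hypothetical; it assumes a single feature level is picked from the list by index, in the way MMSeg's BaseDecodeHead uses its in_index argument, and the tensor sizes are illustrative only.

```python
import torch
import torch.nn as nn


def select_level(inputs, in_index=-1):
    """Hypothetical helper: pick one feature map from a multi-level list."""
    return inputs[in_index]


# Two feature levels with the same channel count but different resolutions.
feats = [torch.randn(2, 8, 16, 16), torch.randn(2, 8, 8, 8)]
x = select_level(feats)  # last level: (B=2, C=8, H=8, W=8)

num_classes = 3
cls_seg = nn.Conv2d(8, num_classes, kernel_size=1)
logits = cls_seg(x)      # (B, num_classes, H, W) at the feature resolution
print(tuple(logits.shape))
```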

Usage Examples

Basic Usage

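# In MMSegmentation config for linear probing:
decode_head=dict(
    type='FCNHead',
    in_channels=3200,    # must equal channels when num_convs=0
    channels=3200,
    num_convs=0,         # Linear probing mode: convs become nn.Identity
    # concat_input defaults to True, so conv_cat is still applied;
    # pass concat_input=False if that extra layer is not wanted.
    num_classes=150,
    with_norm=True,      # adds nn.SyncBatchNorm before classification
)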
