Principle:OpenGVLab InternVL Segmentation Dataset Configuration
| Knowledge Sources | |
|---|---|
| Domains | Segmentation, Dataset, Evaluation |
| Last Updated | 2026-02-07 14:00 GMT |
Overview
The segmentation dataset configuration principle defines how semantic segmentation datasets are structured, registered, and extended with features like few-shot subset sampling and custom result formatting.
Description
This principle governs how segmentation datasets are configured within the MMSegmentation framework for InternVL experiments:
- Class and palette definitions: Each dataset defines its complete set of semantic class names and corresponding RGB color palettes for consistent visualization and evaluation.
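- Label convention handling: Datasets handle the mapping between raw annotation indices and model output indices. For ADE20K, raw annotations use index 0 for background/unlabeled pixels and 1-150 for the 150 categories. Setting reduce_zero_label=True maps index 0 to the ignore index (255) and shifts the remaining labels down by one, so the model predicts indices 0-149. Predictions are shifted back (+1) when output files are written.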
- Few-shot sampling: A max_image_num parameter enables random selection of a subset from the full dataset, supporting few-shot segmentation experiments in which models are trained on a limited fraction of the data.
- Result formatting: Custom methods convert model predictions to the format each benchmark expects (PNG images for ADE20K, the Cityscapes format for Cityscapes), ensuring compatibility with benchmark evaluation servers.
- Registry force-registration: Custom dataset classes pass force=True when registering with the dataset registry. This replaces any default MMSeg implementation already registered under the same name, so the InternVL-specific extensions are always used.
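The label convention and few-shot sampling described above can be illustrated with a minimal NumPy sketch. It is not the InternVL implementation: the helper names `reduce_zero_label`, `to_submission_indices`, and `sample_subset` are hypothetical, as is the `seed` argument, and the ignore index of 255 is assumed from the usual MMSeg convention.

```python
import random

import numpy as np

IGNORE_INDEX = 255  # assumed ignore label, following the usual MMSeg convention


def reduce_zero_label(label: np.ndarray) -> np.ndarray:
    """Map raw label 0 to the ignore index and shift classes 1..150 to 0..149."""
    out = label.astype(np.int64)  # astype returns a copy, so the input is untouched
    out[out == 0] = IGNORE_INDEX
    out = out - 1
    out[out == IGNORE_INDEX - 1] = IGNORE_INDEX  # restore the ignore index after the shift
    return out.astype(np.uint8)


def to_submission_indices(pred: np.ndarray) -> np.ndarray:
    """Shift model outputs 0..149 back to the raw 1..150 range before writing PNGs."""
    return (pred.astype(np.int64) + 1).astype(np.uint8)


def sample_subset(items, max_image_num=None, seed=0):
    """Return a random subset of at most max_image_num items (all items if None).

    The seed argument is an assumption added here for reproducibility.
    """
    items = list(items)
    if max_image_num is None or max_image_num >= len(items):
        return items
    return random.Random(seed).sample(items, max_image_num)


raw = np.array([[0, 1], [150, 2]], dtype=np.uint8)
reduced = reduce_zero_label(raw)          # background becomes 255, classes shift down
restored = to_submission_indices(np.array([0, 149]))
subset = sample_subset([f"img_{i}" for i in range(100)], max_image_num=10)
```

The shift in `to_submission_indices` is the inverse of the class shift in `reduce_zero_label`, which is why predictions written to disk line up with the raw annotation indices again.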
Usage
Apply this principle when defining or extending segmentation datasets for evaluation with InternVL backbones, particularly for few-shot or benchmark experiments.
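To show what force-registration buys when extending a dataset, here is a self-contained sketch that uses a toy `Registry` class as a stand-in for the MMSeg registry. The toy class, the constructor arguments, and the `data/ade` path are assumptions made for illustration. Only the `force=True` behaviour and the `max_image_num` parameter come from the description above.

```python
class Registry:
    """Toy stand-in for an MMSeg-style registry (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, force=False):
        def decorator(cls):
            key = cls.__name__
            # Without force, a duplicate name is an error; with force, it is replaced.
            if key in self._modules and not force:
                raise KeyError(f"{key} is already registered in {self.name}")
            self._modules[key] = cls
            return cls
        return decorator

    def get(self, key):
        return self._modules[key]

    def build(self, cfg):
        cfg = dict(cfg)  # copy so the caller's config is not mutated
        cls = self.get(cfg.pop("type"))
        return cls(**cfg)


DATASETS = Registry("dataset")


@DATASETS.register_module()
class ADE20KDataset:
    """Plays the role of the stock implementation."""

    def __init__(self, data_root):
        self.data_root = data_root


StockADE20KDataset = ADE20KDataset  # keep a handle on the stock class for comparison


@DATASETS.register_module(force=True)
class ADE20KDataset:  # same name, so this entry replaces the stock one
    """Extended variant that accepts a few-shot image cap."""

    def __init__(self, data_root, max_image_num=None):
        self.data_root = data_root
        self.max_image_num = max_image_num


# Config-style construction: the "type" key now resolves to the extended class.
dataset = DATASETS.build(
    dict(type="ADE20KDataset", data_root="data/ade", max_image_num=100)
)
```

Because the extended class reuses the stock name, existing configs that reference `ADE20KDataset` pick up the extension without modification.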
Theoretical Basis
Few-shot semantic segmentation is an active research area that evaluates how well pretrained features transfer to dense prediction tasks when supervision is limited. By capping the number of training images through the dataset configuration, this principle enables systematic evaluation of the same model across different data fractions.