Implementation:Zai org CogVideo ActNorm

Knowledge Sources	Zai_org_CogVideo
Domains	Video_Generation, Normalization
Last Updated	2026-02-10 00:00 GMT

Overview

Implements ActNorm (Activation Normalization), a data-dependent normalization layer that initializes its affine parameters from the first mini-batch statistics and supports both forward and reverse transformation modes.

Description

This module provides the ActNorm class along with checkpoint download and verification utilities:

ActNorm -- A normalization layer that performs a per-channel affine transformation y = scale * (x + loc). Unlike BatchNorm, ActNorm uses data-dependent initialization: on the first forward pass, the location (loc) and scale parameters are set so that the output has zero mean and unit variance per channel. After initialization, the parameters are treated as regular learnable parameters. Key features include:
- Data-dependent initialization: Computes channel-wise mean and standard deviation from the first mini-batch and sets loc = -mean, scale = 1/std.
- Reverse mode: Supports invertible computation via reverse=True, computing x = y / scale - loc.
- Log-determinant: Optionally computes log|det(dy/dx)| = H * W * sum(log|scale|) for use in normalizing flow models.
- 2D input support: Handles both 4D [B, C, H, W] and 2D [B, C] inputs (the latter are temporarily reshaped to 4D).
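
The arithmetic behind the features listed above can be sketched without PyTorch. The following is a minimal illustration for a 2D [B, C] input stored as nested lists, not the module's actual code: the helper names (init_actnorm, actnorm_forward, actnorm_reverse, actnorm_logdet) are hypothetical, the sample standard deviation is assumed, and any numerical safeguards the real layer may apply are omitted.

```python
import math
import statistics


def init_actnorm(batch):
    """Derive per-channel (loc, scale) from a [B, C] batch given as a list of rows."""
    channels = list(zip(*batch))  # transpose rows into per-channel columns
    loc = [-statistics.mean(col) for col in channels]          # loc = -mean
    scale = [1.0 / statistics.stdev(col) for col in channels]  # scale = 1/std
    return loc, scale


def actnorm_forward(batch, loc, scale):
    """Apply y = scale * (x + loc) per channel."""
    return [[s * (x + l) for x, l, s in zip(row, loc, scale)] for row in batch]


def actnorm_reverse(batch, loc, scale):
    """Apply the inverse x = y / scale - loc per channel."""
    return [[y / s - l for y, l, s in zip(row, loc, scale)] for row in batch]


def actnorm_logdet(scale, height=1, width=1):
    """Compute H * W * sum(log|scale|); H = W = 1 for a [B, C] input."""
    return height * width * sum(math.log(abs(s)) for s in scale)


# Hypothetical toy batch: B = 4 samples, C = 2 channels
batch = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0], [6.0, 40.0]]
loc, scale = init_actnorm(batch)
out = actnorm_forward(batch, loc, scale)       # zero mean, unit std per channel
restored = actnorm_reverse(out, loc, scale)    # recovers the original batch
```

After initialization on this batch, each channel of out has zero mean and unit standard deviation, and the reverse pass recovers the input up to floating-point error.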

get_ckpt_path() -- Resolves a model checkpoint name (e.g., "vgg_lpips") to a local file path, downloading from a remote URL with MD5 verification if the file is missing.
download() -- Streams a file from a URL with a progress bar.
md5_hash() -- Computes the MD5 hash of a file for integrity verification.
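
The check that decides whether a checkpoint must be fetched can be sketched as follows. This is an illustration under stated assumptions, not the module's code: file_md5 and needs_download are hypothetical helpers, the chunked read is a design choice made here to keep memory use flat, and the actual download step is left out.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path, expected_md5, check=False):
    """A file must be fetched if it is missing, or if check=True and its hash differs."""
    if not os.path.exists(path):
        return True
    return check and file_md5(path) != expected_md5


# Demonstration with a throwaway file standing in for a checkpoint
with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "demo.pth")  # hypothetical file name
    missing = needs_download(ckpt, expected_md5="unused")
    with open(ckpt, "wb") as f:
        f.write(b"hello")
    good_hash = hashlib.md5(b"hello").hexdigest()
    intact = needs_download(ckpt, good_hash, check=True)
    corrupted = needs_download(ckpt, "0" * 32, check=True)
```

With check=False an existing file is accepted as-is, so the hash comparison only costs time when verification is explicitly requested.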

Usage

ActNorm is used as an alternative to nn.BatchNorm2d in the NLayerDiscriminator when use_actnorm=True. It is also useful in normalizing flow architectures, where an invertible normalization with a known log-determinant is required. The checkpoint utilities support the LPIPS module by ensuring that pretrained VGG weights are available locally.

Code Reference

Source Location

Repository: Zai_org_CogVideo
File: sat/sgm/modules/autoencoding/lpips/util.py

Signature

class ActNorm(nn.Module):
    def __init__(
        self,
        num_features,
        logdet=False,
        affine=True,
        allow_reverse_init=False,
    )

    def initialize(self, input)
    def forward(self, input, reverse=False) -> Union[torch.Tensor, tuple[torch.Tensor, torch.Tensor]]
    def reverse(self, output) -> torch.Tensor

def get_ckpt_path(name, root, check=False) -> str
def download(url, local_path, chunk_size=1024)
def md5_hash(path) -> str

Import

from sat.sgm.modules.autoencoding.lpips.util import ActNorm, get_ckpt_path

I/O Contract

Inputs (ActNorm.forward)

Name	Type	Required	Description
input	`torch.Tensor`	Yes	Input tensor of shape `[B, C, H, W]` or `[B, C]`
reverse	`bool`	No	If True, apply the inverse transformation. Default: False

Constructor Parameters

Name	Type	Required	Description
num_features	`int`	Yes	Number of channels (features) for the normalization
logdet	`bool`	No	Whether to return the log-determinant of the Jacobian. Default: False
affine	`bool`	No	Must be True (asserted). Default: True
allow_reverse_init	`bool`	No	Allow data-dependent initialization to run on a reverse pass if the layer has not been initialized yet. Default: False

Outputs (ActNorm.forward)

Name	Type	Description
h	`torch.Tensor`	Normalized output, same shape as input
logdet	`torch.Tensor`	(Only if `logdet=True`) Log-determinant of shape `[B]`
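
The logdet value in the table above follows from the per-channel affine map y = scale * (x + loc). The short derivation below uses notation introduced here for illustration: s_c for the scale of channel c and b_c for its loc. Because the Jacobian does not depend on the input, every element of the batch gets the same value; for a [B, C] input, H = W = 1.

```latex
% Per-element form of the forward map for one sample x in R^{C x H x W}
\[
y_{c,i,j} = s_c \,\bigl(x_{c,i,j} + b_c\bigr)
\]
% The Jacobian is diagonal: each output depends only on its own input
\[
\frac{\partial y_{c,i,j}}{\partial x_{c',i',j'}}
  = s_c \,\delta_{cc'}\,\delta_{ii'}\,\delta_{jj'}
\]
% The determinant is the product of the diagonal entries; s_c appears H*W times
\[
\log\left|\det\frac{\partial y}{\partial x}\right|
  = \sum_{c=1}^{C}\sum_{i=1}^{H}\sum_{j=1}^{W} \log|s_c|
  = H \, W \sum_{c=1}^{C} \log|s_c|
\]
```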

Usage Examples

import torch

from sat.sgm.modules.autoencoding.lpips.util import ActNorm

# Example input: a batch of 8 feature maps with 64 channels
feature_map = torch.randn(8, 64, 32, 32)

# Create an ActNorm layer for 64-channel features
norm = ActNorm(num_features=64, logdet=False)

# First forward pass (in training mode) triggers data-dependent initialization
output = norm(feature_map)  # feature_map: [B, 64, H, W]

# With log-determinant for flow models
norm_flow = ActNorm(num_features=64, logdet=True)
output, log_det = norm_flow(feature_map)  # log_det: [B]

# Reverse transformation (invertible); norm_flow was initialized by the call above
reconstructed = norm_flow(output, reverse=True)

Related Pages

Principle:Zai_org_CogVideo_Activation_Normalization
