Implementation:Zai_org_CogVideo_Args_Parse_Args
| Implementation Metadata | |
|---|---|
| Name | Args_Parse_Args |
| Type | API Doc |
| Category | Configuration_Management |
| Domains | Fine_Tuning, Diffusion_Models |
| Knowledge Sources | CogVideo Repository |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Args_Parse_Args is a concrete tool for parsing and validating CogVideoX fine-tuning configuration, provided by the CogVideo finetune package.
Description
This implementation provides the Args Pydantic model class and its parse_args classmethod for constructing a fully validated training configuration from command-line arguments. The Args class defines all fields needed for CogVideoX fine-tuning, including model paths, LoRA settings, training hyperparameters, and validation options. Pydantic validators enforce constraints such as frame count divisibility, resolution compatibility, and precision requirements.
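The validators mentioned above can be pictured as Pydantic field validators. The sketch below is illustrative only: `ResolutionCheck` is a hypothetical stand-in for the real Args class, and the assumed constraint `(frames - 1) % 8 == 0` is a placeholder for whatever divisibility rule `args.py` actually enforces.

```python
# Illustrative sketch of a Pydantic-style constraint check; ResolutionCheck
# and the (frames - 1) % 8 rule are assumptions, not the real Args validator.
from typing import Tuple

from pydantic import BaseModel, field_validator

class ResolutionCheck(BaseModel):
    train_resolution: Tuple[int, int, int]  # (frames, height, width)

    @field_validator("train_resolution")
    @classmethod
    def check_frames(cls, v: Tuple[int, int, int]) -> Tuple[int, int, int]:
        frames, height, width = v
        if (frames - 1) % 8 != 0:
            raise ValueError(
                f"invalid frame count {frames}: (frames - 1) must be divisible by 8"
            )
        return v

print(ResolutionCheck(train_resolution=(49, 480, 720)))  # 49 frames: valid
```

Because validation runs at construction time, a bad `--train_resolution` is rejected before any model weights are loaded.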
Usage
Use at the entry point of any CogVideoX fine-tuning script. Call Args.parse_args() to parse command-line arguments and receive a validated configuration object. This is typically the first operation in the training pipeline.
Code Reference
Source Location
- finetune/schemas/args.py:L10-177 -- Args Pydantic model definition
- finetune/schemas/args.py:L178-254 -- parse_args classmethod
Signature
```python
class Args(BaseModel):
    model_path: Path
    model_name: str
    model_type: Literal["i2v", "t2v"]
    training_type: Literal["lora", "sft"] = "lora"
    output_dir: Path
    data_root: Path
    caption_column: Path
    video_column: Path
    train_epochs: int
    batch_size: int
    train_resolution: Tuple[int, int, int]  # (frames, height, width)
    mixed_precision: Literal["no", "fp16", "bf16"]
    rank: int = 128
    lora_alpha: int = 64
    target_modules: List[str] = ["to_q", "to_k", "to_v", "to_out.0"]
    learning_rate: float = 1e-4
    gradient_accumulation_steps: int = 1
    max_grad_norm: float = 1.0
    checkpointing_steps: int = 200
    checkpointing_limit: int = 10
    do_validation: bool = True
    validation_steps: int = 200
    seed: int = 42
    nccl_timeout: int = 1800

    @classmethod
    def parse_args(cls) -> "Args":
        """Parse command-line arguments and return a validated Args instance."""
        ...
```
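A condensed sketch of how such a classmethod can bridge argparse and Pydantic. `MiniArgs` and its two flags are illustrative stand-ins; the real parse_args in args.py registers the full field set shown in the signature.

```python
# Minimal illustration of the argparse-to-Pydantic pattern; MiniArgs is a
# hypothetical stand-in for the full Args schema.
import argparse
from pathlib import Path

from pydantic import BaseModel

class MiniArgs(BaseModel):
    model_path: Path
    rank: int = 128

    @classmethod
    def parse_args(cls, argv=None) -> "MiniArgs":
        parser = argparse.ArgumentParser()
        parser.add_argument("--model_path", type=Path, required=True)
        parser.add_argument("--rank", type=int, default=128)
        # Pydantic re-validates the parsed namespace, so type errors
        # surface as a ValidationError here rather than deep in training.
        return cls(**vars(parser.parse_args(argv)))

args = MiniArgs.parse_args(["--model_path", "/models/cogvideox-5b"])
print(args.rank)  # 128 (default applied)
```

The design benefit of this pattern is a single source of truth: argparse handles CLI ergonomics while the Pydantic schema owns types, defaults, and constraints.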
Import
```python
from finetune.schemas import Args
```
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_path | Path | required | Path to pretrained CogVideoX model checkpoint. |
| model_name | str | required | Model variant name (e.g., "cogvideox-5b"). |
| model_type | Literal["i2v", "t2v"] | required | Whether the model is image-to-video (i2v) or text-to-video (t2v). |
| training_type | Literal["lora", "sft"] | "lora" | Training mode: LoRA adapter or full supervised fine-tuning. |
| train_resolution | Tuple[int, int, int] | required | Target resolution as (frames, height, width). |
| rank | int | 128 | LoRA rank (dimension of the low-rank matrices). |
| lora_alpha | int | 64 | LoRA scaling factor. |
| target_modules | List[str] | ["to_q", "to_k", "to_v", "to_out.0"] | Transformer modules to apply LoRA adapters to. |
| learning_rate | float | 1e-4 | Optimizer learning rate. |
| mixed_precision | Literal["no", "fp16", "bf16"] | required | Mixed precision training mode. |
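Under the standard alpha-over-rank convention used by PEFT-style LoRA implementations, the adapter update is scaled by lora_alpha / rank, so the defaults above (alpha 64, rank 128) give an effective scale of 0.5; raising rank without also raising lora_alpha weakens each adapter's contribution. A quick check:

```python
# Effective LoRA scaling under the documented defaults (standard
# alpha-over-rank convention; assumed to apply to this trainer).
rank, lora_alpha = 128, 64
scale = lora_alpha / rank
print(scale)  # 0.5
```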
External Dependencies
- pydantic -- Schema validation and type enforcement
- argparse -- Command-line argument parsing
I/O Contract
Inputs
| Input | Format | Description |
|---|---|---|
| Command-line arguments | Shell arguments or script variables | Training parameters passed via CLI (e.g., --model_path, --train_resolution 49 480 720). |
Outputs
| Output | Format | Description |
|---|---|---|
| Configuration object | Args (Pydantic model instance) | Fully validated training configuration with all fields populated and constraints checked. |
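Because the output contract is enforced by Pydantic, an invalid value fails fast with a ValidationError before training starts. A sketch of that behavior, where `MiniConfig` is a hypothetical stand-in for the real Args class:

```python
# Demonstrates fail-fast validation; MiniConfig is an illustrative
# stand-in, not the real Args class.
from typing import Literal

from pydantic import BaseModel, ValidationError

class MiniConfig(BaseModel):
    mixed_precision: Literal["no", "fp16", "bf16"]
    rank: int = 128

try:
    MiniConfig(mixed_precision="fp32")  # not one of the allowed literals
except ValidationError as exc:
    print(f"rejected with {exc.error_count()} validation error(s)")
```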
Usage Examples
Basic Argument Parsing
```python
from finetune.schemas import Args

# Parse and validate from the command line
args = Args.parse_args()

# Access validated fields
print(f"Model: {args.model_name}")
print(f"Resolution: {args.train_resolution}")
print(f"LoRA rank: {args.rank}")
print(f"Training type: {args.training_type}")
```
Shell Script Invocation
```shell
python train.py \
    --model_path /models/cogvideox-5b \
    --model_name cogvideox-5b \
    --model_type t2v \
    --training_type lora \
    --output_dir /output/lora_run \
    --data_root /data/my_videos \
    --caption_column prompts.txt \
    --video_column videos.txt \
    --train_epochs 100 \
    --batch_size 1 \
    --train_resolution 49 480 720 \
    --mixed_precision bf16 \
    --rank 128 \
    --lora_alpha 64
```
Related Pages
- Principle:Zai_org_CogVideo_Training_Configuration
- Environment:Zai_org_CogVideo_Diffusers_Finetuning_Environment
- Heuristic:Zai_org_CogVideo_BF16_FP16_Precision_Selection
- Heuristic:Zai_org_CogVideo_Frame_Count_and_Resolution_Constraints
- Heuristic:Zai_org_CogVideo_LoRA_Configuration_Tips
- Heuristic:Zai_org_CogVideo_Training_Hyperparameter_Defaults
- Implementation:Zai_org_CogVideo_T2V_I2V_Dataset_Loader
- Implementation:Zai_org_CogVideo_CogVideoX_LoRA_Trainer_Load_Components