

Implementation:Ggml org Llama cpp Convert PT To HF

From Leeroopedia
Knowledge Sources
Domains: Text_To_Speech, Model_Conversion
Last Updated: 2026-02-15 00:00 GMT

Overview

Converts WavTokenizer PyTorch checkpoint files to HuggingFace-compatible format (safetensors + config.json) for subsequent GGUF conversion.

Description

This Python script loads a PyTorch checkpoint (.ckpt or .pt) and flattens its state dictionary. It renames tensor keys to match the expected HuggingFace naming conventions by removing the `state_dict.` prefix and rewriting specific key patterns (e.g., posnet renaming, backbone layer remapping). It keeps only the inference-relevant weights (feature_extractor.encodec.quantizer, backbone, head.out) and saves them to `model.safetensors` using the safetensors library. Finally, it generates an `index.json` file mapping tensor names to their shapes and dtypes, and writes a `config.json` with model hyperparameters extracted from the checkpoint.
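
The filter-and-rename step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script's actual code: the names `KEEP_PREFIXES`, `rename_key`, and `filter_and_rename` are hypothetical, and the single `pos_net` to `posnet` substitution stands in for the fuller set of patterns the script handles.

```python
# Hypothetical sketch of the filter-and-rename step; names are illustrative only.

# Prefixes of the inference-relevant weights, as they appear before renaming.
KEEP_PREFIXES = (
    "state_dict.feature_extractor.encodec.quantizer.",
    "state_dict.backbone.",
    "state_dict.head.out",
)


def rename_key(key):
    """Return the HuggingFace-style name for `key`, or None if the key is dropped."""
    if not key.startswith(KEEP_PREFIXES):
        return None  # not needed for inference
    new_key = key[len("state_dict."):]  # strip the checkpoint prefix
    # Assumed example of a pattern rewrite; the real script applies more rules.
    new_key = new_key.replace("pos_net", "posnet")
    return new_key


def filter_and_rename(flat_state_dict):
    """Apply rename_key to every entry and keep only the entries that survive."""
    renamed = {}
    for key, value in flat_state_dict.items():
        new_key = rename_key(key)
        if new_key is not None:
            renamed[new_key] = value
    return renamed
```

Values are passed through untouched here, so the same helper works for tensors or for placeholder objects in a quick test.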

Usage

Use this script as a prerequisite conversion step in the TTS pipeline to transform WavTokenizer audio decoder weights from PyTorch format into a form that `convert_hf_to_gguf.py` can then process into GGUF format.

Code Reference

Source Location

File: tools/tts/convert_pt_to_hf.py

Signature

def flatten_state_dict(state_dict, parent_key='', sep='.'):
    """Flatten nested state dict and rename keys for HuggingFace compatibility."""
    ...
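
The body is elided above and is not reproduced here. As a rough illustration of the flattening part only, a generic recursive version might look like the sketch below. The function name `flatten` is hypothetical, and keeping every non-dict leaf is an assumption; the real function also renames and filters keys.

```python
# Hypothetical sketch: generic recursive flattening of a nested dictionary.
def flatten(nested, parent_key="", sep="."):
    """Collapse nested dicts into a single level, joining key parts with `sep`."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the accumulated key path.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat
```

With this shape, a checkpoint entry nested as `{"state_dict": {"backbone": {...}}}` ends up under keys that start with `state_dict.backbone.`, which is why the prefix has to be stripped afterwards.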

Import

import torch
import json
import os
import sys
import re
from safetensors.torch import save_file

I/O Contract

Inputs

Name | Type | Required | Description
model_path | string (CLI argument) | No | Path to the PyTorch checkpoint file (.ckpt or .pt); defaults to './model.pt' when omitted
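
Resolving the input path from the command line with a fallback default can be sketched like this. The helper `resolve_model_path` and the constant `DEFAULT_MODEL_PATH` are hypothetical names used for illustration; only the default value './model.pt' comes from the table above.

```python
import sys

# Default taken from the I/O contract; the constant name is illustrative.
DEFAULT_MODEL_PATH = "./model.pt"


def resolve_model_path(argv):
    """Return the first positional argument if present, else the default path."""
    return argv[1] if len(argv) > 1 else DEFAULT_MODEL_PATH


# Typical use inside a script: model_path = resolve_model_path(sys.argv)
```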

Outputs

Name | Type | Description
model.safetensors | file | Flattened and renamed model tensors in safetensors format
index.json | file | Mapping of tensor names to shapes and dtypes
config.json | file | Model hyperparameters (architecture config) extracted from checkpoint
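
Before handing the result to the GGUF converter, it can help to confirm that all three files listed above were written. The check below is a small sketch; `EXPECTED_OUTPUTS` and `missing_outputs` are hypothetical names, and the output directory is whatever location the conversion wrote to.

```python
import os

# File names taken from the Outputs table; the constant name is illustrative.
EXPECTED_OUTPUTS = ("model.safetensors", "index.json", "config.json")


def missing_outputs(out_dir):
    """Return the expected output files that are not present in `out_dir`."""
    return [
        name
        for name in EXPECTED_OUTPUTS
        if not os.path.isfile(os.path.join(out_dir, name))
    ]
```

An empty list means the directory holds every file the table names.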

Usage Examples

# Convert a WavTokenizer checkpoint to HuggingFace format
python tools/tts/convert_pt_to_hf.py ./wavtokenizer-large-speech-75token.ckpt

# Then convert from HuggingFace to GGUF
python convert_hf_to_gguf.py ./wavtokenizer-large-speech-75token/
