Implementation:Ggml org Llama cpp Convert PT To HF

Knowledge Sources	Ggml_org_Llama_cpp
Domains	Text_To_Speech, Model_Conversion
Last Updated	2026-02-15 00:00 GMT

Overview

Converts WavTokenizer PyTorch checkpoint files to HuggingFace-compatible format (safetensors + config.json) for subsequent GGUF conversion.

Description

This Python script loads a PyTorch checkpoint (`.ckpt` or `.pt`), flattens its state dictionary, and renames tensor keys to match the expected HuggingFace naming conventions. Renaming strips the `state_dict.` prefix and applies pattern-specific rules (for example, posnet renaming and backbone layer remapping). Only inference-relevant weights are kept: keys under `feature_extractor.encodec.quantizer`, `backbone`, and `head.out`. The script then saves the retained tensors to `model.safetensors` using the safetensors library, generates an `index.json` file mapping tensor names to their shapes and dtypes, and writes a `config.json` with model hyperparameters extracted from the checkpoint.
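The filter-and-rename step described above can be sketched on key strings alone. This is a minimal illustration, not the script's actual code: the helper name `rename_key` is hypothetical, the assumption that filtering happens before the `state_dict.` prefix is stripped is ours, and only the posnet rename is shown (assumed here to be `pos_net` becoming `posnet`); the backbone layer remapping is omitted.

```python
# Prefixes of tensors kept for inference, taken from the description above.
# Assumption: keys still carry the "state_dict." prefix when they are filtered.
KEEP_PREFIXES = (
    "state_dict.feature_extractor.encodec.quantizer.",
    "state_dict.backbone.",
    "state_dict.head.out",
)

PREFIX = "state_dict."


def rename_key(key):
    """Return the renamed key, or None if the tensor should be dropped.

    Hypothetical helper: it covers prefix stripping and an assumed
    posnet rename only, not every rule the real script applies.
    """
    if not key.startswith(KEEP_PREFIXES):
        return None  # not needed for inference (e.g. optimizer state)
    new_key = key[len(PREFIX):]  # strip the "state_dict." prefix
    # Assumed rename rule: "pos_net" becomes "posnet".
    new_key = new_key.replace("pos_net", "posnet")
    return new_key
```

Returning `None` for dropped keys keeps the filter and the rename in one pass, so a caller can skip a tensor without a second lookup.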

Usage

Use this script as a prerequisite conversion step in the TTS pipeline to transform WavTokenizer audio decoder weights from PyTorch format into a form that `convert_hf_to_gguf.py` can then process into GGUF format.

Code Reference

Source Location

Repository: Ggml_org_Llama_cpp
File: tools/tts/convert_pt_to_hf.py
Lines: 1-180

Signature

def flatten_state_dict(state_dict, parent_key='', sep='.'):
    """Flatten nested state dict and rename keys for HuggingFace compatibility."""
    ...
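
The body is elided above. As a rough sketch of the flattening half only, a nested dictionary can be collapsed into dotted keys as follows. The name `flatten_nested` is hypothetical, and unlike the script's function this version keeps every non-dict leaf and does no renaming or filtering.

```python
def flatten_nested(d, parent_key="", sep="."):
    """Collapse a nested dict into a single level with dotted keys.

    Illustrative only: every non-dict leaf is kept as-is, whereas a
    checkpoint converter would typically keep tensors only.
    """
    flat = {}
    for k, v in d.items():
        # Join the parent path and the current key with the separator.
        new_key = f"{parent_key}{sep}{k}" if parent_key else str(k)
        if isinstance(v, dict):
            # Recurse into nested dicts, carrying the path so far.
            flat.update(flatten_nested(v, new_key, sep))
        else:
            flat[new_key] = v
    return flat
```

For example, `{"state_dict": {"backbone": {"w": 1}}}` becomes `{"state_dict.backbone.w": 1}`, which is why the flattened keys carry a `state_dict.` prefix that the renaming step later removes.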

Import

import torch
import json
import os
import sys
import re
from safetensors.torch import save_file

I/O Contract

Inputs

Name	Type	Required	Description
model_path	string (CLI arg)	No	Path to the PyTorch checkpoint file (.ckpt or .pt); defaults to './model.pt' when omitted
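
A minimal sketch of how a positional argument with a default can be resolved, assuming the path is read from the first command-line argument. The helper name `resolve_model_path` and the constant `DEFAULT_MODEL_PATH` are hypothetical; the script itself may read `sys.argv` inline.

```python
import sys

# Default taken from the Inputs table above.
DEFAULT_MODEL_PATH = "./model.pt"


def resolve_model_path(argv):
    """Return the first positional argument, or the default if none is given."""
    return argv[1] if len(argv) > 1 else DEFAULT_MODEL_PATH


# Typical use: pass the interpreter's argument list.
model_path = resolve_model_path(sys.argv)
```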

Outputs

Name	Type	Description
model.safetensors	file	Flattened and renamed model tensors in safetensors format
index.json	file	Mapping of tensor names to shapes and dtypes
config.json	file	Model hyperparameters (architecture config) extracted from checkpoint
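
The table does not give the exact layout of `index.json`. The sketch below shows one plausible way to build a name-to-shape-and-dtype mapping; the field names `shape` and `dtype` and the helpers `build_index` and `write_index` are assumptions for illustration. A `SimpleNamespace` stands in for a tensor so the example runs without PyTorch.

```python
import json
from types import SimpleNamespace


def build_index(tensors):
    """Map each tensor name to its shape and dtype.

    `tensors` is a mapping of name -> any object exposing `.shape` and
    `.dtype` (a torch.Tensor would qualify). The output layout is assumed.
    """
    return {
        name: {"shape": list(t.shape), "dtype": str(t.dtype)}
        for name, t in tensors.items()
    }


def write_index(tensors, path):
    """Write the index as pretty-printed JSON."""
    with open(path, "w") as f:
        json.dump(build_index(tensors), f, indent=4)


# Stand-in for a tensor; a real run would pass the flattened state dict.
fake_tensor = SimpleNamespace(shape=(768, 1), dtype="torch.float32")
example_index = build_index({"backbone.norm.weight": fake_tensor})
```

Converting shapes to lists and dtypes to strings keeps the mapping JSON-serializable regardless of the tensor library in use.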

Usage Examples

# Convert a WavTokenizer checkpoint to HuggingFace format
python tools/tts/convert_pt_to_hf.py ./wavtokenizer-large-speech-75token.ckpt

# Then convert from HuggingFace to GGUF
python convert_hf_to_gguf.py ./wavtokenizer-large-speech-75token/

Related Pages

Principle:Ggml_org_Llama_cpp_Text_To_Speech
