# Implementation: SeldonIO Seldon-core Transformers Pipeline Save Pretrained
| Field | Value |
|---|---|
| Type | API Doc |
| Overview | Concrete tools for downloading and serializing HuggingFace models provided by the transformers library. |
| Source | samples/scripts/models/huggingface-text-gen-gpt2/train.py:L1-19 |
| Domains | NLP, Model_Serialization |
| Implements Principle | SeldonIO_Seldon_core_HuggingFace_Model_Preparation |
| External Dependencies | transformers (GPT2Tokenizer, TFGPT2LMHeadModel, pipeline) |
| Knowledge Sources | Repo (https://github.com/SeldonIO/seldon-core), Doc (https://huggingface.co/docs/transformers) |
| Last Updated | 2026-02-13 00:00 GMT |
## Code Reference
The following script downloads the GPT-2 model and tokenizer from the HuggingFace Hub, wraps them in a text-generation pipeline, and serializes the entire pipeline to disk:
```python
from transformers import (
    GPT2Tokenizer,
    TFGPT2LMHeadModel,
    pipeline,
)


def main() -> None:
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = TFGPT2LMHeadModel.from_pretrained("gpt2")
    p = pipeline(task="text-generation", model=model, tokenizer=tokenizer)
    p.save_pretrained("text-generation-model-artefacts")


if __name__ == "__main__":
    print("Building a custom GPT2 HuggingFace model...")
    main()
```
## Key Parameters

| Parameter | Value | Description |
|---|---|---|
| model identifier | `"gpt2"` | HuggingFace Hub model ID for the GPT-2 base model (124M parameters) |
| task | `"text-generation"` | Pipeline task type; determines how input/output is processed |
| save path | `"text-generation-model-artefacts"` | Local directory where serialized pipeline artifacts are written |
## API Methods

| Method | Class | Purpose |
|---|---|---|
| `from_pretrained("gpt2")` | `GPT2Tokenizer` | Downloads tokenizer vocabulary and merges from the HuggingFace Hub |
| `from_pretrained("gpt2")` | `TFGPT2LMHeadModel` | Downloads TensorFlow model weights from the HuggingFace Hub |
| `pipeline(task, model, tokenizer)` | `transformers` | Creates a high-level inference pipeline combining model and tokenizer |
| `save_pretrained(path)` | `Pipeline` | Serializes the full pipeline (model, tokenizer, config) to a directory |
## I/O Contract

### Inputs
| Input | Source | Description |
|---|---|---|
| HuggingFace model hub | Network (https://huggingface.co) | Downloads GPT-2 tokenizer vocabulary files and TensorFlow model weights |
### Outputs

| Output | Format | Description |
|---|---|---|
| `text-generation-model-artefacts/` | Directory | Contains serialized tokenizer files, model weights (`tf_model.h5`), `config.json`, and `tokenizer_config.json` |
The output directory structure typically includes:

- `config.json` -- model architecture configuration
- `tf_model.h5` -- TensorFlow model weights
- `vocab.json` -- tokenizer vocabulary
- `merges.txt` -- BPE merge rules
- `special_tokens_map.json` -- special token definitions
- `tokenizer_config.json` -- tokenizer and task metadata
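Before uploading the artifacts anywhere, it can be worth sanity-checking that the expected files are all present. A minimal sketch; the `REQUIRED_FILES` set and helper name are illustrative and not part of the Seldon script, and exact directory contents may vary across transformers versions:

```python
from pathlib import Path

# Files a TF GPT-2 text-generation pipeline save typically produces
# (illustrative list; verify against your transformers version).
REQUIRED_FILES = {
    "config.json",
    "tf_model.h5",
    "vocab.json",
    "merges.txt",
    "special_tokens_map.json",
    "tokenizer_config.json",
}


def missing_artifacts(artifact_dir: str) -> set:
    """Return the set of expected files missing from artifact_dir."""
    path = Path(artifact_dir)
    present = {p.name for p in path.iterdir()} if path.is_dir() else set()
    return REQUIRED_FILES - present


# Usage: missing_artifacts("text-generation-model-artefacts")
# returns an empty set when the serialized pipeline looks complete.
```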
## Usage Examples

### Running the training script

```shell
cd samples/scripts/models/huggingface-text-gen-gpt2
python train.py
```
### Uploading artifacts to GCS

After serialization, upload the artifacts to Google Cloud Storage for use with Seldon Core 2:

```shell
gsutil cp -r text-generation-model-artefacts gs://seldon-models/mlserver/huggingface/text-gen
```
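The uploaded artifacts can then be referenced from a Seldon Core 2 `Model` resource via `storageUri`. A minimal sketch; the resource name is illustrative, so check the exact schema against the Seldon Core 2 documentation:

```yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: text-generator   # illustrative name
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/text-gen"
  requirements:
    - huggingface
```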
### Adapting for other model types

To prepare a sentiment analysis model instead of text generation:

```python
from transformers import pipeline

p = pipeline(task="sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
p.save_pretrained("sentiment-model-artefacts")
```
## Related Pages
- SeldonIO_Seldon_core_HuggingFace_Model_Preparation -- principle that this implementation realizes
- SeldonIO_Seldon_core_Seldon_Model_CRD_HuggingFace -- consumed by the Model CRD that references the serialized artifacts
- SeldonIO_Seldon_core_HuggingFace_Model_Resource_Definition -- feeds into the model resource definition that declares the stored artifacts
- Environment:SeldonIO_Seldon_core_Python_ML_Dependencies_Environment