# Implementation: SeldonIO Seldon-core Transformers Pipeline Save Pretrained
| Field | Value |
|---|---|
| Type | API Doc |
| Overview | Concrete tools for downloading and serializing HuggingFace models provided by the transformers library. |
| Source | samples/scripts/models/huggingface-text-gen-gpt2/train.py:L1-19 |
| Domains | NLP, Model_Serialization |
| Implements Principle | SeldonIO_Seldon_core_HuggingFace_Model_Preparation |
| External Dependencies | transformers (GPT2Tokenizer, TFGPT2LMHeadModel, pipeline) |
| Knowledge Sources | Repo (https://github.com/SeldonIO/seldon-core), Doc (https://huggingface.co/docs/transformers) |
| Last Updated | 2026-02-13 00:00 GMT |
## Code Reference
The following script downloads the GPT-2 model and tokenizer from the HuggingFace Hub, wraps them in a text-generation pipeline, and serializes the entire pipeline to disk:
```python
from transformers import (
    GPT2Tokenizer,
    TFGPT2LMHeadModel,
    pipeline,
)


def main() -> None:
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = TFGPT2LMHeadModel.from_pretrained("gpt2")
    p = pipeline(task="text-generation", model=model, tokenizer=tokenizer)
    p.save_pretrained("text-generation-model-artefacts")


if __name__ == "__main__":
    print("Building a custom GPT2 HuggingFace model...")
    main()
```
## Key Parameters

| Parameter | Value | Description |
|---|---|---|
| model identifier | `"gpt2"` | HuggingFace Hub model ID for the GPT-2 base model (124M parameters) |
| task | `"text-generation"` | Pipeline task type; determines how input/output is processed |
| save path | `"text-generation-model-artefacts"` | Local directory where serialized pipeline artifacts are written |
## API Methods

| Method | Class | Purpose |
|---|---|---|
| `from_pretrained("gpt2")` | `GPT2Tokenizer` | Downloads tokenizer vocabulary and merges from the HuggingFace Hub |
| `from_pretrained("gpt2")` | `TFGPT2LMHeadModel` | Downloads TensorFlow model weights from the HuggingFace Hub |
| `pipeline(task, model, tokenizer)` | `transformers` | Creates a high-level inference pipeline combining model and tokenizer |
| `save_pretrained(path)` | `Pipeline` | Serializes the full pipeline (model, tokenizer, config) to a directory |
## I/O Contract

### Inputs
| Input | Source | Description |
|---|---|---|
| HuggingFace model hub | Network (https://huggingface.co) | Downloads GPT-2 tokenizer vocabulary files and TensorFlow model weights |
### Outputs

| Output | Format | Description |
|---|---|---|
| `text-generation-model-artefacts/` | Directory | Contains serialized tokenizer files, model weights (`tf_model.h5`), `config.json`, and `tokenizer_config.json` |
The output directory structure typically includes:

- `config.json` -- model architecture configuration
- `tf_model.h5` -- TensorFlow model weights
- `vocab.json` -- tokenizer vocabulary
- `merges.txt` -- BPE merge rules
- `special_tokens_map.json` -- special token definitions
- `tokenizer_config.json` -- tokenizer and task metadata
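Before uploading the artifacts anywhere, it can be worth sanity-checking that the expected files are all present. A minimal sketch; the `REQUIRED_FILES` set and helper name are illustrative and not part of the Seldon script, and exact directory contents may vary across transformers versions:

```python
from pathlib import Path

# Files a TF GPT-2 text-generation pipeline save typically produces
# (illustrative list; verify against your transformers version).
REQUIRED_FILES = {
    "config.json",
    "tf_model.h5",
    "vocab.json",
    "merges.txt",
    "special_tokens_map.json",
    "tokenizer_config.json",
}


def missing_artifacts(artifact_dir: str) -> set:
    """Return the set of expected files missing from artifact_dir."""
    path = Path(artifact_dir)
    present = {p.name for p in path.iterdir()} if path.is_dir() else set()
    return REQUIRED_FILES - present


# Usage: missing_artifacts("text-generation-model-artefacts")
# returns an empty set when the serialized pipeline looks complete.
```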
## Usage Examples

### Running the training script

```shell
cd samples/scripts/models/huggingface-text-gen-gpt2
python train.py
```
### Uploading artifacts to GCS

After serialization, upload the artifacts to Google Cloud Storage for use with Seldon Core 2:

```shell
gsutil cp -r text-generation-model-artefacts gs://seldon-models/mlserver/huggingface/text-gen
```
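The uploaded artifacts can then be referenced from a Seldon Core 2 `Model` resource via `storageUri`. A minimal sketch; the resource name is illustrative, so check the exact schema against the Seldon Core 2 documentation:

```yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: text-generator   # illustrative name
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/text-gen"
  requirements:
    - huggingface
```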
### Adapting for other model types

To prepare a sentiment analysis model instead of text generation:

```python
from transformers import pipeline

p = pipeline(task="sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
p.save_pretrained("sentiment-model-artefacts")
```
## Related Pages
- SeldonIO_Seldon_core_HuggingFace_Model_Preparation -- principle that this implementation realizes
- SeldonIO_Seldon_core_Seldon_Model_CRD_HuggingFace -- consumed by the Model CRD that references the serialized artifacts
- SeldonIO_Seldon_core_HuggingFace_Model_Resource_Definition -- feeds into the model resource definition that declares the stored artifacts
- Environment:SeldonIO_Seldon_core_Python_ML_Dependencies_Environment