Implementation:PacktPublishing LLM Engineers Handbook Run Finetuning On Sagemaker
| Field | Value |
|---|---|
| Implementation Name | Run Finetuning On Sagemaker |
| Type | API Doc |
| Source File | llm_engineering/model/finetuning/sagemaker.py:L17-69 |
| Workflow | LLM_Finetuning |
| Repo | PacktPublishing/LLM-Engineers-Handbook |
| Implements | Principle:PacktPublishing_LLM_Engineers_Handbook_SageMaker_Training_Orchestration |
Function Signature
def run_finetuning_on_sagemaker(
    finetuning_type: str,
    num_train_epochs: int,
    per_device_train_batch_size: int,
    learning_rate: float,
    dataset_huggingface_workspace: str,
    is_dummy: bool,
) -> None
Import
from llm_engineering.model.finetuning.sagemaker import run_finetuning_on_sagemaker
Description
This function orchestrates the submission of an LLM fine-tuning job to AWS SageMaker. It constructs a HuggingFace Estimator with all necessary configuration -- instance type, hyperparameters, dependencies, and entry point -- then calls .fit() to launch the managed training job.
The function does not perform any training itself; it delegates execution to SageMaker, which provisions a GPU instance, sets up the container, and runs the finetune.py entry point script.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| finetuning_type | str | "sft" | Type of fine-tuning to perform. Either "sft" (Supervised Fine-Tuning) or "dpo" (Direct Preference Optimization). |
| num_train_epochs | int | 3 | Number of training epochs. |
| per_device_train_batch_size | int | 2 | Batch size per GPU device. |
| learning_rate | float | 3e-4 | Learning rate for the optimizer. |
| dataset_huggingface_workspace | str | (none) | HuggingFace workspace containing the training dataset. |
| is_dummy | bool | False | If True, runs a minimal training job for testing purposes. |
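The values in the table can be checked locally before a job is submitted, since a mistyped argument would otherwise surface only after SageMaker has provisioned an instance. The following is a minimal sketch under stated assumptions: `validate_finetuning_args` is a hypothetical helper that is not part of the repository, and the numeric checks (at least one epoch, a positive batch size and learning rate) are assumptions about reasonable input rather than rules the source enforces.

```python
# Hypothetical pre-submission check; not part of llm_engineering.
ALLOWED_FINETUNING_TYPES = ("sft", "dpo")


def validate_finetuning_args(
    finetuning_type: str,
    num_train_epochs: int,
    per_device_train_batch_size: int,
    learning_rate: float,
) -> None:
    """Raise ValueError if an argument falls outside the documented or assumed range."""
    if finetuning_type not in ALLOWED_FINETUNING_TYPES:
        raise ValueError(
            f"finetuning_type must be one of {ALLOWED_FINETUNING_TYPES}, "
            f"got {finetuning_type!r}"
        )
    # The checks below are assumptions, not constraints stated by the source.
    if num_train_epochs < 1:
        raise ValueError("num_train_epochs must be at least 1")
    if per_device_train_batch_size < 1:
        raise ValueError("per_device_train_batch_size must be at least 1")
    if learning_rate <= 0:
        raise ValueError("learning_rate must be positive")


# The documented defaults pass the check without raising.
validate_finetuning_args("sft", 3, 2, 3e-4)
```

Calling a check like this immediately before `run_finetuning_on_sagemaker` keeps the failure on the local machine, where it is quick to diagnose.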
Returns
None. The function submits the job and, because .fit() waits for the job by default, blocks until training completes. SageMaker saves the model artifacts to S3.
Key Implementation Details
SageMaker Estimator Configuration
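from pathlib import Path

from sagemaker.huggingface import HuggingFace

# Assumed import path for the project's settings object.
from llm_engineering.settings import settings

huggingface_estimator = HuggingFace(
    # Script executed inside the training container.
    entry_point="finetune.py",
    # Directory containing this module; packaged together with the entry point.
    source_dir=str(Path(__file__).resolve().parent),
    instance_type="ml.g5.2xlarge",
    instance_count=1,
    transformers_version="4.36",
    pytorch_version="2.1",
    py_version="py310",
    # Forwarded to finetune.py as command-line arguments.
    hyperparameters={
        "finetuning_type": finetuning_type,
        "num_train_epochs": num_train_epochs,
        "per_device_train_batch_size": per_device_train_batch_size,
        "learning_rate": learning_rate,
        "dataset_huggingface_workspace": dataset_huggingface_workspace,
        "is_dummy": is_dummy,
    },
    role=settings.AWS_ARN_ROLE,
    # Credentials exposed to the container as environment variables.
    environment={
        "HUGGING_FACE_HUB_TOKEN": settings.HUGGINGFACE_ACCESS_TOKEN,
        "COMET_API_KEY": settings.COMET_API_KEY,
        "COMET_PROJECT": settings.COMET_PROJECT,
        "COMET_WORKSPACE": settings.COMET_WORKSPACE,
    },
)

# Launches the managed training job.
huggingface_estimator.fit()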
Key Aspects
- Instance type: ml.g5.2xlarge provides a single NVIDIA A10G GPU with 24 GB of VRAM.
- Entry point: finetune.py is the script that runs inside the SageMaker container.
- Source directory: The entire finetuning/ directory is packaged, uploaded to S3, and made available inside the container.
- Environment variables: The Hugging Face token and the Comet ML settings are passed to the container as environment variables rather than being hard-coded in the training script.
- Hyperparameters: Passed as a dictionary and injected as command-line arguments to the entry point.
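To illustrate the hyperparameter hand-off: each entry in the dictionary reaches the entry point as a `--name value` pair, with the value serialized to a string. The sketch below imitates that locally. `to_cli_args`, `str_to_bool`, and `parse_args` are hypothetical names, and the parser is an assumption about how an entry point could read these arguments, not the repository's actual finetune.py. The boolean flag is parsed explicitly because `bool("False")` evaluates to `True` in Python.

```python
import argparse


def to_cli_args(hyperparameters: dict) -> list[str]:
    """Approximate how a hyperparameter dict becomes command-line arguments."""
    args: list[str] = []
    for name, value in hyperparameters.items():
        args += [f"--{name}", str(value)]
    return args


def str_to_bool(value: str) -> bool:
    """Parse 'True'/'False' strings; plain bool() would treat 'False' as truthy."""
    return value.strip().lower() in ("true", "1", "yes")


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Hypothetical entry-point parser mirroring the documented parameters."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--finetuning_type", type=str, default="sft")
    parser.add_argument("--num_train_epochs", type=int, default=3)
    parser.add_argument("--per_device_train_batch_size", type=int, default=2)
    parser.add_argument("--learning_rate", type=float, default=3e-4)
    parser.add_argument("--dataset_huggingface_workspace", type=str, required=True)
    parser.add_argument("--is_dummy", type=str_to_bool, default=False)
    return parser.parse_args(argv)


hyperparameters = {
    "finetuning_type": "sft",
    "num_train_epochs": 3,
    "per_device_train_batch_size": 2,
    "learning_rate": 3e-4,
    "dataset_huggingface_workspace": "my-hf-workspace",
    "is_dummy": False,
}

# Round trip: dict -> argument list -> typed namespace.
args = parse_args(to_cli_args(hyperparameters))
print(args)
```

Declaring a type for every argument in the parser restores the original Python types on the container side, so the training code does not have to convert strings itself.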
External Dependencies
| Package | Purpose |
|---|---|
| sagemaker | AWS SageMaker Python SDK for job submission |
| huggingface_hub | Model/dataset access tokens |
| loguru | Structured logging |
Usage Example
from llm_engineering.model.finetuning.sagemaker import run_finetuning_on_sagemaker

# Launch an SFT fine-tuning job on SageMaker
run_finetuning_on_sagemaker(
    finetuning_type="sft",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    learning_rate=3e-4,
    dataset_huggingface_workspace="my-hf-workspace",
    is_dummy=False,
)