Principle:Run llama Llama index Finetuning Job Launch
Overview
Finetuning Job Launch is the step in the LLM finetuning workflow where validated training data is uploaded to the OpenAI platform and a finetuning job is created, connecting local data preparation to cloud-based model training. LlamaIndex's OpenAIFinetuneEngine wraps the entire launch sequence (validation, file upload, and job creation) in a single finetune() call, so callers do not have to drive the multi-step OpenAI API interaction themselves.
The finetuning engine follows the engine pattern common in LlamaIndex, where a stateful object manages the lifecycle of an asynchronous operation. Once configured with a base model and data path, the engine handles all API communication, retry logic, and state tracking internally.
OpenAI Finetuning API Flow
Launching a finetuning job involves three sequential API interactions:
| Step | API Call | Purpose |
|---|---|---|
| 1 | client.files.create(file, purpose="fine-tune") | Upload the JSONL training file to OpenAI's servers |
| 2 | Wait for file processing | OpenAI processes and validates the uploaded file |
| 3 | client.fine_tuning.jobs.create(training_file, model) | Create the finetuning job using the uploaded file and base model |
The file upload returns a file ID that is then referenced when creating the finetuning job. There can be a delay between upload completion and the file being ready for use, which the engine handles with automatic retry logic.
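The upload-then-create sequence can be sketched directly against the OpenAI Python client. This is a minimal illustration, not the engine's actual implementation: `launch_job` is a hypothetical helper, and `client` is assumed to be an `openai.OpenAI` instance or any object exposing the same `files.create` and `fine_tuning.jobs.create` methods. Retrying while the file is still being processed is deliberately left out here.

```python
from typing import Any


def launch_job(client: Any, data_path: str, base_model: str) -> Any:
    """Upload a JSONL training file and create a finetuning job from it.

    Hypothetical sketch: `client` is assumed to behave like
    `openai.OpenAI()`. Not part of LlamaIndex or the OpenAI SDK.
    """
    # Step 1: upload the training file; the response carries a file ID.
    with open(data_path, "rb") as f:
        uploaded = client.files.create(file=f, purpose="fine-tune")

    # Steps 2-3: reference the file ID when creating the job. If the file
    # is still being processed, this call may fail and need a retry.
    return client.fine_tuning.jobs.create(
        training_file=uploaded.id, model=base_model
    )
```

Taking the client as a parameter keeps the helper easy to exercise with a stub object, without network access or an API key.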
Base Model Selection
The base_model parameter determines which pretrained model serves as the starting point for finetuning:
- gpt-3.5-turbo: The most common choice for finetuning; cost-effective and fast to train
- gpt-4o-mini: A smaller, lower-cost variant of GPT-4o that is also available for finetuning
- Custom model IDs: Previously finetuned models can be further refined through iterative finetuning
The choice of base model affects training cost, inference cost, and the quality ceiling of the finetuned model. A common pattern is to use GPT-4 as the teacher to generate training data and GPT-3.5-turbo as the student base model for finetuning. On the task covered by the training data, the student can then approach the teacher's quality while keeping inference cost much closer to that of GPT-3.5.
Engine Construction Patterns
LlamaIndex provides two ways to construct the finetuning engine:
Direct Construction
When you already have a JSONL training file on disk:
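from llama_index.finetuning import OpenAIFinetuneEngine

engine = OpenAIFinetuneEngine(
    base_model="gpt-3.5-turbo",
    data_path="training_data.jsonl",  # existing JSONL training file
    verbose=True,
    validate_json=True,  # validate the file before uploading
)
# Validates, uploads the file, and creates the finetuning job
engine.finetune()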
From Finetuning Handler
When collecting data from live pipeline interactions, the handler can directly feed into the engine:
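from llama_index.finetuning import OpenAIFinetuneEngine

# finetuning_handler: a finetuning handler that has already
# recorded events from the pipeline
engine = OpenAIFinetuneEngine.from_finetuning_handler(
    finetuning_handler=finetuning_handler,
    base_model="gpt-3.5-turbo",
    data_path="training_data.jsonl",  # where the collected events are saved
)
engine.finetune()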
This classmethod first calls save_finetuning_events(data_path) on the handler, then constructs the engine with the saved file path.
Retry Logic
The job creation step includes built-in retry logic for handling the common case where the uploaded file is not yet processed:
while True:
    try:
        job_output = client.fine_tuning.jobs.create(
            training_file=output.id, model=self.base_model
        )
        break
    except openai.BadRequestError:
        print("Waiting for file to be ready...")
        time.sleep(60)
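This retry loop sleeps for 60 seconds between attempts, waiting for OpenAI to finish processing the uploaded file. As written, it retries on every openai.BadRequestError and has no attempt limit, so a request that is invalid for some other reason keeps being retried until the process is stopped.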
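A bounded variant of the same idea could look like the sketch below. It is not LlamaIndex code: `create_with_retry` and its parameters are hypothetical names, and the exception type and sleep function are passed in so the helper can be tried without the `openai` package installed.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def create_with_retry(
    create_job: Callable[[], T],
    retry_on: Type[Exception],
    max_attempts: int = 10,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `create_job` until it succeeds or `max_attempts` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return create_job()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            print(f"Attempt {attempt} failed; waiting for file to be ready...")
            sleep(delay_seconds)
    # Only reached when the loop body never ran.
    raise ValueError("max_attempts must be at least 1")
```

With the real client, `create_job` might be a lambda wrapping `client.fine_tuning.jobs.create(training_file=file_id, model=base_model)` with `retry_on=openai.BadRequestError`, where `file_id` and `base_model` are placeholders for values obtained earlier.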
Resuming Existing Jobs
The engine supports reconnecting to a previously launched job via the start_job_id parameter:
engine = OpenAIFinetuneEngine(
    base_model="gpt-3.5-turbo",
    data_path="training_data.jsonl",
    start_job_id="ftjob-abc123",
)
# Can now call get_current_job() and get_finetuned_model() without calling finetune()
This is useful for long-running jobs where the original process may have been terminated, or for monitoring jobs from a different environment.
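When monitoring a job this way, a small polling helper can wait for it to finish. The sketch below is an illustration under assumptions: `wait_for_job` is a hypothetical helper, `get_job` is any zero-argument callable returning an object with a `status` attribute (for example `engine.get_current_job`), and the terminal status strings are those reported by the OpenAI fine-tuning API.

```python
import time
from typing import Any, Callable

# Job statuses after which no further progress is expected.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(
    get_job: Callable[[], Any],
    poll_seconds: float = 60.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll `get_job` until the job's `status` is terminal, then return it."""
    for _ in range(max_polls):
        job = get_job()
        if job.status in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"job still not finished after {max_polls} polls")
```

A call such as `wait_for_job(engine.get_current_job)` would return the final job object, after which `get_finetuned_model()` can be used if the status is `succeeded`.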
Key Considerations
- API key requirement: The engine reads OPENAI_API_KEY from the environment; ensure it is set before constructing the engine
- Validation toggle: Data validation runs by default before upload but can be disabled with validate_json=False for pre-validated datasets
- Job tracking: After finetune() completes, the job object is stored internally and accessible via get_current_job()
- Email notification: OpenAI sends an email when the finetuning job completes, so polling is optional
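The first two considerations can be checked before launching with a small pre-flight script. This is a simplified sketch, not the validation that LlamaIndex performs: `preflight_check` is a hypothetical helper that only confirms the API key is set and that every non-blank line is a JSON object with a non-empty `messages` list, the shape used for chat-model training data.

```python
import json
import os
from typing import List


def preflight_check(data_path: str) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not os.environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")

    with open(data_path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {line_no}: not valid JSON")
                continue
            messages = record.get("messages") if isinstance(record, dict) else None
            if not isinstance(messages, list) or not messages:
                problems.append(f"line {line_no}: missing non-empty 'messages' list")
    return problems
```

Returning a list of problems rather than raising on the first one lets all issues in a dataset be reported in a single pass.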