Principle:Run llama Llama index Finetuning Job Launch

From Leeroopedia

Overview

Finetuning Job Launch is the step in the LLM finetuning workflow where validated training data is uploaded to the OpenAI platform and a finetuning job is created. This process bridges the gap between local data preparation and cloud-based model training. LlamaIndex's OpenAIFinetuneEngine encapsulates the entire launch sequence -- validation, file upload, and job creation -- into a single finetune() call, abstracting away the multi-step OpenAI API interaction.

The finetuning engine follows the engine pattern common in LlamaIndex, where a stateful object manages the lifecycle of an asynchronous operation. Once configured with a base model and data path, the engine handles all API communication, retry logic, and state tracking internally.

OpenAI Finetuning API Flow

Launching a finetuning job involves three sequential steps, two of which are API calls:

Step | API Call                                              | Purpose
1    | client.files.create(file, purpose="fine-tune")        | Upload the JSONL training file to OpenAI's servers
2    | Wait for file processing                              | OpenAI processes and validates the uploaded file
3    | client.fine_tuning.jobs.create(training_file, model)  | Create the finetuning job using the uploaded file and base model

The file upload returns a file ID that is then referenced when creating the finetuning job. There can be a delay between upload completion and the file being ready for use, which the engine handles with automatic retry logic.
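
The upload and job-creation steps above can be sketched as a small helper. This is a minimal illustration, not the engine's own code: launch_finetune_job is a hypothetical name, and client is assumed to expose the same files.create and fine_tuning.jobs.create methods as an openai.OpenAI() instance. The wait for file processing (step 2) is deliberately left out here.

```python
def launch_finetune_job(client, training_file, base_model):
    """Upload a JSONL training file, then create a finetuning job from it.

    `client` is assumed to behave like an `openai.OpenAI()` instance.
    `training_file` is an open binary file (or raw bytes) holding the JSONL data.
    Returns a (file_id, job_id) tuple.
    """
    # Step 1: upload the training data; the response carries the file ID.
    uploaded = client.files.create(file=training_file, purpose="fine-tune")

    # Step 2 (waiting for OpenAI to process the file) is omitted in this sketch.

    # Step 3: create the job, referencing the uploaded file by its ID.
    job = client.fine_tuning.jobs.create(
        training_file=uploaded.id, model=base_model
    )
    return uploaded.id, job.id
```

With the real client, this would be called with a file opened in binary mode, for example launch_finetune_job(OpenAI(), open("training_data.jsonl", "rb"), "gpt-3.5-turbo"). Passing the client in as an argument also makes the flow easy to exercise against a fake client in tests.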

Base Model Selection

The base_model parameter determines which pretrained model serves as the starting point for finetuning:

  • gpt-3.5-turbo: The most common choice for finetuning; cost-effective and fast to train
  • gpt-4o-mini: A smaller, lower-cost GPT-4o variant available for finetuning
  • Custom model IDs: Previously finetuned models can be further refined through iterative finetuning

The choice of base model affects training cost, inference cost, and the quality ceiling of the finetuned model. A common pattern is to use GPT-4 as the teacher to generate training data and GPT-3.5-turbo as the student base model for finetuning, with the aim of approaching GPT-4 quality on the target task at a cost closer to that of GPT-3.5.

Engine Construction Patterns

LlamaIndex provides two ways to construct the finetuning engine:

Direct Construction

When you already have a JSONL training file on disk:

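from llama_index.finetuning import OpenAIFinetuneEngine

engine = OpenAIFinetuneEngine(
    base_model="gpt-3.5-turbo",       # pretrained model to start from
    data_path="training_data.jsonl",  # JSONL training file on disk
    verbose=True,                     # print progress messages
    validate_json=True,               # validate the data before upload
)
# Validate, upload the file, and create the finetuning job
engine.finetune()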

From Finetuning Handler

When collecting data from live pipeline interactions, the handler can directly feed into the engine:

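from llama_index.finetuning import OpenAIFinetuneEngine

# finetuning_handler is assumed to be a handler instance that has
# already recorded events from earlier pipeline runs.
engine = OpenAIFinetuneEngine.from_finetuning_handler(
    finetuning_handler=finetuning_handler,
    base_model="gpt-3.5-turbo",
    data_path="training_data.jsonl",  # where the collected events are saved
)
engine.finetune()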

This classmethod first calls save_finetuning_events(data_path) on the handler, then constructs the engine with the saved file path.
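
The save-then-construct order described above can be illustrated with stand-in classes. SketchEngine and RecordingHandler below are hypothetical names invented for this sketch; they mimic only the sequence of calls, not the real LlamaIndex implementation.

```python
class SketchEngine:
    """Hypothetical stand-in for the engine; not the LlamaIndex class."""

    def __init__(self, base_model, data_path):
        self.base_model = base_model
        self.data_path = data_path

    @classmethod
    def from_finetuning_handler(cls, finetuning_handler, base_model, data_path):
        # First persist the events the handler collected to a JSONL file...
        finetuning_handler.save_finetuning_events(data_path)
        # ...then build the engine pointing at that saved file.
        return cls(base_model=base_model, data_path=data_path)


class RecordingHandler:
    """Hypothetical handler that only records where it was asked to save."""

    def __init__(self):
        self.saved_to = None

    def save_finetuning_events(self, path):
        self.saved_to = path
```

Because the file is written before the engine exists, the engine constructed this way behaves the same as one built directly from a JSONL file on disk.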

Retry Logic

The job creation step includes built-in retry logic for handling the common case where the uploaded file is not yet processed:

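import time

import openai

# Excerpt: `client` is the OpenAI client and `output` is the file object
# returned by the upload step.
while True:
    try:
        job_output = client.fine_tuning.jobs.create(
            training_file=output.id, model=self.base_model
        )
        break  # job created, stop retrying
    except openai.BadRequestError:
        # Treated as "file not processed yet": wait, then try again
        print("Waiting for file to be ready...")
        time.sleep(60)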

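This retry loop sleeps for 60 seconds between attempts, waiting for OpenAI to finish processing the uploaded file. As written, it has no attempt limit and treats every openai.BadRequestError as a "file not ready" signal, so a request that is invalid for another reason would keep retrying indefinitely.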
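
For callers who drive the OpenAI API themselves, a bounded variant of the loop above is one possible alternative. This is a sketch under stated assumptions rather than the engine's behavior: create_job_with_retry, max_attempts, and the injectable sleep parameter are hypothetical names introduced here, and retry_on stands for whichever exception type signals "file not ready" (openai.BadRequestError in the loop above).

```python
import time


def create_job_with_retry(create_job, retry_on, max_attempts=10,
                          wait_seconds=60, sleep=time.sleep):
    """Call `create_job()` until it succeeds or the attempts run out.

    `create_job` is a zero-argument callable that creates the job.
    `retry_on` is the exception type that should trigger another attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return create_job()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and surface the last error to the caller
            sleep(wait_seconds)
```

A call might look like create_job_with_retry(lambda: client.fine_tuning.jobs.create(training_file=file_id, model="gpt-3.5-turbo"), openai.BadRequestError). Capping the attempts means a genuinely invalid request eventually raises instead of looping forever.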

Resuming Existing Jobs

The engine supports reconnecting to a previously launched job via the start_job_id parameter:

engine = OpenAIFinetuneEngine(
    base_model="gpt-3.5-turbo",
    data_path="training_data.jsonl",
    start_job_id="ftjob-abc123",
)
# Can now call get_current_job() and get_finetuned_model() without calling finetune()

This is useful for long-running jobs where the original process may have been terminated, or for monitoring jobs from a different environment.
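
Once reconnected, a job can be monitored with a simple polling loop. The helper below is a hedged sketch: wait_for_job, poll_seconds, and max_polls are hypothetical names, and it assumes that the callable passed in (for example engine.get_current_job) returns a job object whose status attribute follows the OpenAI fine-tuning job statuses.

```python
import time

# Statuses after which a job no longer changes (OpenAI fine-tuning API).
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(get_job, poll_seconds=60, max_polls=120, sleep=time.sleep):
    """Poll `get_job()` until the job reaches a terminal status.

    `get_job` is any zero-argument callable returning an object with a
    `.status` attribute. Raises TimeoutError if the polling budget runs out.
    """
    for _ in range(max_polls):
        job = get_job()
        if job.status in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
    raise TimeoutError("job did not finish within the polling budget")
```

With a resumed engine, this could be used as wait_for_job(engine.get_current_job), after which get_finetuned_model() can be called if the returned status is "succeeded".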

Key Considerations

  • API key requirement: The engine reads the API key from the OPENAI_API_KEY environment variable; ensure it is set before constructing the engine
  • Validation toggle: Data validation runs by default before upload but can be disabled with validate_json=False for pre-validated datasets
  • Job tracking: After finetune() returns (that is, once the job has been created, not once training has finished), the job object is stored internally and accessible via get_current_job()
  • Email notification: OpenAI sends an email when the finetuning job completes, so polling is optional
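
The API key requirement above lends itself to a pre-flight check that fails early with a clear message. This is a minimal sketch; require_openai_api_key is a hypothetical helper name, not part of LlamaIndex or the OpenAI SDK.

```python
import os


def require_openai_api_key(env=None):
    """Return the API key, or raise a clear error if it is not set.

    `env` defaults to the process environment; a plain dict can be passed
    instead, which is convenient for testing.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before constructing the engine"
        )
    return key
```

Calling require_openai_api_key() just before constructing the engine surfaces a missing key immediately rather than at the first API request.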
