
Workflow:BerriAI Litellm Fine Tuning Job

From Leeroopedia
Knowledge Sources
Domains LLM_Ops, Fine_Tuning, Model_Training
Last Updated 2026-02-15 16:00 GMT

Overview

End-to-end process for creating and managing LLM fine-tuning jobs across providers through LiteLLM's unified fine-tuning API.

Description

This workflow covers the use of LiteLLM's unified fine-tuning API to create, monitor, and manage fine-tuning jobs across multiple LLM providers (OpenAI, Azure OpenAI, Vertex AI). The API follows OpenAI's fine-tuning interface while routing to the appropriate provider handler. It supports uploading training data, configuring hyperparameters, tracking job progress, and using the resulting fine-tuned model for inference.

Key outputs:

  • A fine-tuned model registered with the provider, ready for inference
  • Job status tracking with event logs
  • Support for custom hyperparameters (learning rate, epochs, batch size)
  • Cross-provider compatibility for fine-tuning operations

Usage

Execute this workflow when you have domain-specific training data and need to fine-tune a base LLM to improve performance on your particular task. This is appropriate when prompt engineering alone is insufficient and you need the model to learn from examples in your dataset.

Execution Steps

Step 1: Training Data Preparation

Prepare training data in JSONL format following the provider's expected schema. For chat models, each line contains a messages array with role/content pairs representing a complete conversation example. Optionally prepare a separate validation dataset for evaluating training progress.

Key considerations:

  • OpenAI expects JSONL with messages arrays for chat fine-tuning
  • Each example should represent a complete, high-quality interaction
  • Validation data helps monitor overfitting during training
  • Data quality directly impacts fine-tuning effectiveness
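The JSONL layout described above can be produced with a few lines of standard-library Python. This is a minimal sketch; the file name and example conversations are placeholders, not part of any LiteLLM API.

```python
import json

# Two illustrative chat examples (contents are placeholders).
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a support assistant."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Go to Settings > Security and choose Reset Password."},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "You are a support assistant."},
            {"role": "user", "content": "Where can I download my invoice?"},
            {"role": "assistant", "content": "Invoices are listed under Billing > History."},
        ]
    },
]

# Write one JSON object per line -- the JSONL shape OpenAI expects for chat fine-tuning.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

A validation set, if used, is written the same way to a separate file.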

Step 2: Training File Upload

Upload the training data file using litellm.create_file() with purpose="fine-tune". This routes to the appropriate provider's file upload API and returns a file object with an ID. The file ID is used in the subsequent fine-tuning job creation.

Key considerations:

  • The file upload API supports OpenAI, Azure, Bedrock, and Vertex AI providers
  • File IDs are provider-specific and must be used with the same provider
  • Large files may take time to process after upload
  • Validation files are uploaded separately with the same purpose
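The upload step can be sketched as below. The helper name is hypothetical; the call itself uses litellm.create_file() as described above, and requires the provider's API key in the environment (e.g. OPENAI_API_KEY), so the usage line is shown commented out.

```python
def upload_fine_tune_file(path: str, provider: str = "openai"):
    """Upload a JSONL training (or validation) file; returns a file object with an .id."""
    import litellm  # imported lazily so the module loads even without credentials configured

    with open(path, "rb") as f:
        return litellm.create_file(
            file=f,
            purpose="fine-tune",           # marks the file for fine-tuning use
            custom_llm_provider=provider,  # routes to the matching provider handler
        )

# Usage (requires provider credentials, so not run here):
# training_file = upload_fine_tune_file("train.jsonl")
# print(training_file.id)
```

The returned file ID is what gets passed to the job-creation step, and it is only valid with the same provider it was uploaded to.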

Step 3: Fine-Tuning Job Creation

Create a fine-tuning job using litellm.create_fine_tuning_job() with the base model, training file ID, and optional hyperparameters. The function routes to the correct provider based on the custom_llm_provider parameter and returns a job object with a unique job ID and initial status.

Key considerations:

  • model specifies the base model to fine-tune (e.g., gpt-4o-mini-2024-07-18)
  • training_file is the file ID from the upload step
  • hyperparameters can include n_epochs, learning_rate_multiplier, batch_size
  • The suffix parameter adds a custom identifier to the fine-tuned model name
  • Job creation is asynchronous; training happens in the background
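Putting the parameters above together, job creation looks roughly like this. The helper name, suffix value, and hyperparameter numbers are illustrative assumptions; the litellm.create_fine_tuning_job() call and its parameter names follow the description above.

```python
def start_fine_tuning_job(training_file_id: str, provider: str = "openai"):
    """Create a fine-tuning job; returns a job object with an .id and .status."""
    import litellm  # lazy import: only needed when a job is actually created

    return litellm.create_fine_tuning_job(
        model="gpt-4o-mini-2024-07-18",  # base model to fine-tune
        training_file=training_file_id,  # file ID from the upload step
        custom_llm_provider=provider,    # selects the provider handler
        suffix="my-experiment",          # custom tag in the fine-tuned model name (example value)
        hyperparameters={
            "n_epochs": 3,
            "learning_rate_multiplier": 1.8,
            "batch_size": 4,
        },
    )

# Requires credentials and a previously uploaded file, so not run here:
# job = start_fine_tuning_job(training_file.id)
# print(job.id, job.status)
```

Because training runs in the background, the returned job object only reflects the initial status; progress is tracked in the next step.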

Step 4: Job Progress Monitoring

Monitor the fine-tuning job progress by polling job events with litellm.list_fine_tuning_events() and checking job status. Events include training loss metrics, validation results, and completion notifications. The job transitions through states: validating_files, queued, running, and finally succeeded or failed.

Key considerations:

  • Events provide training loss and validation metrics at each step
  • Job status can be checked via retrieve_fine_tuning_job()
  • Training duration depends on dataset size, model size, and number of epochs
  • Failed jobs include error messages describing the failure reason
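A simple polling loop for the status check might look like the following sketch. It uses retrieve_fine_tuning_job() as mentioned above; the helper name, polling interval, and terminal-state set are assumptions, and event streaming is omitted for brevity.

```python
import time

def wait_for_job(job_id: str, provider: str = "openai", poll_seconds: int = 60):
    """Poll the job's status until it reaches a terminal state; return the final job object."""
    import litellm  # lazy import so the module loads without credentials configured

    terminal = {"succeeded", "failed", "cancelled"}  # assumed terminal states
    while True:
        job = litellm.retrieve_fine_tuning_job(
            fine_tuning_job_id=job_id,
            custom_llm_provider=provider,
        )
        if job.status in terminal:
            return job
        time.sleep(poll_seconds)  # avoid hammering the provider's API

# final = wait_for_job(job.id)  # blocks until training finishes
```

For long-running jobs, a webhook or a scheduled check is often preferable to a blocking loop like this one.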

Step 5: Fine-Tuned Model Usage

Once the job succeeds, the provider returns a fine-tuned model identifier that can be used in regular completion() calls. The fine-tuned model retains the base model's capabilities while incorporating learned patterns from the training data.

Key considerations:

  • The fine-tuned model ID is available in the completed job object
  • Use the model ID with the same provider prefix for inference
  • Fine-tuned models have the same context window and capabilities as the base model
  • Pricing for fine-tuned models typically differs from the base model
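Inference with the resulting model is an ordinary completion() call, as sketched below. The helper name and the model-ID string are placeholders; the real identifier comes from the completed job object.

```python
def query_fine_tuned_model(model_id: str, prompt: str) -> str:
    """Send a single user message to the fine-tuned model and return the reply text."""
    import litellm  # lazy import so the module loads without credentials configured

    response = litellm.completion(
        model=model_id,  # fine-tuned model ID from the completed job, e.g. an "ft:..." name for OpenAI
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Requires credentials and a completed job, so not run here:
# answer = query_fine_tuned_model(final.fine_tuned_model, "How do I reset my password?")
```

Since the call is routed like any other completion, existing LiteLLM features (retries, fallbacks, logging) apply to the fine-tuned model unchanged.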

Execution Diagram

GitHub URL

Workflow Repository