Workflow: BerriAI LiteLLM Fine-Tuning Job
| Knowledge Sources | |
|---|---|
| Domains | LLM_Ops, Fine_Tuning, Model_Training |
| Last Updated | 2026-02-15 16:00 GMT |
Overview
End-to-end process for creating and managing LLM fine-tuning jobs across providers through LiteLLM's unified fine-tuning API.
Description
This workflow covers the use of LiteLLM's unified fine-tuning API to create, monitor, and manage fine-tuning jobs across multiple LLM providers (OpenAI, Azure OpenAI, Vertex AI). The API follows OpenAI's fine-tuning interface while routing to the appropriate provider handler. It supports uploading training data, configuring hyperparameters, tracking job progress, and using the resulting fine-tuned model for inference.
Key outputs:
- A fine-tuned model registered with the provider, ready for inference
- Job status tracking with event logs
- Support for custom hyperparameters (learning rate, epochs, batch size)
- Cross-provider compatibility for fine-tuning operations
Usage
Execute this workflow when you have domain-specific training data and need to fine-tune a base LLM to improve performance on your particular task. This is appropriate when prompt engineering alone is insufficient and you need the model to learn from examples in your dataset.
Execution Steps
Step 1: Training Data Preparation
Prepare training data in JSONL format following the provider's expected schema. For chat models, each line contains a messages array with role/content pairs representing a complete conversation example. Optionally prepare a separate validation dataset for evaluating training progress.
Key considerations:
- OpenAI expects JSONL with `messages` arrays for chat fine-tuning
- Each example should represent a complete, high-quality interaction
- Validation data helps monitor overfitting during training
- Data quality directly impacts fine-tuning effectiveness
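As a sketch of the expected JSONL format, each line is one self-contained chat example; the examples and filename below are illustrative, not from a real dataset:

```python
import json

# Illustrative training examples: each record is one complete chat interaction
# with system, user, and assistant turns.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a support assistant for Acme Co."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Go to Settings > Security and click 'Reset password'."},
    ]},
    {"messages": [
        {"role": "system", "content": "You are a support assistant for Acme Co."},
        {"role": "user", "content": "Where can I download my invoice?"},
        {"role": "assistant", "content": "Invoices are under Billing > History in your account."},
    ]},
]

def write_jsonl(records, path):
    """Serialize one JSON object per line, as fine-tuning APIs expect."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")

write_jsonl(examples, "training_data.jsonl")
```

A validation file, if used, follows the same one-object-per-line layout.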
Step 2: Training File Upload
Upload the training data file using litellm.create_file() with purpose="fine-tune". This routes to the appropriate provider's file upload API and returns a file object with an ID. The file ID is used in the subsequent fine-tuning job creation.
Key considerations:
- The file upload API supports OpenAI, Azure, Bedrock, and Vertex AI providers
- File IDs are provider-specific and must be used with the same provider
- Large files may take time to process after upload
- Validation files are uploaded separately with the same purpose
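A minimal upload sketch, assuming `litellm` is installed and the relevant provider credentials (e.g. `OPENAI_API_KEY`) are set in the environment; the function name and default provider are this sketch's choices, not part of the LiteLLM API:

```python
def upload_training_file(path: str, provider: str = "openai"):
    """Upload a JSONL training file and return the provider's file object.

    Sketch only: requires `pip install litellm` and provider credentials.
    """
    import litellm  # imported inside the function so the sketch stays importable

    with open(path, "rb") as f:
        # purpose="fine-tune" routes the file to the provider's fine-tuning store
        file_obj = litellm.create_file(
            file=f,
            purpose="fine-tune",
            custom_llm_provider=provider,
        )
    return file_obj  # file_obj.id is the training_file ID for the next step
```

The returned file ID is provider-specific; a validation file is uploaded the same way with the same purpose.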
Step 3: Fine-Tuning Job Creation
Create a fine-tuning job using litellm.create_fine_tuning_job() with the base model, training file ID, and optional hyperparameters. The function routes to the correct provider based on the custom_llm_provider parameter and returns a job object with a unique job ID and initial status.
Key considerations:
- `model` specifies the base model to fine-tune (e.g., `gpt-4o-mini-2024-07-18`)
- `training_file` is the file ID from the upload step
- `hyperparameters` can include `n_epochs`, `learning_rate_multiplier`, `batch_size`
- The `suffix` parameter adds a custom identifier to the fine-tuned model name
- Job creation is asynchronous; training happens in the background
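The parameters above can be combined into a single call; a hedged sketch assuming the OpenAI provider, with illustrative hyperparameter values and suffix:

```python
def start_fine_tuning_job(training_file_id: str):
    """Create a fine-tuning job and return the initial job object.

    Sketch only: assumes `litellm` is installed and OpenAI credentials are set.
    The hyperparameter values below are examples, not recommended defaults.
    """
    import litellm  # imported inside the function so the sketch stays importable

    job = litellm.create_fine_tuning_job(
        model="gpt-4o-mini-2024-07-18",   # base model to fine-tune
        training_file=training_file_id,   # file ID from the upload step
        custom_llm_provider="openai",     # selects the provider handler
        hyperparameters={
            "n_epochs": 3,
            "learning_rate_multiplier": 1.8,
            "batch_size": 4,
        },
        suffix="support-bot",             # custom tag in the fine-tuned model name
    )
    return job  # job.id and job.status are available immediately
```

Because creation is asynchronous, the returned job typically starts in a validating or queued state rather than running.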
Step 4: Job Progress Monitoring
Monitor the fine-tuning job progress by polling job events with litellm.list_fine_tuning_events() and checking job status. Events include training loss metrics, validation results, and completion notifications. The job transitions through states: validating_files, queued, running, succeeded or failed.
Key considerations:
- Events provide training loss and validation metrics at each step
- Job status can be checked via
retrieve_fine_tuning_job() - Training duration depends on dataset size, model size, and number of epochs
- Failed jobs include error messages describing the failure reason
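A simple polling loop over `retrieve_fine_tuning_job()` can be sketched as follows; the function name, poll interval, and terminal-state set are this sketch's assumptions (the states mirror the lifecycle described above):

```python
import time

def wait_for_job(job_id: str, poll_seconds: int = 60):
    """Poll a fine-tuning job until it reaches a terminal state.

    Sketch only: assumes `litellm` is installed, OpenAI credentials are set,
    and the OpenAI-style job lifecycle described above.
    """
    import litellm  # imported inside the function so the sketch stays importable

    terminal = {"succeeded", "failed", "cancelled"}
    while True:
        job = litellm.retrieve_fine_tuning_job(
            fine_tuning_job_id=job_id,
            custom_llm_provider="openai",
        )
        print(f"job {job_id}: status={job.status}")
        if job.status in terminal:
            # On success, job.fine_tuned_model holds the new model identifier;
            # on failure, the job object carries the error details.
            return job
        time.sleep(poll_seconds)
```

For finer-grained progress (per-step training loss), the event-listing call described above can be polled in the same loop.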
Step 5: Fine-Tuned Model Usage
Once the job succeeds, the provider returns a fine-tuned model identifier that can be used in regular completion() calls. The fine-tuned model retains the base model's capabilities while incorporating learned patterns from the training data.
Key considerations:
- The fine-tuned model ID is available in the completed job object
- Use the model ID with the same provider prefix for inference
- Fine-tuned models have the same context window and capabilities as the base model
- Pricing for fine-tuned models typically differs from the base model
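Inference against the fine-tuned model is a regular completion call; a sketch assuming the OpenAI provider, where the `ft:...` model ID shown in the docstring is purely illustrative:

```python
def ask_fine_tuned_model(model_id: str, question: str) -> str:
    """Send one user message to a fine-tuned model and return its reply.

    Sketch only: assumes `litellm` is installed and OpenAI credentials are set.
    model_id comes from the completed job, e.g. a string of the form
    "ft:gpt-4o-mini-2024-07-18:my-org:support-bot:abc123" (illustrative).
    """
    import litellm  # imported inside the function so the sketch stays importable

    response = litellm.completion(
        model=model_id,  # fine-tuned model ID from the succeeded job
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content
```

The same system prompt used in the training examples should generally be reused at inference time, since the model learned its behavior in that context.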