Principle: BerriAI LiteLLM Fine-Tuning Job Creation
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| Transfer Learning Theory, OpenAI Fine-Tuning API, Multi-Provider Abstraction Patterns | Machine Learning, API Design, Model Customization | 2026-02-15 |
Overview
Fine-tuning job creation is the process of submitting a request to an LLM provider to begin training a customized model from a base model using previously uploaded training data.
Description
Once training data has been prepared and uploaded, the next step in the fine-tuning workflow is creating a job that instructs the provider to begin the training process. Fine-tuning job creation takes a base model identifier, a training file reference, optional hyperparameters, and optional metadata, then submits them to the provider's fine-tuning API endpoint. The provider enqueues the job, begins training asynchronously, and returns a job object containing a unique identifier, status, and metadata.
The key challenge in a multi-provider environment is that each provider (OpenAI, Azure OpenAI, Vertex AI) has different API endpoints, authentication mechanisms, and request formats. A unified job creation abstraction must:
- Route the request to the correct provider based on a provider identifier.
- Resolve API credentials through a fallback chain (explicit parameters, global config, environment variables).
- Transform a common input schema into provider-specific request formats.
- Normalize the provider-specific response into a unified job object.
- Support both synchronous and asynchronous execution patterns.
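The credential fallback chain above can be sketched in Python as follows (the function and variable names here are illustrative, not LiteLLM's actual internals):

```python
import os
from typing import Optional

def resolve_credential(
    explicit: Optional[str],
    global_value: Optional[str],
    env_var: str,
) -> Optional[str]:
    """Resolve one credential through the fallback chain:
    explicit parameter -> global config -> environment variable."""
    if explicit is not None:
        return explicit
    if global_value is not None:
        return global_value
    return os.environ.get(env_var)
```

The same helper would be applied per credential (`api_key`, `api_base`, and any provider-specific values), so each one can be overridden independently at call time.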
Usage
Fine-tuning job creation should be performed when:
- A training file has been uploaded and its file ID is available.
- The base model to customize has been selected.
- Hyperparameters have been determined (or defaults are acceptable).
- The application needs to initiate model customization programmatically.
- Cross-provider portability is desired, allowing the same code to target different providers.
Theoretical Basis
Job Creation Flow
The job creation process follows a well-defined pipeline:
FUNCTION create_fine_tuning_job(model, training_file, hyperparameters, provider):
1. VALIDATE inputs:
a. model must be a non-empty string
b. training_file must be a valid file ID from a prior upload
c. hyperparameters (if provided) must have valid types
2. CONSTRUCT typed hyperparameters object from raw dict
3. RESOLVE provider credentials:
a. api_base from params -> globals -> environment
b. api_key from params -> globals -> environment
c. Additional provider-specific values (api_version, project, location)
4. CONSTRUCT job creation payload:
a. SET model, training_file, hyperparameters
b. SET optional fields: suffix, validation_file, integrations, seed
c. SERIALIZE payload, excluding None values
5. DISPATCH to provider handler:
IF provider == "openai":
response = openai_handler.create(payload, credentials)
ELSE IF provider == "azure":
response = azure_handler.create(payload, credentials)
ELSE IF provider == "vertex_ai":
response = vertex_handler.create(payload, credentials)
ELSE:
RAISE unsupported provider error
6. RETURN normalized job object with id, status, model, created_at
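Steps 1, 2, and 4 of the pipeline (validation and payload construction, with None values excluded on serialization) can be sketched as below; names and the field subset are assumptions for illustration, not LiteLLM's actual API:

```python
from typing import Any, Dict, Optional

def build_job_payload(
    model: str,
    training_file: str,
    hyperparameters: Optional[Dict[str, Any]] = None,
    suffix: Optional[str] = None,
    validation_file: Optional[str] = None,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    # Step 1: validate required inputs
    if not model or not isinstance(model, str):
        raise ValueError("model must be a non-empty string")
    if not training_file:
        raise ValueError("training_file must be a valid file ID from a prior upload")
    # Steps 2 and 4a-b: assemble required and optional fields
    payload: Dict[str, Any] = {
        "model": model,
        "training_file": training_file,
        "hyperparameters": hyperparameters,
        "suffix": suffix,
        "validation_file": validation_file,
        "seed": seed,
    }
    # Step 4c: serialize the payload, excluding None values
    return {k: v for k, v in payload.items() if v is not None}
```

Excluding None values matters because providers typically reject explicit nulls for fields that should simply be omitted.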
Provider Routing Pattern
The provider routing pattern uses a conditional dispatch mechanism where the `custom_llm_provider` parameter determines which internal handler processes the request. Each handler encapsulates the provider-specific logic:
- OpenAI: Standard REST call to `/v1/fine_tuning/jobs` with Bearer token authentication.
- Azure OpenAI: REST call with an `api-version` query parameter and Azure-specific authentication (API key or AD token).
- Vertex AI: Google Cloud authenticated call with project and location scoping, using service account credentials.
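The conditional dispatch described above can equivalently be expressed as a handler table; the handlers below are hypothetical stand-ins (real ones would make the provider-specific HTTP calls):

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]

# Hypothetical per-provider handlers keyed by custom_llm_provider.
HANDLERS: Dict[str, Handler] = {
    "openai": lambda payload, creds: {"id": "ftjob-1", "provider": "openai"},
    "azure": lambda payload, creds: {"id": "ftjob-2", "provider": "azure"},
    "vertex_ai": lambda payload, creds: {"id": "ftjob-3", "provider": "vertex_ai"},
}

def dispatch_job(
    custom_llm_provider: str,
    payload: Dict[str, Any],
    credentials: Dict[str, Any],
) -> Dict[str, Any]:
    handler = HANDLERS.get(custom_llm_provider)
    if handler is None:
        raise ValueError(f"Unsupported provider: {custom_llm_provider!r}")
    return handler(payload, credentials)
```

A table keeps adding a provider to a one-line registration rather than another branch in an if/elif chain.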
Timeout Management
Fine-tuning job creation requests have a default timeout of 600 seconds (10 minutes). The timeout resolution follows a priority chain:
- Explicitly provided timeout parameter
- Request timeout from kwargs
- Default of 600 seconds
The timeout applies to the HTTP request to create the job, not to the job execution itself. The actual fine-tuning process runs asynchronously on the provider's infrastructure and may take minutes to hours.
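The priority chain might be resolved as in this sketch (the `request_timeout` kwarg name is an assumption for illustration):

```python
from typing import Any, Dict, Optional, Union

DEFAULT_TIMEOUT = 600.0  # seconds; bounds the HTTP request, not the training run

def resolve_timeout(
    timeout: Optional[Union[int, float]] = None,
    kwargs: Optional[Dict[str, Any]] = None,
) -> float:
    # 1. An explicitly provided timeout parameter wins
    if timeout is not None:
        return float(timeout)
    # 2. Otherwise fall back to a request timeout passed via kwargs
    if kwargs and kwargs.get("request_timeout") is not None:
        return float(kwargs["request_timeout"])
    # 3. Otherwise use the default of 600 seconds
    return DEFAULT_TIMEOUT
```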
Synchronous vs Asynchronous Execution
The pattern supports both calling conventions through a shared core implementation:
SYNCHRONOUS PATH:
caller -> create_fine_tuning_job() -> provider_handler -> return result
ASYNCHRONOUS PATH:
caller -> acreate_fine_tuning_job()
-> copy context
-> schedule create_fine_tuning_job() on thread executor
-> await result
-> if result is coroutine, await again
-> return result
This design avoids code duplication by reusing the synchronous implementation within an async wrapper.
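The asynchronous path above can be sketched with a stubbed synchronous core; the context-copy, thread-executor, and double-await steps mirror the flow, while the function bodies are illustrative assumptions:

```python
import asyncio
import contextvars
import functools

def create_fine_tuning_job(model: str, training_file: str) -> dict:
    """Synchronous core implementation (stubbed for illustration)."""
    return {"id": "ftjob-123", "status": "queued", "model": model}

async def acreate_fine_tuning_job(model: str, training_file: str) -> dict:
    loop = asyncio.get_running_loop()
    # Copy the current context so context-local state survives the thread hop
    ctx = contextvars.copy_context()
    func = functools.partial(ctx.run, create_fine_tuning_job, model, training_file)
    # Schedule the synchronous implementation on the default thread executor
    result = await loop.run_in_executor(None, func)
    # If the sync path handed back a coroutine (an async provider handler), await it
    if asyncio.iscoroutine(result):
        result = await result
    return result
```

Running the blocking call in an executor keeps the event loop responsive, and the final `iscoroutine` check lets the same wrapper serve handlers that are themselves async.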