Principle:BerriAI Litellm Fine Tuned Model Usage

Knowledge Sources	Domains	Last Updated
Transfer Learning Deployment, Model Serving Patterns, Unified LLM Interface Design	Machine Learning Operations, API Design, Model Deployment	2026-02-15

Overview

Fine-tuned model usage is the practice of invoking a customized model through the same unified completion interface used for base models, using provider-specific model identifiers obtained from successful fine-tuning jobs.

Description

After a fine-tuning job completes successfully, the provider assigns a unique model identifier to the resulting fine-tuned model (e.g., ft:gpt-3.5-turbo:my-org:custom-suffix:abc123 for OpenAI). This identifier can then be used in place of a standard model name when making completion requests. The fundamental design principle is that fine-tuned models are transparent replacements for base models: the caller uses the exact same completion interface, with the only difference being the model identifier string.

This transparency provides several advantages:

Zero code changes: Existing application code that calls the completion API works with fine-tuned models by simply changing the model parameter.
Unified interface: The same function handles routing, authentication, streaming, function calling, and all other features regardless of whether the model is a base model or a fine-tuned variant.
Provider abstraction: The caller specifies the provider prefix (e.g., openai/, azure/) followed by the fine-tuned model identifier, and the framework handles all provider-specific details.

The key challenge is correctly identifying the provider for a fine-tuned model identifier, since fine-tuned model names may follow provider-specific naming conventions that differ from standard model names.

Usage

Fine-tuned model usage applies when:

A fine-tuning job has completed successfully and produced a model identifier.
The fine-tuned model needs to be integrated into existing application workflows.
Multiple fine-tuned models need to be compared against each other or against base models.
A production system needs to switch between base and fine-tuned models based on configuration.
Load balancing or fallback logic needs to include fine-tuned models alongside base models.

Theoretical Basis

Model Identifier Resolution

When a completion request is made with a fine-tuned model identifier, the framework must resolve which provider to route the request to. This resolution follows a pattern:

FUNCTION resolve_model_provider(model_identifier):
    IF model_identifier contains explicit provider prefix (e.g., "openai/ft:..."):
        EXTRACT provider from prefix
        EXTRACT model_name from remainder
    ELSE IF model_identifier matches known fine-tuned naming pattern:
        OpenAI: starts with "ft:" or "ft-"
        Azure: deployment name format
        Vertex AI: tuned model resource path
        INFER provider from pattern
    ELSE:
        FALL BACK to default provider resolution logic
    RETURN (provider, model_name)

Completion Flow with Fine-Tuned Models

The completion flow is identical to base model usage, with the fine-tuned model identifier simply substituted:

FUNCTION completion_with_fine_tuned_model(model_id, messages, params):
    1. RESOLVE provider from model_id
    2. RESOLVE credentials for provider
    3. TRANSFORM messages to provider-specific format
    4. SEND request to provider completion endpoint with model_id
    5. RECEIVE response
    6. NORMALIZE response to unified format (ModelResponse)
    7. RETURN response

NOTE: Steps 1-7 are identical to base model completion.
      The only difference is the value of model_id.

Provider-Specific Model Identifiers

Each provider uses a different naming convention for fine-tuned models:

Provider	Fine-Tuned Model Identifier Pattern	Example
OpenAI	`ft:{base_model}:{org}:{suffix}:{id}`	`ft:gpt-3.5-turbo:my-org:custom:abc123`
Azure OpenAI	Deployment name (user-defined)	`my-fine-tuned-deployment`
Vertex AI	Tuned model resource path	`projects/{project}/locations/{location}/models/{model_id}`

Router Integration

Fine-tuned models integrate seamlessly with LiteLLM's router system for load balancing and fallback:

FUNCTION configure_router_with_fine_tuned_models():
    model_list = [
        {
            "model_name": "my-fine-tuned-model",
            "litellm_params": {
                "model": "ft:gpt-3.5-turbo:my-org:custom:abc123",
                "api_key": "...",
            }
        },
        {
            "model_name": "my-fine-tuned-model",
            "litellm_params": {
                "model": "azure/my-ft-deployment",
                "api_base": "...",
                "api_key": "...",
            }
        },
    ]
    router = Router(model_list=model_list)
    response = router.completion(
        model="my-fine-tuned-model",
        messages=[...]
    )
    RETURN response

This allows fine-tuned models from different providers to be grouped under a single logical model name with automatic failover.

Related Pages

Implementation:BerriAI_Litellm_Completion_With_Fine_Tuned_Model

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment