Implementation:Run llama Llama index OpenAIFinetuneEngine Get Finetuned Model
Overview
The get_finetuned_model method on OpenAIFinetuneEngine retrieves the current finetuning job, verifies that it finished successfully, extracts the finetuned model ID, and returns a configured OpenAI LLM instance ready for use in LlamaIndex pipelines. It is the step that turns a completed cloud training job into a usable local LLM object.
Source File
- File: llama-index-finetuning/llama_index/finetuning/openai/base.py
- Lines: 104-119
- Import: Accessed as a method on OpenAIFinetuneEngine instances
Method Signature
def get_finetuned_model(self, **model_kwargs: Any) -> LLM:
    """Gets finetuned model."""
Parameters:
| Parameter | Type | Description |
|---|---|---|
| **model_kwargs | Any | Keyword arguments passed to the OpenAI LLM constructor (e.g., temperature, max_tokens) |
Returns: LLM -- An OpenAI LLM instance configured with the finetuned model ID
Raises:
- ValueError -- If the job does not have a finetuned model ID ready (model still training or not started)
- ValueError -- If the job status is not "succeeded"
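Because both failure cases surface as ValueError, a caller that wants to wait for training to finish has to retry on that exception. The following is a minimal sketch of such a loop, not part of llama-index-finetuning: the helper name wait_for_finetuned_model and its poll_seconds and max_attempts parameters are hypothetical, and the engine argument is assumed to be any object exposing get_finetuned_model.

```python
import time
from typing import Any, Optional


def wait_for_finetuned_model(
    engine: Any,
    poll_seconds: float = 30.0,
    max_attempts: int = 20,
    **model_kwargs: Any,
) -> Any:
    """Call engine.get_finetuned_model() until it stops raising ValueError.

    Hypothetical helper for illustration; the defaults are arbitrary.
    """
    last_error: Optional[ValueError] = None
    for attempt in range(max_attempts):
        try:
            # Keyword arguments are forwarded unchanged to the engine.
            return engine.get_finetuned_model(**model_kwargs)
        except ValueError as exc:
            last_error = exc
            # Sleep between attempts, but not after the final one.
            if attempt < max_attempts - 1:
                time.sleep(poll_seconds)
    raise TimeoutError(
        f"No finetuned model after {max_attempts} attempts: {last_error}"
    )
```

The exception type alone does not distinguish a job that is still running from one that has failed, so the attempt cap is what keeps this sketch from polling a failed job indefinitely.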
Implementation Detail
def get_finetuned_model(self, **model_kwargs: Any) -> LLM:
    """Gets finetuned model."""
    current_job = self.get_current_job()
    job_id = current_job.id
    status = current_job.status
    model_id = current_job.fine_tuned_model
    if model_id is None:
        raise ValueError(
            f"Job {job_id} does not have a finetuned model id ready yet."
        )
    if status != "succeeded":
        raise ValueError(f"Job {job_id} has status {status}, cannot get model")
    return OpenAI(model=model_id, **model_kwargs)
The method performs the following steps:
- Retrieve current job: Calls self.get_current_job() to fetch the latest job state from the OpenAI API
- Extract metadata: Reads id, status, and fine_tuned_model from the job object
- Validate model availability: Checks that fine_tuned_model is not None. OpenAI leaves this field null while the job is still running.
- Validate job success: Confirms that the status is "succeeded". This is a defensive second check for the case where a model ID is present but the job is not in a succeeded state.
- Create LLM instance: Returns OpenAI(model=model_id, **model_kwargs), a standard LlamaIndex LLM wrapper that can be used anywhere an LLM is expected
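The order of the two validation checks determines which error a caller sees. The sketch below replays them against stand-in job objects so the outcomes can be inspected without calling the OpenAI API. The function name resolve_model_id, the job IDs, and the model name are made up for illustration; only the two checks and their messages come from the method above.

```python
from types import SimpleNamespace
from typing import Any


def resolve_model_id(job: Any) -> str:
    """Apply the same two checks to any object with id, status, fine_tuned_model."""
    if job.fine_tuned_model is None:
        raise ValueError(
            f"Job {job.id} does not have a finetuned model id ready yet."
        )
    if job.status != "succeeded":
        raise ValueError(f"Job {job.id} has status {job.status}, cannot get model")
    return job.fine_tuned_model


# Stand-in job objects (hypothetical values, not real OpenAI responses).
running = SimpleNamespace(id="ftjob-1", status="running", fine_tuned_model=None)
failed = SimpleNamespace(id="ftjob-2", status="failed", fine_tuned_model=None)
done = SimpleNamespace(
    id="ftjob-3", status="succeeded", fine_tuned_model="ft:example-model"
)

for job in (running, failed, done):
    try:
        print(job.id, "->", resolve_model_id(job))
    except ValueError as exc:
        print(job.id, "->", exc)
```

Under these assumptions, a failed job with no model ID reports "not ready yet" rather than its status, because the None check runs first. A caller that needs to tell the two apart can inspect the status on the object returned by get_current_job().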
Return Type
The returned object is an instance of llama_index.llms.openai.OpenAI, which implements the full LLM interface:
- complete(prompt) -- Single completion
- chat(messages) -- Chat completion
- stream_complete(prompt) -- Streaming completion
- stream_chat(messages) -- Streaming chat completion
- acomplete(prompt) -- Async completion
- achat(messages) -- Async chat completion
Within LlamaIndex, the finetuned model exposes the same interface as any other OpenAI model; only its outputs differ, reflecting the domain-specific behavior learned from the training data.
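The returned instance is built from the finetuned model ID plus whatever keyword arguments the caller supplies. A minimal sketch of that forwarding follows, using a hypothetical StandInLLM class in place of the real OpenAI wrapper so it runs without an API key; build_llm is likewise an illustrative name that copies only the call shape OpenAI(model=model_id, **model_kwargs).

```python
from typing import Any


class StandInLLM:
    """Hypothetical stand-in for the OpenAI wrapper; it only records its arguments."""

    def __init__(self, model: str, **kwargs: Any) -> None:
        self.model = model
        self.kwargs = kwargs


def build_llm(model_id: str, **model_kwargs: Any) -> StandInLLM:
    # Same call shape as the return statement in get_finetuned_model.
    return StandInLLM(model=model_id, **model_kwargs)


llm = build_llm("ft:example-model", temperature=0.3, max_tokens=256)
print(llm.model, llm.kwargs)

try:
    # Passing "model" again collides with the explicit model= argument.
    build_llm("ft:example-model", model="gpt-3.5-turbo")
except TypeError as exc:
    print("TypeError:", exc)
```

With this call shape the model ID always comes from the job. Supplying model among the keyword arguments fails with a TypeError at the call site instead of overriding it, which is standard Python behavior for a duplicated keyword.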
Usage Example
from llama_index.core.base.llms.types import ChatMessage
from llama_index.finetuning import OpenAIFinetuneEngine

# After finetuning job has completed
engine = OpenAIFinetuneEngine(
    base_model="gpt-3.5-turbo",
    data_path="training_data.jsonl",
    start_job_id="ftjob-abc123",
)

# Get the finetuned model with custom inference parameters
ft_llm = engine.get_finetuned_model(temperature=0.3)

# Use it like any OpenAI model
response = ft_llm.complete("Explain vector similarity search.")
print(response)

# Use in chat mode
messages = [
    ChatMessage(role="user", content="What is semantic search?")
]
chat_response = ft_llm.chat(messages)
print(chat_response.message.content)
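Since the finetuned model is called the same way as the base model, the two can be compared on the same prompts. The sketch below is one possible way to do that and is not part of the library: compare_completions and EchoLLM are hypothetical names, and EchoLLM stands in for real LLM objects so the example runs offline. It assumes only that each object has a complete(prompt) method whose result converts to text with str().

```python
from typing import Any, List, Tuple


def compare_completions(
    base_llm: Any, ft_llm: Any, prompts: List[str]
) -> List[Tuple[str, str, str]]:
    """Return (prompt, base_output, finetuned_output) for each prompt."""
    rows = []
    for prompt in prompts:
        base_text = str(base_llm.complete(prompt))
        ft_text = str(ft_llm.complete(prompt))
        rows.append((prompt, base_text, ft_text))
    return rows


class EchoLLM:
    """Hypothetical stand-in exposing only complete(); it tags the prompt with a label."""

    def __init__(self, label: str) -> None:
        self.label = label

    def complete(self, prompt: str) -> str:
        return f"[{self.label}] {prompt}"


rows = compare_completions(
    EchoLLM("base"), EchoLLM("ft"), ["What is semantic search?"]
)
for prompt, base_text, ft_text in rows:
    print(prompt, base_text, ft_text, sep="\n  ")
```

In real use, the stand-ins would be replaced by an OpenAI instance for the base model and the object returned by get_finetuned_model.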