
Implementation:Togethercomputer Together python FineTuning Download

| Attribute | Value |
| --- | --- |
| Implementation Name | FineTuning_Download |
| Type | API Method |
| Source | src/together/resources/finetune.py:L777-867 |
| Domain | MLOps, Fine_Tuning |
| Repository | togethercomputer/together-python |
| Last Updated | 2026-02-15 16:00 GMT |

API Signature

class FineTuning:
    def download(
        self,
        id: str,
        *,
        output: Path | str | None = None,
        checkpoint_step: int | None = None,
        checkpoint_type: DownloadCheckpointType | str = DownloadCheckpointType.DEFAULT,
    ) -> FinetuneDownloadResult:

Import

from together import Together

client = Together()
result = client.fine_tuning.download(id="ft-...")

I/O Contract

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| id | str | (required) | Fine-tune job ID (starts with "ft-"). Also supports the compound format "ft-id:step" to specify a checkpoint step inline. |
| output | Path \| str \| None | None | Local output file path. Defaults to $PWD/{model_name}.{extension}. |
| checkpoint_step | int \| None | None | Step number of the checkpoint to download. None downloads the final model. |
| checkpoint_type | DownloadCheckpointType \| str | DownloadCheckpointType.DEFAULT | Type of checkpoint to download. One of "default", "merged", "adapter". |

Note on the compound ID format: if id matches the pattern ft-<hex-id>:<step>, the step is extracted automatically and id is truncated to the part before the colon. In that case checkpoint_step must not also be set; providing both raises a ValueError.
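
The two ways of selecting a step are interchangeable; a minimal sketch (the job ID below is a placeholder):

from together import Together

client = Together()

# Equivalent ways to request the checkpoint at step 500:
client.fine_tuning.download("ft-12345678-abcd-1234-ef01-123456789012:500")
client.fine_tuning.download(
    "ft-12345678-abcd-1234-ef01-123456789012", checkpoint_step=500
)

# Mixing both forms raises a ValueError:
# client.fine_tuning.download(
#     "ft-12345678-abcd-1234-ef01-123456789012:500", checkpoint_step=500
# )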

Output

Returns a FinetuneDownloadResult object:

| Field | Type | Description |
| --- | --- | --- |
| object | str | Always "local" for a completed download. |
| id | str | The fine-tune job ID (without the step suffix). |
| checkpoint_step | int \| None | The checkpoint step that was downloaded, or None for the final model. |
| filename | str | Local file path where the model was saved. |
| size | int | Downloaded file size in bytes. |

Errors

  • ValueError -- When both the compound ID format (with :step) and checkpoint_step are provided.
  • ValueError -- When an invalid checkpoint_type string is given (must be one of "default", "merged", "adapter").
  • ValueError -- When checkpoint_type is not "default" for a full training job (only default is allowed).
  • ValueError -- When an invalid checkpoint_type is specified for a LoRA training job.
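
A minimal defensive-handling sketch for these errors; the job ID is a placeholder, and falling back to the default checkpoint type is one possible recovery:

from together import Together

client = Together()
job_id = "ft-12345678-abcd-1234-ef01-123456789012"  # placeholder job ID

try:
    # "adapter" is only valid for LoRA jobs; a full-training job raises ValueError.
    result = client.fine_tuning.download(job_id, checkpoint_type="adapter")
except ValueError:
    # "default" is accepted for both training types.
    result = client.fine_tuning.download(job_id, checkpoint_type="default")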

Code Reference

From src/together/resources/finetune.py:L777-867:

def download(
    self,
    id: str,
    *,
    output: Path | str | None = None,
    checkpoint_step: int | None = None,
    checkpoint_type: DownloadCheckpointType | str = DownloadCheckpointType.DEFAULT,
) -> FinetuneDownloadResult:
    # Parse compound ID format "ft-id:step"
    if re.match(_FT_JOB_WITH_STEP_REGEX, id) is not None:
        if checkpoint_step is None:
            checkpoint_step = int(id.split(":")[1])
            id = id.split(":")[0]
        else:
            raise ValueError(
                f"Fine-tuning job ID {id} contains a colon to specify the step "
                "to download, but `checkpoint_step` was also set."
            )

    url = f"finetune/download?ft_id={id}"

    if checkpoint_step is not None:
        url += f"&checkpoint_step={checkpoint_step}"

    ft_job = self.retrieve(id)

    # Convert string to DownloadCheckpointType enum
    if isinstance(checkpoint_type, str):
        try:
            checkpoint_type = DownloadCheckpointType(checkpoint_type.lower())
        except ValueError:
            enum_strs = ", ".join(e.value for e in DownloadCheckpointType)
            raise ValueError(
                f"Invalid checkpoint type: {checkpoint_type}. "
                f"Choose one of {{{enum_strs}}}."
            )

    # Route based on training type
    if isinstance(ft_job.training_type, FullTrainingType):
        if checkpoint_type != DownloadCheckpointType.DEFAULT:
            raise ValueError(
                "Only DEFAULT checkpoint type is allowed for FullTrainingType"
            )
        url += "&checkpoint=model_output_path"
    elif isinstance(ft_job.training_type, LoRATrainingType):
        if checkpoint_type == DownloadCheckpointType.DEFAULT:
            checkpoint_type = DownloadCheckpointType.MERGED
        if checkpoint_type in {
            DownloadCheckpointType.MERGED,
            DownloadCheckpointType.ADAPTER,
        }:
            url += f"&checkpoint={checkpoint_type.value}"
        else:
            raise ValueError(
                f"Invalid checkpoint type for LoRATrainingType: {checkpoint_type}"
            )

    remote_name = ft_job.output_name
    download_manager = DownloadManager(self._client)

    if isinstance(output, str):
        output = Path(output)

    downloaded_filename, file_size = download_manager.download(
        url, output, normalize_key(remote_name or id), fetch_metadata=True
    )

    return FinetuneDownloadResult(
        object="local",
        id=id,
        checkpoint_step=checkpoint_step,
        filename=downloaded_filename,
        size=file_size,
    )

Key implementation details:

  • The regex _FT_JOB_WITH_STEP_REGEX = r"^ft-[\dabcdef-]+:\d+$" detects the compound "ft-id:step" format (demonstrated below).
  • The method first retrieves the job details via self.retrieve(id) to determine the training type (LoRA vs Full) and the output model name.
  • For LoRA jobs, "default" is silently upgraded to "merged".
  • The DownloadCheckpointType enum has three values: DEFAULT = "default", MERGED = "merged", ADAPTER = "adapter".
  • The actual file transfer is handled by DownloadManager from together.filemanager.
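
To make the compound-ID parsing concrete, here is a self-contained sketch of the same regex and split logic; split_compound_id is an illustrative helper, not part of the library:

import re

_FT_JOB_WITH_STEP_REGEX = r"^ft-[\dabcdef-]+:\d+$"  # as defined in the source

def split_compound_id(job_id: str) -> tuple[str, int | None]:
    """Illustrative helper: split "ft-id:step" into (id, step)."""
    if re.match(_FT_JOB_WITH_STEP_REGEX, job_id) is not None:
        base, step = job_id.split(":")
        return base, int(step)
    return job_id, None

print(split_compound_id("ft-12345678-abcd-1234-ef01-123456789012:500"))
# ('ft-12345678-abcd-1234-ef01-123456789012', 500)
print(split_compound_id("ft-12345678-abcd-1234-ef01-123456789012"))
# ('ft-12345678-abcd-1234-ef01-123456789012', None)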

Usage Examples

Download Final Model (Default)

from together import Together

client = Together()

result = client.fine_tuning.download("ft-12345678-abcd-1234-efgh-123456789012")
print(f"Downloaded to: {result.filename}")
print(f"Size: {result.size} bytes")

Download Specific Checkpoint Step

from together import Together

client = Together()

# Using checkpoint_step parameter
result = client.fine_tuning.download(
    "ft-12345678-abcd-1234-efgh-123456789012",
    checkpoint_step=500,
)
print(f"Checkpoint step {result.checkpoint_step}: {result.filename}")

Download Using Compound ID Format

from together import Together

client = Together()

# Step embedded in the ID (note: the compound format only matches hex job IDs)
result = client.fine_tuning.download("ft-12345678-abcd-1234-ef01-123456789012:500")
print(f"Checkpoint step {result.checkpoint_step}: {result.filename}")

Download LoRA Adapter Only

from together import Together

client = Together()

# Download just the LoRA adapter (smaller file)
result = client.fine_tuning.download(
    "ft-12345678-abcd-1234-efgh-123456789012",
    checkpoint_type="adapter",
)
print(f"Adapter downloaded to: {result.filename}")
print(f"Adapter size: {result.size} bytes")

Download Merged LoRA Model to Custom Path

from together import Together

client = Together()

result = client.fine_tuning.download(
    "ft-12345678-abcd-1234-efgh-123456789012",
    output="/models/my-finetuned-model.tar.gz",
    checkpoint_type="merged",
)
print(f"Merged model saved to: {result.filename}")

End-to-End: Monitor then Download

import time
from together import Together

client = Together()
job_id = "ft-12345678-abcd-1234-efgh-123456789012"

# Wait for completion
while True:
    job = client.fine_tuning.retrieve(job_id)
    if job.status == "completed":
        break
    elif job.status in ("failed", "cancelled"):
        raise RuntimeError(f"Job {job_id} ended with status: {job.status}")
    time.sleep(60)

# Download the final merged model
result = client.fine_tuning.download(job_id, checkpoint_type="merged")
print(f"Model downloaded: {result.filename} ({result.size} bytes)")

# Optionally also download the most recent intermediate checkpoint
checkpoints = client.fine_tuning.list_checkpoints(job_id)
if len(checkpoints) > 1:
    latest_cp = checkpoints[0]  # most recent intermediate checkpoint
    adapter_result = client.fine_tuning.download(
        latest_cp.name,
        checkpoint_type="adapter",
        output="./checkpoints/adapter.tar.gz",
    )
    print(f"Adapter checkpoint: {adapter_result.filename}")
