Principle:Togethercomputer Together python Batch Result Retrieval

Attribute	Value
Type	Principle
Domains	Batch_Processing, Inference, API_Client
Repository	togethercomputer/together-python
Last Updated	2026-02-15 16:00 GMT

Overview

Mechanism for downloading batch inference results from Together AI after job completion.

Description

Result retrieval downloads the output file produced by a completed batch job. The output file is in JSONL format with one result per line, and each result carries the custom_id of the input request it answers. Retrieval uses the Files.retrieve_content() API with the output_file_id taken from the completed BatchJob object.

The retrieval process:

1. Obtain the output_file_id from a BatchJob object that has reached COMPLETED status.
2. Call Files.retrieve_content() with that file ID and an optional local output path.
3. The SDK downloads the file using a streaming download manager with progress tracking.
4. A FileObject is returned containing the local file path and size.

Each line in the output JSONL file corresponds to one request from the input file, identified by its custom_id, and contains the inference response.
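As a minimal sketch of reading such a file, the helper below indexes result records by custom_id using only the standard library. The function name and the shape of the "response" payload in the sample lines are assumptions for illustration; only the custom_id field is taken from the description above.

```python
import json
from typing import Dict, Iterable


def index_results_by_custom_id(lines: Iterable[str]) -> Dict[str, dict]:
    """Parse JSONL lines and map each custom_id to its full result record."""
    results: Dict[str, dict] = {}
    for line in lines:
        line = line.strip()
        if not line:  # tolerate blank lines such as a trailing newline
            continue
        record = json.loads(line)
        results[record["custom_id"]] = record
    return results


# Hypothetical sample lines; the "response" payload shape is an assumption.
sample = [
    '{"custom_id": "req-1", "response": {"text": "hello"}}',
    '{"custom_id": "req-2", "response": {"text": "world"}}',
]
indexed = index_results_by_custom_id(sample)
```

For a downloaded file, the same helper can be fed an open file handle, for example `with open("results.jsonl") as f: indexed = index_results_by_custom_id(f)`.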

Usage

Use this principle after a batch job reaches COMPLETED status. The workflow is:

1. Poll the batch job until status == "COMPLETED".
2. Read the output_file_id from the completed BatchJob object.
3. Call Files.retrieve_content(id=output_file_id, output="results.jsonl") to download.
4. Parse the downloaded JSONL file to extract individual inference results.
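The polling and download steps of this workflow could be sketched as follows. This is an illustration under stated assumptions rather than the SDK's own helper: `wait_and_download`, the `get_job` callable, the polling interval, and the set of failure statuses are all hypothetical. Only the COMPLETED status, the output_file_id attribute, and the retrieve_content(id=..., output=...) call come from the text above.

```python
import time
from typing import Any, Callable

# Assumed terminal failure statuses; the text above only names COMPLETED.
ASSUMED_FAILURE_STATUSES = {"FAILED", "EXPIRED", "CANCELLED"}


def wait_and_download(
    get_job: Callable[[], Any],
    files: Any,
    output: str = "results.jsonl",
    poll_seconds: float = 10.0,
    max_polls: int = 360,
) -> Any:
    """Poll until the job is COMPLETED, then download its output file.

    get_job: zero-argument callable returning the current BatchJob-like object.
    files:   object exposing retrieve_content(id=..., output=...).
    """
    for _ in range(max_polls):
        job = get_job()
        # Accept either a plain string or an enum-like status with .value.
        status = str(getattr(job.status, "value", job.status))
        if status == "COMPLETED":
            return files.retrieve_content(id=job.output_file_id, output=output)
        if status in ASSUMED_FAILURE_STATUSES:
            raise RuntimeError(f"batch job ended with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError("batch job did not complete within the polling budget")
```

With a real client, `get_job` would wrap whatever call fetches the current job state, and `files` would be the client's files resource; the exact method for fetching a batch job is not described in this section, so it is left to the caller.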

If the batch job also has an error_file_id, the error file can be downloaded separately with the same Files.retrieve_content() method to inspect per-request errors.
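One way to handle both files together is sketched below. The helper name and the default file names are hypothetical; the sketch assumes only that the job object exposes output_file_id and error_file_id and that the files resource accepts retrieve_content(id=..., output=...), as described above.

```python
from typing import Any, Dict


def download_batch_outputs(
    job: Any,
    files: Any,
    results_path: str = "results.jsonl",
    errors_path: str = "errors.jsonl",
) -> Dict[str, Any]:
    """Download the output file and, if present, the error file of a batch job."""
    downloaded: Dict[str, Any] = {}
    output_id = getattr(job, "output_file_id", None)
    error_id = getattr(job, "error_file_id", None)
    if output_id:
        downloaded["results"] = files.retrieve_content(id=output_id, output=results_path)
    if error_id:  # only set when some requests failed (assumption)
        downloaded["errors"] = files.retrieve_content(id=error_id, output=errors_path)
    return downloaded
```

Keeping the two downloads in separate local files lets the per-request errors be inspected without mixing them into the successful results.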

Theoretical Basis

Result retrieval completes the batch inference lifecycle:

1. Prepare -- Construct the JSONL input file.
2. Upload -- Transfer the input file to server storage.
3. Submit -- Create the batch job referencing the uploaded file.
4. Monitor -- Poll for job completion.
5. Retrieve -- Download the output file containing results.

The output file uses the same JSONL format as the input, and the custom_id field provides a one-to-one mapping between input requests and output results. Each result can therefore be correlated with its originating request by looking up its custom_id.
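The correlation can be illustrated with a small join on custom_id. This is a sketch using only the standard library; the function name and the "body" and "response" fields in the sample records are assumptions, and only the custom_id key is taken from the text.

```python
import json
from typing import Iterable, List, Tuple


def join_on_custom_id(
    input_lines: Iterable[str], output_lines: Iterable[str]
) -> Tuple[List[Tuple[dict, dict]], List[str]]:
    """Pair each input request with its result; also list custom_ids with no result."""
    outputs = {}
    for line in output_lines:
        if line.strip():
            record = json.loads(line)
            outputs[record["custom_id"]] = record

    paired: List[Tuple[dict, dict]] = []
    missing: List[str] = []
    for line in input_lines:
        if not line.strip():
            continue
        request = json.loads(line)
        custom_id = request["custom_id"]
        if custom_id in outputs:
            paired.append((request, outputs[custom_id]))
        else:
            missing.append(custom_id)  # e.g. a request that ended up in the error file
    return paired, missing


# Hypothetical records; field names other than custom_id are illustrative.
inputs = ['{"custom_id": "a", "body": {"prompt": "1+1"}}',
          '{"custom_id": "b", "body": {"prompt": "2+2"}}']
outputs = ['{"custom_id": "b", "response": {"text": "4"}}']
paired, missing = join_on_custom_id(inputs, outputs)
```

Joining by key rather than by line position means the pairing does not depend on the output lines appearing in the same order as the input lines.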

The download uses streaming transfer with a DownloadManager to handle potentially large result files efficiently, avoiding loading the entire file into memory before writing to disk.
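The general idea of a streaming transfer can be shown with a chunked copy from a binary stream to disk. This is not the SDK's DownloadManager, whose internals are not described here; it is a generic standard-library sketch, and the function name and chunk size are arbitrary choices.

```python
from typing import BinaryIO


def stream_to_disk(source: BinaryIO, dest_path: str, chunk_size: int = 1024 * 1024) -> int:
    """Copy a binary stream to dest_path one chunk at a time; return bytes written."""
    written = 0
    with open(dest_path, "wb") as dest:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # empty bytes signals end of stream
                break
            dest.write(chunk)
            written += len(chunk)  # a progress tracker could be updated here
    return written
```

Peak memory use is bounded by chunk_size rather than by the size of the result file, which is the property the paragraph above describes.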
