Implementation:Togethercomputer Together python Batch Request Format
| Attribute | Value |
|---|---|
| Type | Implementation (Pattern Doc) |
| Domains | Batch_Processing, Inference, API_Client |
| Repository | togethercomputer/together-python |
| Source | src/together/types/batch.py:L26-33 |
| Last Updated | 2026-02-15 16:00 GMT |
Overview
Pattern documentation for constructing JSONL input files for Together AI batch inference. Users construct these files manually; the SDK defines the supported endpoint values via the BatchEndpoint enum.
Code Reference
The BatchEndpoint enum defines the two supported target URLs for batch requests:
class BatchEndpoint(str, Enum):
    """
    The endpoint of a batch job
    """

    COMPLETIONS = "/v1/completions"
    CHAT_COMPLETIONS = "/v1/chat/completions"
    # More endpoints can be added here as needed
Source: src/together/types/batch.py:L26-33
Supported Endpoints
| Enum Value | Endpoint URL | Request Body Type |
|---|---|---|
| BatchEndpoint.COMPLETIONS | /v1/completions | CompletionRequest (model + prompt) |
| BatchEndpoint.CHAT_COMPLETIONS | /v1/chat/completions | ChatCompletionRequest (model + messages) |
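The mapping in the table above can be sketched in a few lines. This is a minimal illustration, not SDK code: the enum is redefined locally (mirroring the definition in Code Reference) so the snippet runs without the SDK installed, and `body_key_for` is a hypothetical helper name introduced here.

```python
from enum import Enum


class BatchEndpoint(str, Enum):
    """Local mirror of the SDK enum, redefined so this sketch is self-contained."""

    COMPLETIONS = "/v1/completions"
    CHAT_COMPLETIONS = "/v1/chat/completions"


def body_key_for(endpoint: BatchEndpoint) -> str:
    """Hypothetical helper: name the body field that carries the model input."""
    if endpoint is BatchEndpoint.CHAT_COMPLETIONS:
        return "messages"
    return "prompt"


# .value is the plain URL string that goes into the "url" field of a JSONL line.
url = BatchEndpoint.CHAT_COMPLETIONS.value

# A URL string read back from a file can be converted to the enum member by value.
endpoint = BatchEndpoint("/v1/completions")
```

Using the enum value rather than a hand-typed string keeps the `url` field limited to the two endpoints the enum lists.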
I/O Contract
Input: A JSONL file where each line is a JSON object with the following fields:
| Field | Type | Description |
|---|---|---|
| custom_id | string | Unique identifier for tracking this request in the output |
| method | string | HTTP method, always "POST" |
| url | string | One of /v1/completions or /v1/chat/completions |
| body | object | Request payload matching the target endpoint schema |
Output: The file is uploaded to Together AI and used as input to a batch job.
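Because these files are written by hand, a local check before upload can catch malformed lines early. The following is a minimal sketch that checks only the four fields listed in the contract above; `validate_batch_line`, `ALLOWED_URLS`, and `REQUIRED_FIELDS` are hypothetical names, and the service may apply further validation that is not modeled here.

```python
import json
from typing import List

# Assumption: the allowed URLs are exactly the two BatchEndpoint values.
ALLOWED_URLS = {"/v1/completions", "/v1/chat/completions"}
REQUIRED_FIELDS = {"custom_id": str, "method": str, "url": str, "body": dict}


def validate_batch_line(line: str) -> List[str]:
    """Return the problems found in one JSONL line (empty list if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["line is not a JSON object"]

    problems = []
    # Check presence and JSON type of each required field.
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"{field} must be of type {expected.__name__}")

    # Check the constrained string values only when they are strings.
    method = record.get("method")
    if isinstance(method, str) and method != "POST":
        problems.append('method must be "POST"')
    url = record.get("url")
    if isinstance(url, str) and url not in ALLOWED_URLS:
        problems.append(f"unsupported url: {url}")
    return problems
```

Running this over every line of a file and collecting the non-empty results gives a quick report of which requests need fixing.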
Usage Examples
Chat Completions Endpoint
Example JSONL line for the /v1/chat/completions endpoint:
{"custom_id": "req-001", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo", "messages": [{"role": "user", "content": "What is the capital of France?"}], "max_tokens": 256}}
Completions Endpoint
Example JSONL line for the /v1/completions endpoint:
{"custom_id": "req-002", "method": "POST", "url": "/v1/completions", "body": {"model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo", "prompt": "The capital of France is", "max_tokens": 64}}
Full Example: Creating a Batch Input File
import json

# Each dict becomes one line of the JSONL file.
requests = [
    {
        "custom_id": "req-001",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
            "messages": [{"role": "user", "content": "What is the capital of France?"}],
            "max_tokens": 256,
        },
    },
    {
        "custom_id": "req-002",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
            "messages": [{"role": "user", "content": "Explain quantum computing briefly."}],
            "max_tokens": 512,
        },
    },
]

# Write one compact JSON object per line; json.dumps never emits raw newlines.
with open("batch_input.jsonl", "w", encoding="utf-8") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
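For larger inputs, the request dicts are usually generated rather than written out by hand. A minimal sketch, assuming one single-turn user message per prompt and sequential `req-NNN` identifiers; `build_chat_requests` and `to_jsonl` are hypothetical helper names, not part of the SDK.

```python
import json
from typing import Iterable, List


def build_chat_requests(prompts: Iterable[str], model: str, max_tokens: int = 256) -> List[dict]:
    """Build one /v1/chat/completions batch request per prompt."""
    return [
        {
            # Sequential IDs keep custom_id unique within the file.
            "custom_id": f"req-{i:03d}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "max_tokens": max_tokens,
            },
        }
        for i, prompt in enumerate(prompts, start=1)
    ]


def to_jsonl(requests: List[dict]) -> str:
    """Serialize requests as JSONL: one JSON object per line."""
    # json.dumps escapes newlines inside strings, so each request stays on one line.
    return "".join(json.dumps(req) + "\n" for req in requests)
```

The returned string can be written to a file such as batch_input.jsonl in the same way as in the example above.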