Implementation:Togethercomputer Together python Batch Request Format

From Leeroopedia
Revision as of 13:55, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Togethercomputer_Together_python_Batch_Request_Format.md)
Attribute | Value
Type | Implementation (Pattern Doc)
Domains | Batch_Processing, Inference, API_Client
Repository | togethercomputer/together-python
Source | src/together/types/batch.py:L26-33
Last Updated | 2026-02-15 16:00 GMT

Overview

Pattern documentation for constructing JSONL input files for Together AI batch inference. Users construct these files manually; the SDK defines the supported endpoint values via the BatchEndpoint enum.

Code Reference

The BatchEndpoint enum defines the two supported target URLs for batch requests:

class BatchEndpoint(str, Enum):
    """
    The endpoint of a batch job
    """

    COMPLETIONS = "/v1/completions"
    CHAT_COMPLETIONS = "/v1/chat/completions"
    # More endpoints can be added here as needed

Source: src/together/types/batch.py:L26-33

Supported Endpoints

Enum Value | Endpoint URL | Request Body Type
BatchEndpoint.COMPLETIONS | /v1/completions | CompletionRequest (model + prompt)
BatchEndpoint.CHAT_COMPLETIONS | /v1/chat/completions | ChatCompletionRequest (model + messages)
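The endpoint-to-body pairing above can be sketched as a small helper that builds one JSONL line and rejects a payload that does not fit its endpoint. This is an illustrative sketch, not part of the SDK: build_request_line and REQUIRED_BODY_KEY are hypothetical names, and BatchEndpoint is redefined locally so the snippet runs without the SDK installed.

```python
import json
from enum import Enum


class BatchEndpoint(str, Enum):
    # Local mirror of the SDK enum so this sketch is self-contained.
    COMPLETIONS = "/v1/completions"
    CHAT_COMPLETIONS = "/v1/chat/completions"


# Hypothetical mapping: the key each endpoint's body needs besides "model".
REQUIRED_BODY_KEY = {
    BatchEndpoint.COMPLETIONS: "prompt",
    BatchEndpoint.CHAT_COMPLETIONS: "messages",
}


def build_request_line(custom_id, endpoint, body):
    """Return one JSONL line (without trailing newline) for a batch request."""
    # Raises ValueError if the URL is not one of the enum values.
    endpoint = BatchEndpoint(endpoint)
    for key in ("model", REQUIRED_BODY_KEY[endpoint]):
        if key not in body:
            raise ValueError(f"body for {endpoint.value} is missing {key!r}")
    request = {
        "custom_id": custom_id,
        "method": "POST",
        "url": endpoint.value,
        "body": body,
    }
    return json.dumps(request)
```

Passing either the enum member or its string value works, because BatchEndpoint(...) accepts both.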

I/O Contract

Input: A JSONL file where each line is a JSON object with the following fields:

Field | Type | Description
custom_id | string | Unique identifier for tracking this request in the output
method | string | HTTP method, always "POST"
url | string | One of /v1/completions or /v1/chat/completions
body | object | Request payload matching the target endpoint schema
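The field table above can be checked mechanically before upload. The following is a minimal sketch of such a check, assuming only the four fields listed; validate_line and ALLOWED_URLS are hypothetical names, and the checks performed server-side by Together AI may differ.

```python
import json

# The two URLs listed in the field table above.
ALLOWED_URLS = {"/v1/completions", "/v1/chat/completions"}


def validate_line(line):
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["line is not a JSON object"]

    problems = []
    if not isinstance(obj.get("custom_id"), str):
        problems.append("custom_id must be a string")
    if obj.get("method") != "POST":
        problems.append('method must be "POST"')
    url = obj.get("url")
    if not isinstance(url, str) or url not in ALLOWED_URLS:
        problems.append("url must be one of " + ", ".join(sorted(ALLOWED_URLS)))
    if not isinstance(obj.get("body"), dict):
        problems.append("body must be a JSON object")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a line at once.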

Output: A JSONL file that is uploaded to Together AI and used as the input to a batch job.

Usage Examples

Chat Completions Endpoint

Example JSONL line for the /v1/chat/completions endpoint:

{"custom_id": "req-001", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo", "messages": [{"role": "user", "content": "What is the capital of France?"}], "max_tokens": 256}}

Completions Endpoint

Example JSONL line for the /v1/completions endpoint:

{"custom_id": "req-002", "method": "POST", "url": "/v1/completions", "body": {"model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo", "prompt": "The capital of France is", "max_tokens": 64}}

Full Example: Creating a Batch Input File

import json

requests = [
    {
        "custom_id": "req-001",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
            "messages": [{"role": "user", "content": "What is the capital of France?"}],
            "max_tokens": 256,
        },
    },
    {
        "custom_id": "req-002",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
            "messages": [{"role": "user", "content": "Explain quantum computing briefly."}],
            "max_tokens": 512,
        },
    },
]

with open("batch_input.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
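Because each custom_id is meant to identify exactly one request in the output, it can be useful to read the file back and look for repeats before uploading. The helper below is a sketch of that check; find_duplicate_ids is a hypothetical name, and it assumes every non-blank line is a JSON object with a custom_id field.

```python
import json
from collections import Counter


def find_duplicate_ids(path):
    """Return the custom_id values that appear more than once in a JSONL file."""
    with open(path, encoding="utf-8") as f:
        # Skip blank lines so a trailing newline at the end of the file is harmless.
        ids = [json.loads(line)["custom_id"] for line in f if line.strip()]
    return sorted(cid for cid, count in Counter(ids).items() if count > 1)
```

For the two-request file written above, this returns an empty list, since req-001 and req-002 are distinct.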
