Implementation:Togethercomputer Together python Endpoints CLI

Knowledge Sources	Together Python
Domains	CLI, Infrastructure
Last Updated	2026-02-15 16:00 GMT

Overview

A CLI tool, provided by the Together Python SDK, for managing dedicated inference endpoints from the command line.

Description

The endpoints Click command group provides terminal commands for creating, listing, starting, stopping, updating, and deleting dedicated endpoints. It includes hardware listing with formatted table output, availability zone queries, and wait-for-ready functionality. Commands output endpoint IDs to stdout for scripting and status/progress to stderr.

Usage

Use these CLI commands when managing endpoints from a terminal or shell script rather than Python code. The CLI wraps the Endpoints resource API with user-friendly options and formatted output.

Code Reference

Source Location

Repository: Together Python
File: src/together/cli/api/endpoints.py
Lines: 1-552

Signature

# CLI Commands
together endpoints create --model MODEL --gpu GPU_TYPE [--gpu-count N] [--min-replicas N] [--max-replicas N] [--display-name NAME] [--wait/--no-wait]
together endpoints get ENDPOINT_ID [--json]
together endpoints list [--type dedicated|serverless] [--mine] [--json]
together endpoints start ENDPOINT_ID [--wait/--no-wait]
together endpoints stop ENDPOINT_ID [--wait/--no-wait]
together endpoints update ENDPOINT_ID [--min-replicas N] [--max-replicas N] [--display-name NAME] [--inactive-timeout N]
together endpoints delete ENDPOINT_ID
together endpoints hardware [--model MODEL] [--json] [--available]
together endpoints availability-zones [--json]

Import

# Invoked via CLI entry point
together endpoints <subcommand>

I/O Contract

Inputs

Name	Type	Required	Description
--model	str	Yes (create)	Model to deploy (e.g., meta-llama/Llama-4-Scout-17B-16E-Instruct)
--gpu	Choice	Yes (create)	GPU type: b200, h200, h100, a100, l40, l40s, rtx-6000
--gpu-count	int	No	Number of GPUs per replica (default: 1)
ENDPOINT_ID	str	Yes (get/start/stop/delete/update)	Endpoint identifier
--wait/--no-wait	bool	No	Wait for state transition to complete (default: wait)
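
Because --gpu accepts a fixed set of choices, a script can check the value before it calls the CLI. The following is a minimal sketch, not part of the SDK: is_supported_gpu is a hypothetical helper name, and the accepted values are copied from the --gpu row above, so they may drift from what the installed CLI accepts.

```shell
# is_supported_gpu GPU_TYPE
# Hypothetical pre-flight check (not part of the Together CLI).
# Returns 0 if GPU_TYPE is one of the values listed for --gpu above, 1 otherwise.
is_supported_gpu() {
  case "$1" in
    b200|h200|h100|a100|l40|l40s|rtx-6000) return 0 ;;
    *) return 1 ;;
  esac
}

# Example call (commented out so sourcing this snippet has no side effects):
# is_supported_gpu "$GPU" || { echo "unsupported GPU type: $GPU" >&2; exit 1; }
```

The CLI performs its own validation of --gpu; a local check like this only lets a script fail earlier with its own error message.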

Outputs

Name	Type	Description
stdout	str	Endpoint ID (for scripting) or formatted table/JSON
stderr	str	Status messages and progress indicators
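
The stdout/stderr split means a script can capture the endpoint ID while routing status output elsewhere. The sketch below assumes the behavior described in this table; create_endpoint_logged is a hypothetical helper name, and the log file path is whatever the caller chooses.

```shell
# create_endpoint_logged MODEL GPU LOG_FILE
# Hypothetical wrapper (not part of the Together CLI).
# The endpoint ID written to stdout is passed through to the caller;
# status and progress messages written to stderr are appended to LOG_FILE.
create_endpoint_logged() {
  together endpoints create --model "$1" --gpu "$2" 2>>"$3"
}

# Example call (commented out so sourcing this snippet has no side effects):
# ENDPOINT_ID=$(create_endpoint_logged meta-llama/Llama-4-Scout-17B-16E-Instruct h100 create.log)
```

Keeping progress messages out of stdout is what makes the command substitution safe: the captured value should contain only the endpoint ID.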

Usage Examples

Create and Manage Endpoints via CLI

# Create a dedicated endpoint
ENDPOINT_ID=$(together endpoints create --model meta-llama/Llama-4-Scout-17B-16E-Instruct --gpu h100 --gpu-count 1)

# List your endpoints
together endpoints list --type dedicated --mine

# Check endpoint details
together endpoints get "$ENDPOINT_ID" --json

# Stop the endpoint
together endpoints stop "$ENDPOINT_ID"

# List available hardware for a model
together endpoints hardware --model meta-llama/Llama-4-Scout-17B-16E-Instruct --available

# List availability zones
together endpoints availability-zones
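
For scripted use, the create and stop commands can be combined so that an endpoint is stopped once the work that needs it has finished. This is a minimal sketch under stated assumptions: run_with_endpoint is a hypothetical helper, it assumes create prints only the endpoint ID on stdout, and it redirects any stdout from stop to stderr so the caller sees only the wrapped command's output.

```shell
# run_with_endpoint MODEL GPU COMMAND [ARGS...]
# Hypothetical helper (not part of the Together CLI).
# Creates an endpoint, runs COMMAND with the endpoint ID appended as its
# last argument, then stops the endpoint whether or not COMMAND succeeded.
run_with_endpoint() {
  rwe_model="$1"
  rwe_gpu="$2"
  shift 2

  # Capture the endpoint ID from stdout; give up if create fails.
  rwe_id=$(together endpoints create --model "$rwe_model" --gpu "$rwe_gpu") || return 1
  if [ -z "$rwe_id" ]; then
    echo "create did not print an endpoint ID" >&2
    return 1
  fi

  # Run the caller's command and remember its exit status.
  "$@" "$rwe_id"
  rwe_status=$?

  # Stop the endpoint; send any stop output to stderr.
  together endpoints stop "$rwe_id" >&2

  return "$rwe_status"
}

# Example call (commented out so sourcing this snippet has no side effects):
# run_with_endpoint meta-llama/Llama-4-Scout-17B-16E-Instruct h100 together endpoints get
```

The helper returns the wrapped command's exit status, so a failure in that command is still visible to the caller after the endpoint has been stopped.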
