Implementation:Togethercomputer Together python Endpoints CLI
| Knowledge Sources | |
|---|---|
| Domains | CLI, Infrastructure |
| Last Updated | 2026-02-15 16:00 GMT |
Overview
A concrete CLI tool, provided by the Together Python SDK, for managing dedicated inference endpoints from the command line.
Description
The endpoints Click command group provides terminal commands for creating, listing, starting, stopping, updating, and deleting dedicated endpoints. It also lists hardware as a formatted table, queries availability zones, and can wait for an endpoint to become ready. Commands write endpoint IDs to stdout so that scripts can capture them, and write status and progress messages to stderr.
Usage
Use these CLI commands to manage endpoints from a terminal or shell script instead of from Python code. The CLI wraps the Endpoints resource API with user-friendly options and formatted output.
Code Reference
Source Location
- Repository: Together Python
- File: src/together/cli/api/endpoints.py
- Lines: 1-552
Signature
# CLI Commands
together endpoints create --model MODEL --gpu GPU_TYPE [--gpu-count N] [--min-replicas N] [--max-replicas N] [--display-name NAME] [--wait/--no-wait]
together endpoints get ENDPOINT_ID [--json]
together endpoints list [--type dedicated|serverless] [--mine] [--json]
together endpoints start ENDPOINT_ID [--wait/--no-wait]
together endpoints stop ENDPOINT_ID [--wait/--no-wait]
together endpoints update ENDPOINT_ID [--min-replicas N] [--max-replicas N] [--display-name NAME] [--inactive-timeout N]
together endpoints delete ENDPOINT_ID
together endpoints hardware [--model MODEL] [--json] [--available]
together endpoints availability-zones [--json]
Import
# Invoked via CLI entry point
together endpoints <subcommand>
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| --model | str | Yes (create) | Model to deploy (e.g., meta-llama/Llama-4-Scout-17B-16E-Instruct) |
| --gpu | Choice | Yes (create) | GPU type: b200, h200, h100, a100, l40, l40s, rtx-6000 |
| --gpu-count | int | No | Number of GPUs per replica (default: 1) |
| ENDPOINT_ID | str | Yes (get/start/stop/delete/update) | Endpoint identifier |
| --wait/--no-wait | bool | No | Wait for state transition to complete (default: wait) |
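
The `--wait` behaviour described in the table can be pictured as a polling loop. The following is a minimal sketch of that idea and not the CLI's actual implementation: `wait_for_state`, the `get_state` callable, the timeout and interval defaults, and the state names `PENDING` and `STARTED` are all assumptions made for illustration.

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],
    target: str,
    timeout: float = 600.0,
    interval: float = 5.0,
) -> bool:
    """Poll get_state() until it returns `target` or `timeout` seconds pass.

    `get_state` is a hypothetical callable that returns the endpoint's
    current state as a string. Returns True if the target state was
    reached, False if the timeout expired first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_state() == target:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Simulated state sequence (assumed state names, not taken from the CLI).
    states = iter(["PENDING", "PENDING", "STARTED"])
    reached = wait_for_state(lambda: next(states), "STARTED", timeout=5.0, interval=0.0)
    print("ready" if reached else "timed out")
```

Passing `--no-wait` corresponds to skipping a loop like this entirely and returning as soon as the request has been accepted.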
Outputs
| Name | Type | Description |
|---|---|---|
| stdout | str | Endpoint ID (for scripting) or formatted table/JSON |
| stderr | str | Status messages and progress indicators |
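
Because the ID and the status messages travel on separate streams, a calling program can capture them independently. The sketch below shows one way to do that from Python with the standard library. `run_and_split` is a hypothetical helper, and the child process is a stand-in that imitates the stdout/stderr split with a made-up ID, so the example runs without the `together` CLI installed.

```python
import subprocess
import sys
from typing import List, Tuple


def run_and_split(argv: List[str]) -> Tuple[str, str]:
    """Run a command and return (stdout, stderr) as stripped text.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip(), result.stderr.strip()


if __name__ == "__main__":
    # Stand-in for a `together endpoints ...` call: prints a made-up ID to
    # stdout and a status line to stderr, mirroring the contract above.
    fake_cli = [
        sys.executable,
        "-c",
        "import sys; print('endpoint-123'); print('Creating...', file=sys.stderr)",
    ]
    endpoint_id, status = run_and_split(fake_cli)
    print(endpoint_id)
```

In a real script, `fake_cli` would be replaced by an argument list such as `["together", "endpoints", "create", ...]`, assuming the CLI is on `PATH`.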
Usage Examples
Create and Manage Endpoints via CLI
# Create a dedicated endpoint
ENDPOINT_ID=$(together endpoints create --model meta-llama/Llama-4-Scout-17B-16E-Instruct --gpu h100 --gpu-count 1)
# List your endpoints
together endpoints list --type dedicated --mine
# Check endpoint details
together endpoints get "$ENDPOINT_ID" --json
# Stop the endpoint
together endpoints stop "$ENDPOINT_ID"
# List available hardware for a model
together endpoints hardware --model meta-llama/Llama-4-Scout-17B-16E-Instruct --available
# List availability zones
together endpoints availability-zones
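
When the CLI is driven from a Python script (for example in CI), it can help to assemble the `create` argument list in one place and validate it before launching the process. The helper below is a hypothetical sketch: `build_create_argv` is not part of the SDK, and it uses only the flags and GPU choices listed in this page.

```python
from typing import List

# GPU types as listed in the Inputs table of this page.
GPU_CHOICES = ("b200", "h200", "h100", "a100", "l40", "l40s", "rtx-6000")


def build_create_argv(
    model: str, gpu: str, gpu_count: int = 1, wait: bool = True
) -> List[str]:
    """Build the argument list for `together endpoints create`.

    Hypothetical helper: it only assembles and validates arguments and
    does not run anything.
    """
    if gpu not in GPU_CHOICES:
        raise ValueError(f"unsupported GPU type: {gpu!r}")
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    argv = [
        "together", "endpoints", "create",
        "--model", model,
        "--gpu", gpu,
        "--gpu-count", str(gpu_count),
    ]
    argv.append("--wait" if wait else "--no-wait")
    return argv


if __name__ == "__main__":
    print(build_create_argv("meta-llama/Llama-4-Scout-17B-16E-Instruct", "h100"))
```

The resulting list can be passed to `subprocess.run` directly, which avoids shell quoting issues with model names.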