Principle:Sgl project Sglang Generation Program Execution

From Leeroopedia
Revision as of 17:54, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/Sgl_project_Sglang_Generation_Program_Execution.md)


Knowledge Sources
Domains Frontend_DSL, Execution, LLM_Programming
Last Updated 2026-02-10 00:00 GMT

Overview

An execution mechanism that runs SGLang generation programs against a backend, supporting single execution, batch execution, and streaming modes.

Description

Generation program execution takes a function defined with @sgl.function and runs it against the configured backend. The .run() method executes a single program instance with the given arguments and returns a ProgramState containing all generated variables and the full conversation text. The .run_batch() method executes multiple instances in parallel, using a thread pool to improve throughput. Both methods accept default sampling parameters (for example temperature or max_new_tokens) that apply to all sgl.gen() calls within the program.
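How run-level defaults interact with individual sgl.gen() calls can be illustrated with a small toy model. The sketch below is plain Python, not SGLang's internals: the names DEFAULTS and resolve_sampling_params are hypothetical, and it assumes that a value set on an individual generation call takes precedence over the run-level default.

```python
# Toy model only: hypothetical names, not SGLang's implementation.
# Assumption: per-call values override run-level defaults, which in turn
# override the library-wide defaults.
DEFAULTS = {"temperature": 1.0, "max_new_tokens": 128}


def resolve_sampling_params(run_defaults, call_overrides):
    """Merge library defaults, run-level defaults, and per-call overrides."""
    params = dict(DEFAULTS)
    params.update(run_defaults)
    # A per-call value of None means "not set", so the default is kept.
    params.update({k: v for k, v in call_overrides.items() if v is not None})
    return params


# Defaults passed once at run time apply to every generation call...
run_defaults = {"temperature": 0.0}
first = resolve_sampling_params(run_defaults, {"max_new_tokens": 16})
# ...unless a specific call overrides them.
second = resolve_sampling_params(run_defaults, {"temperature": 0.7})
```

Under these assumptions, the first call inherits temperature 0.0 from the run-level defaults, while the second call keeps its own temperature and falls back to the library default for max_new_tokens.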

Usage

Use .run() to execute a single program instance and .run_batch() to process many inputs efficiently. Passing stream=True to .run() enables streaming mode, which delivers output incrementally for interactive applications.
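The difference between blocking and streaming execution can be sketched with a generator. This is a minimal illustration in plain Python: fake_backend_stream and run are hypothetical stand-ins, and the real streaming interface of ProgramState is not reproduced here.

```python
def fake_backend_stream(prompt):
    """Hypothetical backend that yields the completion chunk by chunk."""
    for chunk in ["Paris", " is", " the", " capital."]:
        yield chunk


def run(prompt, stream=False):
    """Return the full text, or an iterator of chunks when stream=True."""
    chunks = fake_backend_stream(prompt)
    if stream:
        return chunks        # caller consumes chunks as they arrive
    return "".join(chunks)   # caller waits for the complete text


prompt = "Q: What is the capital of France?\nA: "
full_text = run(prompt)
streamed = list(run(prompt, stream=True))
```

In an interactive application, the streaming branch would print or forward each chunk as soon as it is produced instead of collecting them into a list.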

Theoretical Basis

Execution follows a compile, dispatch, and collect pattern:

  1. Compile: The @sgl.function body is traced to build an execution graph
  2. Dispatch: The graph is sent to the backend for execution
  3. Collect: Results (generated text, variables) are collected into ProgramState
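The three steps above can be sketched in a few lines of plain Python. Everything in this example is a hypothetical toy (Gen, Tracer, State, fake_backend, execute); it mirrors the shape of the pattern rather than SGLang's actual tracer or ProgramState.

```python
from dataclasses import dataclass, field


@dataclass
class Gen:
    """Placeholder for a generation call that stores its output under `name`."""
    name: str


class Tracer:
    """Records every `s += ...` operation instead of executing it."""

    def __init__(self):
        self.ops = []

    def __iadd__(self, item):
        self.ops.append(item)
        return self


@dataclass
class State:
    """Toy result object holding the full text and the generated variables."""
    text: str = ""
    variables: dict = field(default_factory=dict)


def fake_backend(prompt):
    """Hypothetical backend: always answers with the same string."""
    return "42"


def qa_program(s, question):
    s += "Q: " + question + "\n"
    s += "A: "
    s += Gen("answer")


def execute(program, **kwargs):
    tracer = Tracer()
    program(tracer, **kwargs)          # 1. Compile: trace the body into ops
    state = State()
    for op in tracer.ops:
        if isinstance(op, Gen):
            out = fake_backend(state.text)   # 2. Dispatch: call the backend
            state.variables[op.name] = out   # 3. Collect: store the variable
            state.text += out
        else:
            state.text += op
    return state


state = execute(qa_program, question="What is 6 * 7?")
```

After execution, the toy state holds both the accumulated conversation text and the named variable, which is the same split of information the page attributes to ProgramState.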

Batch execution uses thread pools to maximize throughput:

  • Each program instance runs in its own thread
  • The backend handles request batching at the server level
  • With the default setting (num_threads="auto"), the thread count is chosen automatically, taking the batch size into account
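A thread-pool batch runner along these lines might look like the sketch below. It uses only the Python standard library; run_one and run_batch are hypothetical stand-ins, and the "auto" heuristic (a multiple of the CPU count, capped at the batch size) is an assumption made for illustration rather than the documented rule.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_one(question):
    """Hypothetical single-program execution; a real one would call the backend."""
    return {"question": question, "answer": question.upper()}


def run_batch(batch_arguments, num_threads="auto"):
    """Run one program instance per argument dict using a thread pool."""
    if not batch_arguments:
        return []
    if num_threads == "auto":
        # Assumed heuristic: threads mostly wait on I/O, so oversubscribe CPUs.
        num_threads = (os.cpu_count() or 1) * 4
    # More threads than inputs would be wasted, so cap at the batch size.
    num_threads = max(1, min(num_threads, len(batch_arguments)))
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map returns results in input order, whatever the finish order.
        return list(pool.map(lambda kwargs: run_one(**kwargs), batch_arguments))


results = run_batch([{"question": "a"}, {"question": "b"}, {"question": "c"}])
```

Threads suit this workload because each instance spends most of its time waiting on the backend, which can then batch the concurrent requests on the server side.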

Related Pages

Implemented By
