Principle:OpenHands OpenHands Sandbox Command Execution

From Leeroopedia
Knowledge Sources
Domains: Cloud_Infrastructure, Runtime_Management
Last Updated: 2026-02-11 21:00 GMT

Overview

Sandbox Command Execution is the principle of executing shell commands and code inside a cloud sandbox environment: agent actions are serialized and transmitted to the sandbox, and the results are returned to the agent as observations.

Description

The core purpose of a cloud sandbox is to execute agent-generated commands in an isolated environment. OpenHands uses the command pattern to decouple action generation (by the agent) from action execution (inside the sandbox). Each agent action (e.g., run a shell command, execute Python code, edit a file) is represented as a serializable action object. The runtime transmits this action to the sandbox, where it is executed, and the result is captured as an observation object returned to the agent.
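The action/observation round trip described above can be sketched in a few lines. This is a minimal illustration, not the OpenHands implementation: the dataclasses below are simplified stand-ins that borrow the names CmdRunAction and CmdOutputObservation, and the JSON layout (an "action" tag plus "args") is an assumption made for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CmdRunAction:
    """Simplified stand-in for an agent action that runs a shell command."""
    command: str
    timeout: int = 30


@dataclass
class CmdOutputObservation:
    """Simplified stand-in for the observation handed back to the agent."""
    output: str
    exit_code: int


def action_to_json(action: CmdRunAction) -> str:
    # Tag the payload with an action type so the receiver knows what to rebuild.
    # The {"action": ..., "args": ...} layout is assumed for this sketch.
    return json.dumps({"action": "run", "args": asdict(action)})


def action_from_json(payload: str) -> CmdRunAction:
    data = json.loads(payload)
    if data["action"] != "run":
        raise ValueError(f"unsupported action type: {data['action']}")
    return CmdRunAction(**data["args"])


# The agent side creates and serializes; the sandbox side restores.
wire = action_to_json(CmdRunAction("ls -la"))
restored = action_from_json(wire)
```

Because the action is plain data, the code that creates it and the code that executes it never need to share a process, which is what makes remote execution possible.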

The four providers (Daytona, Modal, Runloop, and E2B) fall into two distinct execution strategies:

Strategy 1: Action Server HTTP Dispatch (Daytona, Modal, Runloop)

These three runtimes rely on the action execution server running inside the sandbox. The runtime serializes the action object into JSON, sends it as an HTTP POST request to the server's /execute_action endpoint, and deserializes the response into an observation. The runtimes inherit this behavior from ActionExecutionClient, whose run() and run_ipython() methods handle the HTTP communication.
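The HTTP dispatch round trip can be demonstrated with the Python standard library alone. The sketch below is a toy, not the OpenHands action server: only the /execute_action path comes from the description above, while the handler class, the send_action helper, and the payload shape (an "argv" list inside "args") are assumptions chosen to keep the example portable.

```python
import json
import subprocess
import sys
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ActionHandler(BaseHTTPRequestHandler):
    """Toy action server: runs one command per request, replies with an observation."""

    def do_POST(self):
        if self.path != "/execute_action":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        action = json.loads(self.rfile.read(length))
        # Assumed payload shape: {"action": "run", "args": {"argv": [...]}}
        proc = subprocess.run(action["args"]["argv"], capture_output=True, text=True)
        body = json.dumps(
            {"output": proc.stdout + proc.stderr, "exit_code": proc.returncode}
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging in the example


def send_action(base_url: str, action: dict) -> dict:
    """Runtime side: serialize the action, POST it, deserialize the observation."""
    request = urllib.request.Request(
        f"{base_url}/execute_action",
        data=json.dumps(action).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler keeps loopback traffic away from any configured proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=10) as response:
        return json.loads(response.read())


server = HTTPServer(("127.0.0.1", 0), ActionHandler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    observation = send_action(
        f"http://127.0.0.1:{server.server_port}",
        {"action": "run", "args": {"argv": [sys.executable, "-c", "print('hello')"]}},
    )
finally:
    server.shutdown()
    server.server_close()
```

Here the "sandbox" is just a thread in the same process. In a real deployment the server would run inside the provider's sandbox and the base URL would point at it.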

Strategy 2: Direct SDK Execution (E2B)

E2B bypasses the action server entirely. Instead, it uses the E2B SDK's native execution API through the E2BBox wrapper class. Shell commands are executed via E2BBox.execute(), which calls the E2B sandbox's process API directly and returns the exit code and output. IPython code is executed via E2BRuntime.run_ipython(), which writes code to a temporary file and runs it through the sandbox's IPython kernel.
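The shape of this direct-execution path can be sketched without the E2B SDK. In the example below, LocalBox is a hypothetical stand-in for a wrapper such as E2BBox: it runs local subprocesses instead of calling a provider API, and its run_code method uses a plain Python interpreter in place of an IPython kernel. The timeout handling (exit code -1 plus a message) is also an assumption.

```python
from __future__ import annotations

import os
import subprocess
import sys
import tempfile


class LocalBox:
    """Hypothetical wrapper; a real one would call the provider SDK instead."""

    def execute(self, argv: list[str], timeout: float = 30.0) -> tuple[int, str]:
        try:
            proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            # Assumption: surface a timeout as a non-zero exit code with a message.
            return -1, f"command timed out after {timeout} seconds"
        # Return the exit code together with combined stdout and stderr.
        return proc.returncode, proc.stdout + proc.stderr

    def run_code(self, code: str, timeout: float = 30.0) -> tuple[int, str]:
        # Write the code to a temporary file, run it, then remove the file.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
            handle.write(code)
            path = handle.name
        try:
            return self.execute([sys.executable, path], timeout)
        finally:
            os.unlink(path)


box = LocalBox()
exit_code, output = box.execute([sys.executable, "-c", "print('from the box')"])
code_exit, code_output = box.run_code("print(6 * 7)")
```

No serialization step appears here. The runtime calls the wrapper directly and only wraps the returned tuple in an observation afterwards.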

Both strategies produce the same result from the agent's perspective: an observation object containing the command output, exit code, and any error information.

Usage

Sandbox Command Execution is the primary runtime operation. It is invoked every time the agent produces an action that requires execution in the sandbox, which occurs repeatedly throughout an agent session. The orchestrator calls runtime.run(action) or runtime.run_ipython(action) without needing to know which execution strategy is used.
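A short sketch shows why the orchestrator does not need to know which strategy is in use. All classes below are illustrative placeholders rather than OpenHands code: the two runtime classes return canned observations where a real runtime would send an HTTP request or call a provider SDK.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class CmdRunAction:
    command: str


@dataclass
class CmdOutputObservation:
    output: str
    exit_code: int


class Runtime(ABC):
    """Shared interface: every runtime turns an action into an observation."""

    @abstractmethod
    def run(self, action: CmdRunAction) -> CmdOutputObservation: ...


class HttpDispatchRuntime(Runtime):
    """Strategy 1 placeholder: a real runtime would POST to the action server."""

    def run(self, action: CmdRunAction) -> CmdOutputObservation:
        return CmdOutputObservation(f"[http] ran: {action.command}", 0)


class DirectSdkRuntime(Runtime):
    """Strategy 2 placeholder: a real runtime would call the provider SDK."""

    def run(self, action: CmdRunAction) -> CmdOutputObservation:
        return CmdOutputObservation(f"[sdk] ran: {action.command}", 0)


def run_session(runtime: Runtime, commands: list[str]) -> list[CmdOutputObservation]:
    """Orchestrator loop: the same code path for either strategy."""
    return [runtime.run(CmdRunAction(command)) for command in commands]


http_results = run_session(HttpDispatchRuntime(), ["ls -la", "pwd"])
sdk_results = run_session(DirectSdkRuntime(), ["ls -la", "pwd"])
```

Swapping providers then amounts to constructing a different Runtime subclass; run_session itself stays unchanged.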

Theoretical Basis

The command pattern separates the requester of an action from its executor. Actions are first-class objects that can be serialized, transmitted, and executed remotely.

COMMAND PATTERN FLOW:
    Agent -> Action Object -> Runtime -> Sandbox -> Observation Object -> Agent

STRATEGY 1: HTTP Dispatch (Daytona, Modal, Runloop)
    Agent creates CmdRunAction("ls -la")
    Runtime serializes action to JSON
    Runtime sends HTTP POST /execute_action with JSON body
    Action server inside sandbox deserializes action
    Action server executes "ls -la" in sandbox shell
    Action server serializes CmdOutputObservation(output, exit_code)
    Runtime receives HTTP response and deserializes observation
    Agent receives CmdOutputObservation

STRATEGY 2: Direct SDK Execution (E2B)
    Agent creates CmdRunAction("ls -la")
    Runtime calls E2BBox.execute("ls -la", timeout)
    E2BBox calls e2b_sandbox.process.start("ls -la")
    E2BBox waits for process completion
    E2BBox returns (exit_code, stdout + stderr)
    Runtime wraps result in CmdOutputObservation
    Agent receives CmdOutputObservation

COMMON CONTRACT:
    INPUT:  Action object (CmdRunAction, IPythonRunCellAction, ...)
    OUTPUT: Observation object (CmdOutputObservation, IPythonRunCellObservation, ...)
    INVARIANT: exit_code == 0 indicates success
    INVARIANT: output contains combined stdout/stderr

The key design insight is that the two strategies are interchangeable at the runtime interface level. The agent and orchestrator never need to distinguish between HTTP-dispatched and SDK-executed commands.
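The two invariants of the common contract can be written down as small helpers. This is a sketch under stated assumptions: the CmdOutputObservation dataclass is a simplified stand-in, and build_observation, succeeded, and summarize are hypothetical helper names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class CmdOutputObservation:
    output: str     # combined stdout/stderr
    exit_code: int  # 0 indicates success


def build_observation(exit_code: int, stdout: str, stderr: str) -> CmdOutputObservation:
    # Invariant: output carries stdout and stderr combined into one string.
    return CmdOutputObservation(output=stdout + stderr, exit_code=exit_code)


def succeeded(observation: CmdOutputObservation) -> bool:
    # Invariant: an exit code of 0 is the only success signal.
    return observation.exit_code == 0


def summarize(observation: CmdOutputObservation) -> str:
    status = "ok" if succeeded(observation) else f"failed (exit {observation.exit_code})"
    return f"{status}: {observation.output.strip()}"


ok_summary = summarize(build_observation(0, "done\n", ""))
error_summary = summarize(build_observation(2, "", "No such file\n"))
```

Code that consumes observations can depend on these two properties alone, regardless of how the command reached the sandbox.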
