Workflow: facebookresearch/habitat-lab HITL Interactive Evaluation
| Knowledge Sources | |
|---|---|
| Domains | Embodied_AI, Human_in_the_Loop, Interactive_Evaluation |
| Last Updated | 2026-02-15 02:00 GMT |
Overview
End-to-end process for running human-in-the-loop (HITL) interactive experiments where humans collaborate with or evaluate trained embodied agents in real time using the habitat-hitl framework.
Description
This workflow covers setting up and running interactive HITL applications where human participants control avatars (or observe agents) within the Habitat simulation. The framework supports both local desktop GUI mode and headless server mode with WebSocket-based remote clients. Applications are built using an app-state-machine pattern where each state (lobby, tutorial, rearrangement, feedback) manages a distinct phase of the experiment. Session data, including human actions, agent trajectories, and task metrics, is recorded for analysis. This is the primary workflow for Habitat 3.0 human evaluation studies.
Usage
Execute this workflow when you need human participants to interact with trained agents in simulated environments for evaluation, data collection, or collaborative task execution. This is used for Habitat 3.0 social rearrangement studies and human preference evaluations.
Execution Steps
Step 1: Application Configuration
Compose a Hydra configuration combining the HITL framework defaults with application-specific settings. Configure the simulation environment, agent embodiments, HITL window/networking parameters, and experiment design (episode selection, number of participants, session structure).
Key considerations:
- HITL defaults configure window size, target simulation rate, and networking parameters
- Choose between headed mode (local GUI window) or headless mode (remote WebSocket server)
- Application configs define which app states to use and their transitions
- Multi-user experiments require the networking process and client management configuration
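The composition step above can be sketched in plain Python. This is an illustrative stand-in for Hydra's dotted-key override mechanism; the config keys shown (`window`, `networking`, `experiment`) are hypothetical examples, not habitat-hitl's actual schema.

```python
# Sketch of composing HITL framework defaults with experiment-specific
# overrides, in the spirit of Hydra dotted-key overrides.
# All key names below are illustrative stand-ins.
from copy import deepcopy

HITL_DEFAULTS = {
    "window": {"width": 1280, "height": 720},   # headed-mode GUI window
    "target_sps": 30,                           # target sim steps per second
    "networking": {"enable": False, "port": 8888, "max_clients": 1},
}

def compose(defaults: dict, overrides: dict) -> dict:
    """Deep-merge dotted-key overrides into a copy of the defaults."""
    cfg = deepcopy(defaults)
    for dotted, value in overrides.items():
        *parents, leaf = dotted.split(".")
        node = cfg
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return cfg

# Headless two-participant study: WebSocket server on, two clients expected.
cfg = compose(HITL_DEFAULTS, {
    "networking.enable": True,
    "networking.max_clients": 2,
    "experiment.episode_ids": [101, 102, 103],
})
```

In real Hydra usage the same effect is achieved with command-line overrides (e.g. `networking.enable=True`), so experiment variants can be launched without editing config files.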
Step 2: Environment and Agent Setup
Initialize the Habitat environment with the configured task, dataset, and agents. Load pre-trained policy checkpoints for AI-controlled agents. Configure the GUI controller for human-controlled agents with appropriate input mappings (keyboard/mouse for desktop, VR controllers for XR).
Key considerations:
- AI agents use the BaselinesController wrapper to generate actions from trained policies
- Human-controlled agents use the GuiController for translating input to Habitat actions
- Camera helpers manage first-person and third-person viewpoints
- Avatar switcher allows changing humanoid appearance at runtime
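The two controller roles can be sketched as follows. The class names echo the BaselinesController/GuiController mentioned above, but the interfaces here are simplified illustrations, not habitat-lab's real API; the key bindings are hypothetical.

```python
# Minimal sketch of the AI-vs-human controller split.
# Names and signatures are illustrative, not habitat-lab's actual API.
class BaselinesControllerSketch:
    """AI-controlled agent: maps observations to actions via a policy."""
    def __init__(self, policy):
        self._policy = policy  # e.g. the act function of a loaded checkpoint

    def act(self, observations):
        return self._policy(observations)

class GuiControllerSketch:
    """Human-controlled agent: translates GUI input into Habitat actions."""
    KEYMAP = {  # hypothetical desktop key bindings
        "w": "base_velocity_forward",
        "a": "base_velocity_turn_left",
        "d": "base_velocity_turn_right",
        "space": "grasp",
    }

    def act(self, pressed_keys):
        # Sort for a deterministic action order within a frame.
        return [self.KEYMAP[k] for k in sorted(pressed_keys) if k in self.KEYMAP]

ai_agent = BaselinesControllerSketch(policy=lambda obs: "noop")
human_agent = GuiControllerSketch()
```

The point of the split is that the rest of the app only ever calls `act(...)`, so a human can be swapped for a trained policy (or vice versa) per agent slot without changing the task loop.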
Step 3: Application State Machine Initialization
Initialize the application state machine with the sequence of states: start screen, lobby (waiting for participants), tutorial (optional animated walkthrough), main task state (rearrangement gameplay), feedback collection, and session end. Each state defines its own update logic, UI overlays, and transition conditions.
Key considerations:
- The state machine manages the experiment lifecycle from start to finish
- AppService provides dependency injection of framework services to each state
- The lobby state waits for the required number of remote clients before proceeding
- Tutorial states provide animated camera sequences showing the task environment
Step 4: Interactive Session Execution
Run the main simulation loop where the HITL driver alternates between processing user input, stepping the simulation, rendering frames, and sending updates to remote clients. The simulation runs at a configurable target frame rate. Human actions and agent observations are processed each frame.
Key considerations:
- The LabDriver manages the full Habitat environment lifecycle
- The SimDriver is a lightweight alternative for sim-only applications
- Navigation helpers provide click-to-navigate and pathfinding visualization
- Pick/place helpers manage object interaction mechanics
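The fixed-rate loop described above can be sketched as follows. The real driver does considerably more (client reconnects, frame skipping, render scheduling); this only shows the pacing shape, and `step_fn` is a hypothetical placeholder for the input/step/render work.

```python
# Sketch of a fixed-rate driver loop: do the frame's work, then sleep off
# the remaining frame budget to hold a target steps-per-second rate.
import time

def run_driver_loop(step_fn, target_sps=30, max_frames=5):
    """Call step_fn once per frame, pacing to roughly target_sps frames/sec."""
    frame_budget = 1.0 / target_sps
    for frame in range(max_frames):
        start = time.perf_counter()
        step_fn(frame)                 # input + sim step + render/send updates
        elapsed = time.perf_counter() - start
        if elapsed < frame_budget:
            time.sleep(frame_budget - elapsed)

frames_seen = []
run_driver_loop(frames_seen.append, target_sps=120, max_frames=4)
```

If a frame's work overruns its budget, this simple version just starts the next frame late; a production loop would typically skip rendering or decouple simulation stepping from network sends to catch up.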
Step 5: Data Recording and Upload
Record session data including episode outcomes, human actions, agent trajectories, task metrics, and optional video. Data is serialized to JSON and gfx-replay keyframe formats. For cloud deployments, session data can be uploaded to S3 with signed URLs for retrieval.
Key considerations:
- SessionRecorder captures per-frame action and state data
- GfxReplay keyframes enable replaying the 3D visualization offline
- Metrics helper wraps access to environment measurements for task progress tracking
- S3 upload utilities handle asynchronous data persistence for remote experiments
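Per-frame recording and JSON serialization can be sketched as below, in the spirit of the SessionRecorder described above. The field names are illustrative assumptions; the real recorder also emits gfx-replay keyframes for offline 3D playback, which this sketch omits.

```python
# Sketch of per-frame session recording serialized to JSON.
# Field names (user_actions, metrics, objects_placed) are illustrative.
import dataclasses
import json

@dataclasses.dataclass
class FrameRecord:
    frame: int
    user_actions: list
    metrics: dict

class SessionRecorderSketch:
    def __init__(self, session_id: str):
        self.session_id = session_id
        self.frames: list[FrameRecord] = []

    def record_frame(self, frame: int, user_actions: list, metrics: dict):
        self.frames.append(FrameRecord(frame, user_actions, metrics))

    def to_json(self) -> str:
        return json.dumps({
            "session_id": self.session_id,
            "frames": [dataclasses.asdict(f) for f in self.frames],
        })

rec = SessionRecorderSketch("study-042")
rec.record_frame(0, ["move_forward"], {"objects_placed": 0})
rec.record_frame(1, ["grasp"], {"objects_placed": 1})
payload = json.loads(rec.to_json())
```

For a cloud deployment, the serialized string would then be handed to an upload utility (e.g. an S3 client with a signed URL) rather than written locally; keeping serialization separate from transport makes both easy to test.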