Workflow: facebookresearch/habitat-lab HITL Interactive Evaluation
| Knowledge Sources | |
|---|---|
| Domains | Embodied_AI, Human_in_the_Loop, Interactive_Evaluation |
| Last Updated | 2026-02-15 02:00 GMT |
Overview
End-to-end process for running human-in-the-loop (HITL) interactive experiments where humans collaborate with or evaluate trained embodied agents in real time using the habitat-hitl framework.
Description
This workflow covers setting up and running interactive HITL applications where human participants control avatars (or observe agents) within the Habitat simulation. The framework supports both local desktop GUI mode and headless server mode with WebSocket-based remote clients. Applications are built using an app-state-machine pattern where each state (lobby, tutorial, rearrangement, feedback) manages a distinct phase of the experiment. Session data, including human actions, agent trajectories, and task metrics, is recorded for analysis. This is the primary workflow for Habitat 3.0 human evaluation studies.
Usage
Execute this workflow when you need human participants to interact with trained agents in simulated environments for evaluation, data collection, or collaborative task execution. This is used for Habitat 3.0 social rearrangement studies and human preference evaluations.
Execution Steps
Step 1: Application Configuration
Compose a Hydra configuration combining the HITL framework defaults with application-specific settings. Configure the simulation environment, agent embodiments, HITL window/networking parameters, and experiment design (episode selection, number of participants, session structure).
Key considerations:
- HITL defaults configure window size, target simulation rate, and networking parameters
- Choose between headed mode (local GUI window) or headless mode (remote WebSocket server)
- Application configs define which app states to use and their transitions
- Multi-user experiments require the networking process and client management configuration
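The composition step above can be sketched in plain Python. This is an illustrative stand-in for Hydra's dotted-key override mechanism; the config keys shown (`window`, `networking`, `experiment`) are hypothetical examples, not habitat-hitl's actual schema.

```python
# Sketch of composing HITL framework defaults with experiment-specific
# overrides, in the spirit of Hydra dotted-key overrides.
# All key names below are illustrative stand-ins.
from copy import deepcopy

HITL_DEFAULTS = {
    "window": {"width": 1280, "height": 720},   # headed-mode GUI window
    "target_sps": 30,                           # target sim steps per second
    "networking": {"enable": False, "port": 8888, "max_clients": 1},
}

def compose(defaults: dict, overrides: dict) -> dict:
    """Deep-merge dotted-key overrides into a copy of the defaults."""
    cfg = deepcopy(defaults)
    for dotted, value in overrides.items():
        *parents, leaf = dotted.split(".")
        node = cfg
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return cfg

# Headless two-participant study: WebSocket server on, two clients expected.
cfg = compose(HITL_DEFAULTS, {
    "networking.enable": True,
    "networking.max_clients": 2,
    "experiment.episode_ids": [101, 102, 103],
})
```

In real Hydra usage the same effect is achieved with command-line overrides (e.g. `networking.enable=True`), so experiment variants can be launched without editing config files.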
Step 2: Environment and Agent Setup
Initialize the Habitat environment with the configured task, dataset, and agents. Load pre-trained policy checkpoints for AI-controlled agents. Configure the GUI controller for human-controlled agents with appropriate input mappings (keyboard/mouse for desktop, VR controllers for XR).
Key considerations:
- AI agents use the BaselinesController wrapper to generate actions from trained policies
- Human-controlled agents use the GuiController for translating input to Habitat actions
- Camera helpers manage first-person and third-person viewpoints
- Avatar switcher allows changing humanoid appearance at runtime
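The two controller roles can be sketched as follows. The class names echo the BaselinesController/GuiController mentioned above, but the interfaces here are simplified illustrations, not habitat-lab's real API; the key bindings are hypothetical.

```python
# Minimal sketch of the AI-vs-human controller split.
# Names and signatures are illustrative, not habitat-lab's actual API.
class BaselinesControllerSketch:
    """AI-controlled agent: maps observations to actions via a policy."""
    def __init__(self, policy):
        self._policy = policy  # e.g. the act function of a loaded checkpoint

    def act(self, observations):
        return self._policy(observations)

class GuiControllerSketch:
    """Human-controlled agent: translates GUI input into Habitat actions."""
    KEYMAP = {  # hypothetical desktop key bindings
        "w": "base_velocity_forward",
        "a": "base_velocity_turn_left",
        "d": "base_velocity_turn_right",
        "space": "grasp",
    }

    def act(self, pressed_keys):
        # Sort for a deterministic action order within a frame.
        return [self.KEYMAP[k] for k in sorted(pressed_keys) if k in self.KEYMAP]

ai_agent = BaselinesControllerSketch(policy=lambda obs: "noop")
human_agent = GuiControllerSketch()
```

The point of the split is that the rest of the app only ever calls `act(...)`, so a human can be swapped for a trained policy (or vice versa) per agent slot without changing the task loop.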
Step 3: Application State Machine Initialization
Initialize the application state machine with the sequence of states: start screen, lobby (waiting for participants), tutorial (optional animated walkthrough), main task state (rearrangement gameplay), feedback collection, and session end. Each state defines its own update logic, UI overlays, and transition conditions.
Key considerations:
- The state machine manages the experiment lifecycle from start to finish
- AppService provides dependency injection of framework services to each state
- The lobby state waits for the required number of remote clients before proceeding
- Tutorial states provide animated camera sequences showing the task environment
Step 4: Interactive Session Execution
Run the main simulation loop where the HITL driver alternates between processing user input, stepping the simulation, rendering frames, and sending updates to remote clients. The simulation runs at a configurable target frame rate. Human actions and agent observations are processed each frame.
Key considerations:
- The LabDriver manages the full Habitat environment lifecycle
- The SimDriver is a lightweight alternative for sim-only applications
- Navigation helpers provide click-to-navigate and pathfinding visualization
- Pick/place helpers manage object interaction mechanics
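The fixed-rate loop described above can be sketched as follows. The real driver does considerably more (client reconnects, frame skipping, render scheduling); this only shows the pacing shape, and `step_fn` is a hypothetical placeholder for the input/step/render work.

```python
# Sketch of a fixed-rate driver loop: do the frame's work, then sleep off
# the remaining frame budget to hold a target steps-per-second rate.
import time

def run_driver_loop(step_fn, target_sps=30, max_frames=5):
    """Call step_fn once per frame, pacing to roughly target_sps frames/sec."""
    frame_budget = 1.0 / target_sps
    for frame in range(max_frames):
        start = time.perf_counter()
        step_fn(frame)                 # input + sim step + render/send updates
        elapsed = time.perf_counter() - start
        if elapsed < frame_budget:
            time.sleep(frame_budget - elapsed)

frames_seen = []
run_driver_loop(frames_seen.append, target_sps=120, max_frames=4)
```

If a frame's work overruns its budget, this simple version just starts the next frame late; a production loop would typically skip rendering or decouple simulation stepping from network sends to catch up.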
Step 5: Data Recording and Upload
Record session data including episode outcomes, human actions, agent trajectories, task metrics, and optional video. Data is serialized to JSON and gfx-replay keyframe formats. For cloud deployments, session data can be uploaded to S3 with signed URLs for retrieval.
Key considerations:
- SessionRecorder captures per-frame action and state data
- GfxReplay keyframes enable replaying the 3D visualization offline
- Metrics helper wraps access to environment measurements for task progress tracking
- S3 upload utilities handle asynchronous data persistence for remote experiments
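Per-frame recording and JSON serialization can be sketched as below, in the spirit of the SessionRecorder described above. The field names are illustrative assumptions; the real recorder also emits gfx-replay keyframes for offline 3D playback, which this sketch omits.

```python
# Sketch of per-frame session recording serialized to JSON.
# Field names (user_actions, metrics, objects_placed) are illustrative.
import dataclasses
import json

@dataclasses.dataclass
class FrameRecord:
    frame: int
    user_actions: list
    metrics: dict

class SessionRecorderSketch:
    def __init__(self, session_id: str):
        self.session_id = session_id
        self.frames: list[FrameRecord] = []

    def record_frame(self, frame: int, user_actions: list, metrics: dict):
        self.frames.append(FrameRecord(frame, user_actions, metrics))

    def to_json(self) -> str:
        return json.dumps({
            "session_id": self.session_id,
            "frames": [dataclasses.asdict(f) for f in self.frames],
        })

rec = SessionRecorderSketch("study-042")
rec.record_frame(0, ["move_forward"], {"objects_placed": 0})
rec.record_frame(1, ["grasp"], {"objects_placed": 1})
payload = json.loads(rec.to_json())
```

For a cloud deployment, the serialized string would then be handed to an upload utility (e.g. an S3 client with a signed URL) rather than written locally; keeping serialization separate from transport makes both easy to test.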