Implementation:Farama Foundation Gymnasium HopperEnv V4
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, MuJoCo_Environments |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
Concrete implementation of the Hopper v4 MuJoCo locomotion environment provided by Gymnasium.
Description
The Hopper v4 environment implements the 2D one-legged hopper using the MuJoCo Python bindings (mujoco >= 2.1.3). The hopper has four body parts: torso, thigh, leg, and foot. The goal is to hop forward by applying torques to the three hinge joints that connect them. The reward is the sum of a forward-velocity term and a healthy reward, minus a control cost. With the default terminate_when_unhealthy=True, the episode terminates when the hopper becomes unhealthy, that is, when its height, angle, or other state values leave the ranges set by healthy_z_range, healthy_angle_range, and healthy_state_range. This is a legacy version; v5 is recommended for new projects. Note that v4 has a known issue: healthy_reward is given on every step, even when the hopper is unhealthy.
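The reward described above can be sketched in plain Python. This is an illustrative sketch, not the library's code: the function name `hopper_v4_reward` is hypothetical, and it assumes the control cost is the weighted sum of squared action components and that `x_velocity` is the torso's forward velocity over one step. The default weights are taken from the constructor signature.

```python
from typing import Sequence


def hopper_v4_reward(
    x_velocity: float,
    action: Sequence[float],
    forward_reward_weight: float = 1.0,
    ctrl_cost_weight: float = 1e-3,
    healthy_reward: float = 1.0,
) -> float:
    """Illustrative reward: forward term + healthy bonus - control cost."""
    forward_reward = forward_reward_weight * x_velocity
    # Assumption: the control cost is the weighted sum of squared torques.
    ctrl_cost = ctrl_cost_weight * sum(a * a for a in action)
    # Mirrors the v4 behavior noted above: the healthy bonus is added
    # on every step, regardless of whether the hopper is healthy.
    return forward_reward + healthy_reward - ctrl_cost


if __name__ == "__main__":
    print(hopper_v4_reward(x_velocity=0.5, action=[1.0, -1.0, 0.5]))
```

With the default weights, the control cost is small relative to the healthy bonus, so a policy that merely stays upright already collects most of the per-step reward.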
Usage
Use this environment to reproduce results from papers that specifically used Hopper-v4. For new research, consider Hopper-v5, which fixes the healthy_reward bug and provides more detailed info dictionaries.
Code Reference
Source Location
- Repository: Farama_Foundation_Gymnasium
- File: gymnasium/envs/mujoco/hopper_v4.py
Signature
class HopperEnv(MujocoEnv, utils.EzPickle):
    def __init__(
        self,
        forward_reward_weight=1.0,
        ctrl_cost_weight=1e-3,
        healthy_reward=1.0,
        terminate_when_unhealthy=True,
        healthy_state_range=(-100.0, 100.0),
        healthy_z_range=(0.7, float("inf")),
        healthy_angle_range=(-0.2, 0.2),
        reset_noise_scale=5e-3,
        exclude_current_positions_from_observation=True,
        **kwargs,
    ):
        ...
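The three healthy_* ranges in the signature suggest a simple bounds check. The following is a minimal sketch of such a check under stated assumptions: the function name `is_healthy` and its argument layout are hypothetical, the bounds are treated as strict (exclusive), and `other_state` stands for whichever remaining state values the environment compares against healthy_state_range.

```python
import math
from typing import Sequence, Tuple

Range = Tuple[float, float]


def is_healthy(
    z: float,
    angle: float,
    other_state: Sequence[float],
    healthy_state_range: Range = (-100.0, 100.0),
    healthy_z_range: Range = (0.7, math.inf),
    healthy_angle_range: Range = (-0.2, 0.2),
) -> bool:
    """Return True when height, angle, and other state values are in range."""

    def strictly_inside(value: float, bounds: Range) -> bool:
        # Assumption: both bounds are exclusive.
        low, high = bounds
        return low < value < high

    return (
        all(strictly_inside(s, healthy_state_range) for s in other_state)
        and strictly_inside(z, healthy_z_range)
        and strictly_inside(angle, healthy_angle_range)
    )


if __name__ == "__main__":
    print(is_healthy(z=1.25, angle=0.0, other_state=[0.0] * 9))
    print(is_healthy(z=0.5, angle=0.0, other_state=[0.0] * 9))
```

Widening healthy_angle_range or lowering the lower bound of healthy_z_range makes episodes last longer, at the cost of admitting states from which the hopper is unlikely to recover.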
Import
import gymnasium as gym
env = gym.make("Hopper-v4")
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| action | np.ndarray (3,) | Yes | Torques applied to thigh, leg, and foot hinge joints, range [-1, 1] |
Outputs
| Name | Type | Description |
|---|---|---|
| observation | np.ndarray (11,) | State vector: qpos (5 elements, x excluded), qvel (6 elements, clipped to [-10, 10]) |
| reward | float | forward_reward + healthy_reward - ctrl_cost |
| terminated | bool | True if the hopper is unhealthy and terminate_when_unhealthy is True |
| truncated | bool | Episode truncation (handled by TimeLimit wrapper) |
| info | dict | Contains x_position, x_velocity |
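The observation layout in the table above can be illustrated as follows. This is a sketch rather than the environment's own code: the helper name `build_observation` and the `vel_limit` parameter are hypothetical, and plain lists stand in for the NumPy arrays the environment returns. It assumes the full position vector has six entries, with the x coordinate first.

```python
from typing import List, Sequence


def build_observation(
    qpos: Sequence[float],
    qvel: Sequence[float],
    exclude_current_positions_from_observation: bool = True,
    vel_limit: float = 10.0,
) -> List[float]:
    """Concatenate positions (optionally without x) and clipped velocities."""
    # Dropping the first entry removes the x coordinate.
    if exclude_current_positions_from_observation:
        position = list(qpos[1:])
    else:
        position = list(qpos)
    # Clip each velocity to [-vel_limit, vel_limit].
    velocity = [max(-vel_limit, min(vel_limit, v)) for v in qvel]
    return position + velocity


if __name__ == "__main__":
    qpos = [0.3, 1.25, 0.0, 0.1, -0.1, 0.05]
    qvel = [12.0, -15.0, 0.5, 0.0, 1.0, -1.0]
    print(len(build_observation(qpos, qvel)))
```

Because x is excluded by default, the policy cannot see how far the hopper has travelled; the x_position entry in info is the place to read that value.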
Usage Examples
import gymnasium as gym

env = gym.make("Hopper-v4")
observation, info = env.reset(seed=42)

for _ in range(1000):
    # Sample a random action from the 3-dimensional action space.
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)

    # Start a new episode once the current one has ended.
    if terminated or truncated:
        observation, info = env.reset()

env.close()
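When comparing runs, it can help to record the return of each episode and whether it ended by termination or truncation. The sketch below shows one way to do this against the reset/step interface used above. The helper `run_episode` and the stand-in `CountdownEnv` class are hypothetical and not part of Gymnasium; the stand-in exists only so the example runs without MuJoCo installed.

```python
from typing import Any, Callable, Tuple


def run_episode(
    env: Any, policy: Callable[[Any], Any], seed: int = 0
) -> Tuple[float, int, bool]:
    """Run one episode; return (total reward, steps, ended by termination)."""
    observation, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    while True:
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            return total_reward, steps, terminated


class CountdownEnv:
    """Hypothetical stand-in env: reward 1.0 per step, truncates at horizon."""

    def __init__(self, horizon: int = 5) -> None:
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon
        return self.t, 1.0, False, truncated, {}


if __name__ == "__main__":
    print(run_episode(CountdownEnv(horizon=5), policy=lambda obs: 0))
```

With Gymnasium and MuJoCo installed, the same helper should work with `env = gym.make("Hopper-v4")` and a policy such as `lambda obs: env.action_space.sample()`; an episode that ends with terminated set means the hopper became unhealthy, while truncated means the time limit was reached.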