Implementation:Farama Foundation Gymnasium Walker2dEnv V5
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, MuJoCo_Environments |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
Concrete implementation of the Walker2d v5 MuJoCo locomotion environment provided by Gymnasium.
Description
The Walker2d environment builds on the Hopper environment by adding a second leg, allowing the robot to walk forward instead of hop. The walker is a two-dimensional bipedal robot consisting of seven main body parts: a single torso at the top, two thighs, two legs below the thighs, and two feet. The goal is to walk in the forward (right) direction by applying torque to the six hinges connecting the body parts. The reward function combines a healthy reward, a forward reward proportional to x-velocity, and a control cost penalty on the applied torques. The v5 version equalizes the foot friction values (both feet now have friction 1.9; previously the right foot used 0.9) and fixes the healthy_reward bug, in which healthy_reward was granted on every step even when the walker was unhealthy.
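The reward described above can be sketched in a few lines of plain Python. This is an illustrative approximation, not the Gymnasium source: the function name `walker2d_reward` and its arguments are hypothetical, and the step duration `dt = 0.008` used in the example is an assumption (frame_skip of 4 times an assumed simulator timestep of 0.002).

```python
def walker2d_reward(
    x_before,
    x_after,
    action,
    dt,
    is_healthy,
    forward_reward_weight=1.0,
    ctrl_cost_weight=1e-3,
    healthy_reward=1.0,
):
    """Hypothetical helper: combine the three reward terms for one step."""
    # Forward reward: weighted x-velocity over the step.
    x_velocity = (x_after - x_before) / dt
    forward_reward = forward_reward_weight * x_velocity

    # Healthy reward: only granted while the walker is healthy (v5 behavior).
    survive_reward = healthy_reward if is_healthy else 0.0

    # Control cost: weighted sum of squared torques.
    ctrl_cost = ctrl_cost_weight * sum(a * a for a in action)

    reward = survive_reward + forward_reward - ctrl_cost
    terms = {
        "reward_forward": forward_reward,
        "reward_ctrl": -ctrl_cost,
        "reward_survive": survive_reward,
    }
    return reward, terms


# Example: the walker advances 0.008 units in one step of assumed dt = 0.008,
# with a torque of 0.5 on each of the six hinges.
reward, terms = walker2d_reward(0.0, 0.008, [0.5] * 6, dt=0.008, is_healthy=True)
```

The returned dictionary mirrors the per-term keys that the environment reports in info, which makes it easy to see how much each term contributes to the total.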
Usage
Use this environment for benchmarking RL algorithms on bipedal locomotion. It is a moderate-difficulty task that sits between the simpler Hopper and the more complex Humanoid environments. The v5 version reports the individual reward terms (reward_forward, reward_ctrl, reward_survive) and z_distance_from_origin in info.
Code Reference
Source Location
- Repository: Farama_Foundation_Gymnasium
- File: gymnasium/envs/mujoco/walker2d_v5.py
Signature
class Walker2dEnv(MujocoEnv, utils.EzPickle):
    def __init__(
        self,
        xml_file: str = "walker2d_v5.xml",
        frame_skip: int = 4,
        default_camera_config: dict[str, float | int] = DEFAULT_CAMERA_CONFIG,
        forward_reward_weight: float = 1.0,
        ctrl_cost_weight: float = 1e-3,
        healthy_reward: float = 1.0,
        terminate_when_unhealthy: bool = True,
        healthy_z_range: tuple[float, float] = (0.8, 2.0),
        healthy_angle_range: tuple[float, float] = (-1.0, 1.0),
        reset_noise_scale: float = 5e-3,
        exclude_current_positions_from_observation: bool = True,
        **kwargs,
    ):
Import
import gymnasium as gym
env = gym.make("Walker2d-v5")
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| action | np.ndarray (6,) | Yes | Torques applied to thigh, leg, foot, left thigh, left leg, and left foot joints, range [-1, 1] |
Outputs
| Name | Type | Description |
|---|---|---|
| observation | np.ndarray (17,) | State vector: qpos (8 elements, x excluded), qvel (9 elements, clipped to [-10, 10]) |
| reward | float | healthy_reward + forward_reward - ctrl_cost |
| terminated | bool | True if terminate_when_unhealthy is enabled and the walker is unhealthy (with the defaults: z outside (0.8, 2.0) or angle outside (-1, 1)) |
| truncated | bool | Episode truncation (handled by TimeLimit wrapper) |
| info | dict | Contains x_position, z_distance_from_origin, x_velocity, reward_forward, reward_ctrl, reward_survive |
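The observation layout and termination rule in the table above can be illustrated with a small standalone sketch. It is not the Gymnasium implementation: the helper names `build_observation`, `is_healthy`, and `is_terminated` are hypothetical, and the sketch assumes that qpos has 9 entries ordered as x, z, angle, followed by the six joint angles.

```python
def build_observation(qpos, qvel, exclude_x=True):
    """Hypothetical helper: assemble the 17-element observation."""
    # Drop the x position (index 0, assumed ordering) when it is excluded.
    position = list(qpos[1:]) if exclude_x else list(qpos)
    # Clip each velocity to [-10, 10].
    velocity = [min(max(v, -10.0), 10.0) for v in qvel]
    return position + velocity


def is_healthy(qpos, healthy_z_range=(0.8, 2.0), healthy_angle_range=(-1.0, 1.0)):
    """Hypothetical helper: height and angle must lie strictly inside their ranges."""
    z, angle = qpos[1], qpos[2]  # assumed ordering: x, z, angle, joints...
    min_z, max_z = healthy_z_range
    min_angle, max_angle = healthy_angle_range
    return (min_z < z < max_z) and (min_angle < angle < max_angle)


def is_terminated(qpos, terminate_when_unhealthy=True):
    """Terminate only if the flag is set and the walker is unhealthy."""
    return terminate_when_unhealthy and not is_healthy(qpos)


# Example state: upright walker with two velocities outside the clip range.
qpos = [0.0, 1.25, 0.0] + [0.0] * 6
qvel = [12.0, -15.0] + [0.0] * 7
obs = build_observation(qpos, qvel)

# Example state: a fallen walker whose height is below the healthy range.
fallen_qpos = [0.0, 0.5, 0.0] + [0.0] * 6
fallen_terminated = is_terminated(fallen_qpos)
```

With terminate_when_unhealthy set to False, the same fallen state would not end the episode, and only the TimeLimit truncation would apply.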
Usage Examples
import gymnasium as gym

env = gym.make("Walker2d-v5")
observation, info = env.reset(seed=42)

for _ in range(1000):
    # Sample a random action; replace with a trained policy in practice.
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)

    # Start a new episode once the walker falls or the time limit is reached.
    if terminated or truncated:
        observation, info = env.reset()

env.close()