Implementation:Farama Foundation Gymnasium SwimmerEnv V4
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, MuJoCo_Environments |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
Concrete implementation of the Swimmer v4 MuJoCo locomotion environment provided by Gymnasium.
Description
The Swimmer v4 environment implements a multi-segment swimmer in a 2D pool using the MuJoCo bindings (mujoco >= 2.1.3). The default swimmer consists of three links connected by two rotor joints. The swimmer is suspended in a pool, and the goal is to move to the right as fast as possible by applying torque to the rotors and exploiting fluid friction. The reward is forward_reward - ctrl_cost. The Swimmer never terminates; episodes end only through truncation. The observation concatenates position values from qpos (3 angles by default, or 5 values when exclude_current_positions_from_observation=False keeps the x/y coordinates) with 5 velocities from qvel.
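The reward described above can be sketched in plain Python. This is a minimal illustration, not the environment's source: it assumes forward_reward is forward_reward_weight times the x-velocity over the step and ctrl_cost is ctrl_cost_weight times the sum of squared torques, as in the Gymnasium documentation. The function name swimmer_reward and the sample values (including dt=0.04) are hypothetical.

```python
def swimmer_reward(x_before, x_after, action, dt,
                   forward_reward_weight=1.0, ctrl_cost_weight=1e-4):
    """Return (reward, forward_reward, ctrl_cost) for a single step.

    Assumed definitions (see lead-in): velocity-based forward reward and
    a quadratic penalty on the applied torques.
    """
    # Forward progress is the change in x-position divided by the step duration.
    x_velocity = (x_after - x_before) / dt
    forward_reward = forward_reward_weight * x_velocity
    # Control cost grows with the squared magnitude of the torques.
    ctrl_cost = ctrl_cost_weight * sum(a * a for a in action)
    return forward_reward - ctrl_cost, forward_reward, ctrl_cost


# Hypothetical step: the swimmer moves 0.02 units to the right in 0.04 s.
reward, fwd, cost = swimmer_reward(0.0, 0.02, [0.5, -0.5], dt=0.04)
```

With the default weights the control cost is small relative to the forward term, so the reward is dominated by how fast the swimmer moves to the right.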
Usage
Use this environment to reproduce results from papers that used Swimmer-v4. For new research, consider Swimmer-v5, which adds support for custom XML models, a configurable frame_skip, an observation_structure attribute, and consistent reward naming.
Code Reference
Source Location
- Repository: Farama_Foundation_Gymnasium
- File: gymnasium/envs/mujoco/swimmer_v4.py
Signature
class SwimmerEnv(MujocoEnv, utils.EzPickle):
    def __init__(
        self,
        forward_reward_weight=1.0,
        ctrl_cost_weight=1e-4,
        reset_noise_scale=0.1,
        exclude_current_positions_from_observation=True,
        **kwargs,
    ):
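Keyword arguments passed to gym.make are forwarded to the environment constructor, so the parameters in the signature can be overridden at creation time. The sketch below assumes gymnasium with the MuJoCo extras is installed; the helper make_swimmer and the values in custom_kwargs are hypothetical and chosen only for illustration.

```python
def make_swimmer(**overrides):
    """Create Swimmer-v4, forwarding keyword overrides to SwimmerEnv.__init__."""
    # Imported lazily so this module loads even where gymnasium is not installed.
    import gymnasium as gym

    return gym.make("Swimmer-v4", **overrides)


# Hypothetical configuration: keep x/y in the observation and
# penalise torque ten times more than the default 1e-4.
custom_kwargs = {
    "exclude_current_positions_from_observation": False,
    "ctrl_cost_weight": 1e-3,
}
```

Calling make_swimmer(**custom_kwargs) would then return an environment whose observation has 10 elements instead of the default 8.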
Import
import gymnasium as gym
env = gym.make("Swimmer-v4")
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| action | np.ndarray (2,) | Yes | Torques applied to the two rotor joints, range [-1, 1] |
Outputs
| Name | Type | Description |
|---|---|---|
| observation | np.ndarray (8,) | State vector: qpos (3 angles; x/y excluded by default) followed by qvel (5 velocities). Shape is (10,) when exclude_current_positions_from_observation=False |
| reward | float | forward_reward - ctrl_cost |
| terminated | bool | Always False (Swimmer never terminates) |
| truncated | bool | True when the TimeLimit wrapper ends the episode (1000 steps by default for Swimmer-v4) |
| info | dict | Contains reward_fwd, reward_ctrl, x_position, y_position, distance_from_origin, x_velocity, y_velocity, forward_reward |
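The layout of the observation row can be illustrated with a short sketch. It assumes qpos holds five values ordered as x, y, then three angles, and that qvel holds five velocities; the function build_observation and the sample numbers are hypothetical and use plain lists rather than NumPy arrays.

```python
def build_observation(qpos, qvel, exclude_current_positions_from_observation=True):
    """Concatenate positions and velocities in the order used by the observation."""
    position = list(qpos)
    velocity = list(qvel)
    if exclude_current_positions_from_observation:
        # Drop the leading x/y coordinates, keeping only the three angles.
        position = position[2:]
    return position + velocity


# Hypothetical state: x, y, three angles, then five velocities.
qpos = [0.1, -0.2, 0.01, 0.02, 0.03]
qvel = [0.5, 0.0, 0.1, 0.2, 0.3]

default_obs = build_observation(qpos, qvel)      # 8 elements
full_obs = build_observation(qpos, qvel, False)  # 10 elements
```

Excluding x/y by default keeps the observation independent of where the swimmer is in the pool; the positions remain available through info["x_position"] and info["y_position"].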
Usage Examples
import gymnasium as gym

env = gym.make("Swimmer-v4")
observation, info = env.reset(seed=42)
for _ in range(1000):
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()
env.close()