Implementation:Google deepmind Dm control Suite Fish
| Metadata | Value |
|---|---|
| Implementation | Suite Fish |
| Domain | Reinforcement_Learning, Control |
| Source | Google_deepmind_Dm_control |
| Last Updated | 2026-02-15 04:00 GMT |
Overview
Concrete tool for controlling a 3D fish to remain upright or swim to a target provided by the dm_control Control Suite.
Description
The Fish domain simulates a 3D fish with a segmented tail and two fins operating in a fluid-like environment. The fish has seven actuated joints: two tail segments, a tail twist, and roll/pitch controls for the left and right fins. The Physics subclass provides methods for reading the torso uprightness (z-axis projection), torso velocity from sensors, joint angles and velocities, and the vector from the fish's mouth to a target position in the mouth's local coordinate frame.
Two benchmarking tasks are registered: upright and swim. The Upright task initializes the fish with a random orientation (random quaternion) and random joint angles in [-0.2, 0.2], hides the target geometry, and rewards the agent with a smooth tolerance-based signal for keeping the torso z-axis aligned with the world z-axis. The Swim task additionally randomizes a target position in the workspace and provides a composite reward that weighs proximity to the target (7/8) and uprightness (1/8).
Both tasks use a default time limit of 40 seconds and a control timestep of 0.04 seconds.
Usage
Use this implementation for 3D locomotion and orientation control benchmarks in a fluid environment. Load via suite.load(domain_name='fish', task_name='upright') or suite.load(domain_name='fish', task_name='swim').
Code Reference
Source Location
- Repository: Google_deepmind_Dm_control
- File: dm_control/suite/fish.py
- Lines: 1-172
Signature
# Task factory functions
def upright(time_limit=40, random=None, environment_kwargs=None)
def swim(time_limit=40, random=None, environment_kwargs=None)
# Physics subclass
class Physics(mujoco.Physics):
def upright(self) # z-axis projection of torso to world z
def torso_velocity(self) # velocities and angular velocities from sensors
def joint_velocities(self) # velocities of the 7 actuated joints
def joint_angles(self) # positions of the 7 actuated joints
def mouth_to_target(self) # vector from mouth to target in mouth frame
# Task classes
class Upright(base.Task):
def initialize_episode(self, physics)
def get_observation(self, physics)
def get_reward(self, physics)
class Swim(base.Task):
def initialize_episode(self, physics)
def get_observation(self, physics)
def get_reward(self, physics)
Import
from dm_control import suite
env = suite.load(domain_name='fish', task_name='swim')
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
time_limit |
float | No | Maximum episode duration in seconds (default 40). |
random |
int, numpy.random.RandomState, or None | No | Random seed or RNG instance for reproducibility. |
environment_kwargs |
dict or None | No | Additional keyword arguments forwarded to the Environment constructor.
|
Outputs
| Name | Type | Description |
|---|---|---|
| environment | dm_control.rl.control.Environment |
A fully initialised environment conforming to the dm_env.Environment interface.
|
Observations
| Key | Type | Description |
|---|---|---|
joint_angles |
numpy array (7,) | Positions of the 7 actuated joints. |
upright |
float | Projection from torso z-axis to world z-axis. |
velocity |
numpy array | Joint velocities. |
target |
numpy array (3,) | Vector from mouth to target in mouth local frame (Swim task only). |
Usage Examples
from dm_control import suite
# Load the upright task
env = suite.load(domain_name='fish', task_name='upright')
# Run an episode
time_step = env.reset()
while not time_step.last():
action = env.action_spec().generate_value()
time_step = env.step(action)
# Load the swim task
env_swim = suite.load(domain_name='fish', task_name='swim')