Implementation:Farama Foundation Gymnasium HalfCheetahEnv V5

From Leeroopedia
Knowledge Sources
Domains Reinforcement_Learning, MuJoCo_Environments
Last Updated 2026-02-15 03:00 GMT

Overview

Concrete implementation of the HalfCheetah v5 MuJoCo locomotion environment provided by Gymnasium.

Description

The HalfCheetah v5 environment is based on P. Wawrzynski's work in "A Cat-Like Robot Real-Time Learning to Run". The cheetah is a 2-dimensional robot consisting of 9 body parts connected by 8 joints. The goal is to apply torque to 6 of those joints so that the cheetah runs forward as fast as possible. The reward is forward_reward - ctrl_cost, where forward_reward is proportional to the x-velocity and ctrl_cost penalizes large actions. The HalfCheetah never terminates; episodes end only through truncation. Compared with earlier versions, v5 adds support for custom XML files, a configurable frame_skip, observation_structure, and a non-empty info dictionary on reset.
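The reward described above can be written out in a few lines. The following is a minimal sketch, not the library's source: the function name `half_cheetah_reward` is hypothetical, and the quadratic form of the control cost, the sign convention of `reward_ctrl`, and the default `dt = 0.05` (a frame_skip of 5 times an assumed 0.01 s simulation timestep) are assumptions. The weights default to the values shown in the constructor signature.

```python
from typing import Sequence


def half_cheetah_reward(
    x_before: float,
    x_after: float,
    action: Sequence[float],
    dt: float = 0.05,  # assumption: frame_skip (5) * simulation timestep (0.01 s)
    forward_reward_weight: float = 1.0,
    ctrl_cost_weight: float = 0.1,
) -> tuple[float, dict[str, float]]:
    """Return (reward, info) for one step of a HalfCheetah-like task."""
    # Forward velocity along x over one environment step.
    x_velocity = (x_after - x_before) / dt
    forward_reward = forward_reward_weight * x_velocity
    # Assumed quadratic penalty on the action vector.
    ctrl_cost = ctrl_cost_weight * sum(a * a for a in action)
    reward = forward_reward - ctrl_cost
    # Assumed convention: the control term is reported as a negative reward.
    info = {"reward_forward": forward_reward, "reward_ctrl": -ctrl_cost}
    return reward, info


# Moving 0.1 units forward in one step while applying 0.5 to every joint.
reward, info = half_cheetah_reward(0.0, 0.1, [0.5] * 6)
print(reward, info)
```

Under these assumptions, raising ctrl_cost_weight trades speed for smoother, lower-magnitude actions, while raising forward_reward_weight does the opposite.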

Usage

Use this environment to benchmark continuous-control RL algorithms on a standard locomotion task. The HalfCheetah is one of the most commonly used MuJoCo benchmarks. The v5 version is recommended for new research because it uses consistent reward naming (reward_forward instead of reward_run) and is more configurable.
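Logging code that has to work with both the v5 naming and the older reward_run naming can read whichever key is present. This is a minimal sketch; the helper name `forward_reward_from_info` and the sample dictionaries are hypothetical and only illustrate the rename mentioned above.

```python
from typing import Mapping


def forward_reward_from_info(info: Mapping[str, float]) -> float:
    """Read the forward-progress reward term under either key name."""
    if "reward_forward" in info:  # name used by v5
        return info["reward_forward"]
    if "reward_run" in info:  # older name
        return info["reward_run"]
    raise KeyError("info has neither 'reward_forward' nor 'reward_run'")


# Hypothetical info dictionaries, for illustration only.
v5_style = {"reward_forward": 1.2, "reward_ctrl": -0.1}
old_style = {"reward_run": 0.7, "reward_ctrl": -0.1}

print(forward_reward_from_info(v5_style))
print(forward_reward_from_info(old_style))
```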

Code Reference

Source Location

Signature

class HalfCheetahEnv(MujocoEnv, utils.EzPickle):
    def __init__(
        self,
        xml_file: str = "half_cheetah.xml",
        frame_skip: int = 5,
        default_camera_config: dict[str, float | int] = DEFAULT_CAMERA_CONFIG,
        forward_reward_weight: float = 1.0,
        ctrl_cost_weight: float = 0.1,
        reset_noise_scale: float = 0.1,
        exclude_current_positions_from_observation: bool = True,
        **kwargs,
    )
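The constructor arguments above can be overridden at creation time, since gym.make forwards extra keyword arguments to the environment. The sketch below assumes that behaviour. The override values, the `make_custom_half_cheetah` helper, and the `observation_size` helper are hypothetical. The 18-element size is inferred from the I/O contract (the 17 default elements plus the excluded x position).

```python
# Hypothetical override values, chosen only for illustration.
CUSTOM_KWARGS = {
    "forward_reward_weight": 1.0,
    "ctrl_cost_weight": 0.05,
    "reset_noise_scale": 0.05,
    "exclude_current_positions_from_observation": False,
}


def make_custom_half_cheetah(**overrides):
    """Create HalfCheetah-v5 with CUSTOM_KWARGS, optionally overridden."""
    # Imported lazily so this module loads even without MuJoCo installed.
    import gymnasium as gym

    return gym.make("HalfCheetah-v5", **{**CUSTOM_KWARGS, **overrides})


def observation_size(exclude_x: bool) -> int:
    """Expected observation length: 9 velocities plus 8 or 9 positions."""
    n_qvel = 9
    n_qpos = 8 if exclude_x else 9
    return n_qpos + n_qvel


print(observation_size(CUSTOM_KWARGS["exclude_current_positions_from_observation"]))
```

Calling `make_custom_half_cheetah()` requires Gymnasium with MuJoCo support installed; the size helper runs without it.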

Import

import gymnasium as gym
env = gym.make("HalfCheetah-v5")

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| action | np.ndarray (6,) | Yes | Torques applied to back thigh, back shin, back foot, front thigh, front shin, front foot; each in the range [-1, 1] |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| observation | np.ndarray (17,) | State vector: qpos (8 elements, x excluded by default), qvel (9 elements) |
| reward | float | forward_reward - ctrl_cost |
| terminated | bool | Always False (HalfCheetah never terminates) |
| truncated | bool | Episode truncation (handled by TimeLimit wrapper) |
| info | dict | Contains x_position, x_velocity, reward_forward, reward_ctrl |
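To make the contract concrete, the default 17-element observation can be split into its position and velocity parts, and an action can be checked against the documented shape and range. This is a minimal sketch: the helper names are hypothetical, and it assumes the positions come first in the vector, followed by the velocities.

```python
from typing import Sequence

N_QPOS = 8  # positions, with the x coordinate excluded (the default)
N_QVEL = 9  # velocities


def split_observation(obs: Sequence[float]) -> tuple[list[float], list[float]]:
    """Split a default observation into (qpos_part, qvel_part)."""
    if len(obs) != N_QPOS + N_QVEL:
        raise ValueError(f"expected {N_QPOS + N_QVEL} elements, got {len(obs)}")
    # Assumed layout: positions first, then velocities.
    return list(obs[:N_QPOS]), list(obs[N_QPOS:])


def is_valid_action(action: Sequence[float], low: float = -1.0, high: float = 1.0) -> bool:
    """Check that an action has 6 torques, each within [low, high]."""
    return len(action) == 6 and all(low <= a <= high for a in action)


qpos_part, qvel_part = split_observation([0.0] * 17)
print(len(qpos_part), len(qvel_part))
print(is_valid_action([0.5, -0.5, 1.0, -1.0, 0.0, 0.2]))
```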

Usage Examples

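import gymnasium as gym

env = gym.make("HalfCheetah-v5")
# Seed the first reset for reproducibility.
observation, info = env.reset(seed=42)

for _ in range(1000):
    # Sample a random action from the 6-dimensional action space.
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    # terminated is always False here; truncated signals the time limit.
    if terminated or truncated:
        observation, info = env.reset()

env.close()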
