Implementation: LaurentMazare tch-rs GymEnv
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement Learning, Python Interop, Game AI |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Provides a Rust wrapper around the OpenAI Gym Python API using cpython, enabling reinforcement learning agents to interact with Gym environments through tch tensors.
Description
This module implements a Rust interface to OpenAI Gym environments via the cpython crate for Python interop. It consists of two main types:
`Step<A>` struct: Represents the result of taking an action in the environment. It contains:
- `obs`: A `Tensor` with the observation after the action.
- `action`: The action that was taken (generic type `A`).
- `reward`: An `f64` reward value.
- `is_done`: A `bool` indicating whether the episode has terminated.
- A helper method `copy_with_obs` creates a copy of the step with a different observation tensor.
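The `copy_with_obs` pattern above can be sketched in plain Rust. This is a minimal, self-contained illustration: `Tensor` here is a stand-in newtype for `tch::Tensor` (which is not cheaply `Clone`), and `shallow_clone` mimics tch's inexpensive handle copy; neither is the real tch-rs API.

```rust
// Stand-in for tch::Tensor, so the sketch compiles without the tch crate.
#[derive(Clone, Debug, PartialEq)]
struct Tensor(Vec<f32>);

impl Tensor {
    // Mimics tch's cheap shallow_clone; here it is just a deep clone.
    fn shallow_clone(&self) -> Tensor {
        self.clone()
    }
}

pub struct Step<A> {
    pub obs: Tensor,
    pub action: A,
    pub reward: f64,
    pub is_done: bool,
}

impl<A: Copy> Step<A> {
    // Copy every field except the observation, which is replaced.
    pub fn copy_with_obs(&self, obs: &Tensor) -> Step<A> {
        Step {
            obs: obs.shallow_clone(),
            action: self.action,
            reward: self.reward,
            is_done: self.is_done,
        }
    }
}

fn main() {
    let step = Step { obs: Tensor(vec![0.0; 4]), action: 1i64, reward: 1.0, is_done: false };
    let copied = step.copy_with_obs(&Tensor(vec![1.0; 4]));
    println!("action: {}, reward: {}", copied.action, copied.reward);
}
```

The `A: Copy` bound is what lets the method duplicate the action without cloning; this is why discrete actions are typically represented as `i64`.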
`GymEnv` struct: Wraps a Python Gym environment object. Key methods:
- `new(name)`: Acquires the Python GIL, imports the `gym` module, calls `gym.make(name)`, seeds it with 42, and extracts the action space size and observation space shape. Supports both discrete (via `.n`) and continuous (via `.shape`) action spaces.
- `reset()`: Resets the environment and returns the initial observation as a `Tensor` by extracting a `Vec<f32>` from the Python object.
- `step(action)`: Applies an action (any type implementing `ToPyObject + Copy`), extracts the observation, reward, and done flag from the Python tuple, and returns a `Step<A>`.
- `action_space()`: Returns the number of allowed actions.
- `observation_space()`: Returns the shape of observation tensors.
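The discrete-versus-continuous probe in `new` can be sketched as follows. The real code performs these lookups through cpython attribute access on `env.action_space`; here the two `Option`s are hypothetical stand-ins for "attribute present or absent" (a Gym `Discrete` space exposes `.n`, while a continuous `Box` space exposes `.shape`), and taking the first dimension of the shape is an assumption about the fallback, not a quote of the source.

```rust
// Sketch of new()'s action-space size extraction.
// `n` stands in for a successful lookup of `action_space.n` (discrete),
// `shape` for a lookup of `action_space.shape` (continuous Box space).
fn action_space_size(n: Option<i64>, shape: Option<Vec<i64>>) -> Option<i64> {
    // Prefer the discrete count; otherwise fall back to the first
    // dimension of the continuous space's shape.
    n.or_else(|| shape.and_then(|s| s.first().copied()))
}

fn main() {
    // CartPole-v0 has Discrete(2), so `.n` is present.
    println!("{:?}", action_space_size(Some(2), None)); // Some(2)
    // A continuous env with Box(shape=[6]) falls back to the shape.
    println!("{:?}", action_space_size(None, Some(vec![6]))); // Some(6)
}
```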
Usage
Use this wrapper when building reinforcement learning agents in Rust that need to interact with OpenAI Gym environments. It bridges the gap between Python-based environments and Rust-based tch tensor computations.
Code Reference
Source Location
- Repository: LaurentMazare_Tch_rs
- File: examples/reinforcement-learning/gym_env.rs
- Lines: 1-78
Signature
```rust
pub struct Step<A> {
    pub obs: Tensor,
    pub action: A,
    pub reward: f64,
    pub is_done: bool,
}

impl<A: Copy> Step<A> {
    pub fn copy_with_obs(&self, obs: &Tensor) -> Step<A>
}

pub struct GymEnv {
    env: PyObject,
    action_space: i64,
    observation_space: Vec<i64>,
}

impl GymEnv {
    pub fn new(name: &str) -> PyResult<GymEnv>
    pub fn reset(&self) -> PyResult<Tensor>
    pub fn step<A: ToPyObject + Copy>(&self, action: A) -> PyResult<Step<A>>
    pub fn action_space(&self) -> i64
    pub fn observation_space(&self) -> &[i64]
}
```
Import
```rust
// Module within the reinforcement-learning example.
use cpython::{NoArgs, ObjectProtocol, PyObject, PyResult, Python, ToPyObject};
use tch::Tensor;
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| name | &str | Yes | OpenAI Gym environment name (e.g., "CartPole-v0", "SpaceInvadersNoFrameskip-v4"). |
| action | A: ToPyObject + Copy | Yes (for step) | The action to take in the environment (e.g., i64 for discrete action spaces). |
Outputs
| Name | Type | Description |
|---|---|---|
| GymEnv | struct | An initialized Gym environment wrapper ready for interaction. |
| Tensor | tch::Tensor | Observation tensor from reset or step, created from Vec<f32>. |
| Step<A> | struct | Contains observation tensor, action, reward (f64), and done flag (bool). |
| action_space | i64 | Number of available actions in the environment. |
| observation_space | &[i64] | Shape of the observation tensor. |
Usage Examples
```rust
use cpython::PyResult;

// `GymEnv` comes from the gym_env module within the same example.
fn main() -> PyResult<()> {
    // Create a CartPole environment.
    let env = GymEnv::new("CartPole-v0")?;
    println!("action space: {:?}", env.action_space());
    println!("observation space: {:?}", env.observation_space());

    // Reset and get the initial observation.
    let mut obs = env.reset()?;

    // Run a single episode with a fixed action.
    loop {
        let action = 0i64; // or choose an action from a policy using `obs`
        let step = env.step(action)?;
        obs = step.obs;
        println!("reward: {}, done: {}", step.reward, step.is_done);
        if step.is_done {
            break;
        }
    }
    Ok(())
}
```