Implementation:Farama Foundation Gymnasium JaxToTorch

From Leeroopedia
Knowledge Sources
Domains: Reinforcement_Learning, Wrappers
Last Updated: 2026-02-15 03:00 GMT

Overview

A convenience wrapper that lets a JAX-based Gymnasium environment be driven with PyTorch tensors: actions are passed in as tensors and observations are returned as tensors.

Description

The JaxToTorch wrapper is a thin subclass of ArrayConversion that pre-configures the conversion from JAX arrays to PyTorch tensors. Actions provided as PyTorch tensors are automatically converted to JAX arrays for the underlying environment, and observations are returned as PyTorch tensors.

The module also provides two utility functions:

  • jax_to_torch -- A partial application of array_conversion targeting the PyTorch namespace.
  • torch_to_jax -- A partial application of array_conversion targeting the JAX namespace.
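
The "one generic converter, specialised by partial application" pattern described above can be sketched with the standard library alone. This is a toy illustration, not Gymnasium's code: convert_tree, to_float_leaves, and to_int_leaves are hypothetical names, and plain Python numbers stand in for JAX arrays and PyTorch tensors.

```python
from collections.abc import Mapping
from functools import partial


def convert_tree(value, leaf_fn):
    """Apply leaf_fn to every leaf of a nested dict/tuple/list structure."""
    if isinstance(value, Mapping):
        return {key: convert_tree(item, leaf_fn) for key, item in value.items()}
    if isinstance(value, (tuple, list)):
        # Rebuild the same container type around the converted items.
        return type(value)(convert_tree(item, leaf_fn) for item in value)
    return leaf_fn(value)


# Hypothetical stand-ins for "convert to framework A" / "convert to framework B":
# each is the generic routine with its target fixed in advance.
to_float_leaves = partial(convert_tree, leaf_fn=float)
to_int_leaves = partial(convert_tree, leaf_fn=int)

info = {"position": (1, 2), "lives": 3}
converted = to_float_leaves(info)    # every leaf becomes a float
restored = to_int_leaves(converted)  # and back again
```

Fixing the target once with partial keeps call sites short and makes the two directions symmetric, which is the property the real helpers rely on when they convert actions one way and observations the other.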

The wrapper accepts an optional device parameter to specify which PyTorch device the output tensors should be placed on.
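
When the target machine may or may not have a GPU, the device string can be chosen before the wrapper is constructed. The following is a minimal sketch; pick_device is a hypothetical helper, and the guarded import exists only so the snippet can be loaded on a machine without torch (the wrapper itself still requires it).

```python
try:
    import torch  # needed by JaxToTorch; guarded here so the sketch imports anywhere
except ImportError:
    torch = None


def pick_device(preferred: str = "cuda:0") -> str:
    """Return `preferred` unless it names a CUDA device that is not usable."""
    wants_cuda = preferred.startswith("cuda")
    if wants_cuda and (torch is None or not torch.cuda.is_available()):
        return "cpu"
    # Only CUDA availability is checked; other device strings pass through as-is.
    return preferred


device = pick_device()
# The result would then be passed on, e.g. JaxToTorch(env, device=device).
```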

Note: Rendered frames are returned as NumPy arrays, not PyTorch tensors.

Requires both the jax and torch packages to be installed.

Usage

Use this wrapper when the environment is implemented in JAX but the agent or training code is written in PyTorch. The agent can then pass tensors to step and receive tensors from reset and step, with no manual conversion in the training loop.

Code Reference

Source Location

Signature

class JaxToTorch(ArrayConversion):
    def __init__(self, env: gym.Env, device: Device | None = None): ...

Import

from gymnasium.wrappers import JaxToTorch
from gymnasium.wrappers.jax_to_torch import jax_to_torch, torch_to_jax

I/O Contract

Inputs

Name | Type | Required | Description
env | Env | Yes | The JAX-based environment to wrap
device | str or torch.device or None | No | The device for output PyTorch tensors (default None, uses default device)

Outputs

Name | Type | Description
observation | torch.Tensor | Observation converted to PyTorch tensor
reward | float | Reward as a Python float
terminated | bool | Termination flag as a Python bool
truncated | bool | Truncation flag as a Python bool
info | dict | Info dict with values converted to PyTorch tensors
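
The contract above (convert the action on the way in, convert the observation and info on the way out, and cast reward and flags to plain Python types) can be sketched without either framework. This is an illustration of the pattern under stated assumptions, not the library's implementation: ConvertingWrapper and CountingEnv are hypothetical, and tuples and lists stand in for JAX arrays and PyTorch tensors.

```python
class ConvertingWrapper:
    """Toy wrapper: converts actions going in and results coming out."""

    def __init__(self, env, to_env, to_agent):
        self.env = env
        self.to_env = to_env      # agent-side value -> env-side value
        self.to_agent = to_agent  # env-side value -> agent-side value

    def _convert_info(self, info):
        return {key: self.to_agent(value) for key, value in info.items()}

    def reset(self, *, seed=None, options=None):
        obs, info = self.env.reset(seed=seed, options=options)
        return self.to_agent(obs), self._convert_info(info)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(self.to_env(action))
        return (
            self.to_agent(obs),
            float(reward),       # reward as a Python float
            bool(terminated),    # flags as Python bools
            bool(truncated),
            self._convert_info(info),
        )


class CountingEnv:
    """Hypothetical stand-in environment whose 'arrays' are tuples."""

    def reset(self, *, seed=None, options=None):
        self.total = 0
        return (0,), {"total": (0,)}

    def step(self, action):
        assert isinstance(action, tuple)  # the env only accepts its own array type
        self.total += action[0]
        done = self.total >= 3
        return (self.total,), self.total, done, False, {"total": (self.total,)}


env = ConvertingWrapper(CountingEnv(), to_env=tuple, to_agent=list)
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step([2])  # agent-side action is a list
```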

Usage Examples

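import torch
import gymnasium as gym
from gymnasium.wrappers import JaxToTorch

# "JaxEnv-vx" is a placeholder ID; substitute a registered JAX-based environment.
env = gym.make("JaxEnv-vx")
# device="cuda:0" assumes a CUDA GPU is available; use "cpu" otherwise.
env = JaxToTorch(env, device="cuda:0")
obs, _ = env.reset(seed=123)
type(obs)  # <class 'torch.Tensor'>

# Actions are passed as PyTorch tensors; the wrapper converts them to JAX arrays.
action = torch.tensor(env.action_space.sample())
obs, reward, terminated, truncated, info = env.step(action)
type(obs)     # <class 'torch.Tensor'>
type(reward)  # <class 'float'>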
