Implementation:Farama Foundation Gymnasium NumpyToTorch

Knowledge Sources	Farama_Foundation_Gymnasium Gymnasium Docs
Domains	Reinforcement_Learning, Wrappers
Last Updated	2026-02-15 03:00 GMT

Overview

A convenience wrapper that lets callers interact with a NumPy-based Gymnasium environment using PyTorch tensors.

Description

The NumpyToTorch wrapper is a thin subclass of ArrayConversion that pre-configures the conversion from NumPy arrays to PyTorch tensors. Actions provided as PyTorch tensors are automatically converted to NumPy arrays for the underlying environment, and observations are returned as PyTorch tensors.

The module also provides two utility functions:

torch_to_numpy -- A partial application of array_conversion targeting the NumPy namespace.
numpy_to_torch -- A partial application of array_conversion targeting the PyTorch namespace.

The wrapper accepts an optional device parameter to specify which PyTorch device the output tensors should be placed on.

Note: Rendered frames are returned as NumPy arrays, not PyTorch tensors.

Requires the torch package.

Usage

Use this wrapper when your environment returns NumPy arrays but your agent or training pipeline operates on PyTorch tensors. It is likely the most frequently needed conversion wrapper, since most Gymnasium environments produce NumPy arrays and many RL frameworks are built on PyTorch.

Code Reference

Source Location

Repository: Farama_Foundation_Gymnasium
File: gymnasium/wrappers/numpy_to_torch.py

Signature

class NumpyToTorch(ArrayConversion):
    def __init__(self, env: gym.Env, device: Device | None = None): ...

Import

from gymnasium.wrappers import NumpyToTorch
from gymnasium.wrappers.numpy_to_torch import torch_to_numpy, numpy_to_torch

I/O Contract

Inputs

Name	Type	Required	Description
env	Env	Yes	The NumPy-based environment to wrap
device	str or torch.device or None	No	The device for output PyTorch tensors (default None, uses default device)

Outputs

Name	Type	Description
observation	torch.Tensor	Observation converted to PyTorch tensor
reward	float	Reward as a Python float
terminated	bool	Termination flag as a Python bool
truncated	bool	Truncation flag as a Python bool
info	dict	Info dict with values converted to PyTorch tensors

Usage Examples

import torch
import gymnasium as gym
from gymnasium.wrappers import NumpyToTorch

env = gym.make("CartPole-v1")
env = NumpyToTorch(env)
obs, _ = env.reset(seed=123)
type(obs)  # <class 'torch.Tensor'>

action = torch.tensor(env.action_space.sample())
obs, reward, terminated, truncated, info = env.step(action)
type(obs)     # <class 'torch.Tensor'>
type(reward)  # <class 'float'>

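# With a specific device (falls back to CPU when CUDA is unavailable)
device = "cuda:0" if torch.cuda.is_available() else "cpu"
env = gym.make("CartPole-v1")
env = NumpyToTorch(env, device=device)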

Related Pages

Environment:Farama_Foundation_Gymnasium_Python_3_10_Runtime
