Module environments

The environments module includes all necessary functionality to spawn and wrap environments.

The atari_wrappers module is a modified copy of the corresponding module in the torchbeast project.

Exposed classes: EnvSpawner
Unexposed modules: atari_wrappers

See also

GitHub repository of the torchbeast project.

Exposed Classes

Environment spawner object (environments.EnvSpawner)

class pytorch_seed_rl.environments.EnvSpawner(env_id: str, num_envs: int = 1)[source]

Bases: object

Class that is given to actor threads to spawn local environments by invoking spawn().

An instance of this class exposes spawn().

Parameters
  • env_id (str) – The environment’s identifier as registered with gym.

  • num_envs (int) – The number of environments spawn() returns.

Variables
  • self.env_info (dict) – Information about the spawned environments, as a dictionary.

  • self.placeholder_obs (dict) – A dictionary with the same structure as the observations returned by the spawned environments’ step() method.
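
As a usage sketch based only on the signature documented above, an actor thread might create a spawner and call spawn() as follows. The helper name spawn_envs and the environment id "PongNoFrameskip-v4" are illustrative assumptions; the import is deferred into the function so the snippet can be read without the package installed.

```python
def spawn_envs(env_id: str = "PongNoFrameskip-v4", num_envs: int = 2):
    """Create an EnvSpawner and return it together with its spawned environments."""
    # Deferred import: requires pytorch_seed_rl and gym with Atari support.
    from pytorch_seed_rl.environments import EnvSpawner

    spawner = EnvSpawner(env_id, num_envs)
    envs = spawner.spawn()  # documented to return a list of wrapped gym.Env objects
    return spawner, envs
```

After the call, spawner.env_info and spawner.placeholder_obs are available as described under Variables.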

spawn() → List[gym.Env][source]

Returns a list of wrapped environments (using OpenAI’s gym).

Applies:

Unexposed Submodules

Utility functions for wrapping (environments.atari_wrappers)

A collection of wrappers applicable to environments that follow the OpenAI Gym API.

See also

OpenAI Gym

pytorch_seed_rl.environments.atari_wrappers.make_atari(env_id: str) → gym.Env[source]

Creates the Env registered with gym.

Accepts only environments that don’t perform frameskip natively.

Always applies:
Parameters

env_id (str) – The environment’s identifier as registered with gym.

pytorch_seed_rl.environments.atari_wrappers.wrap_deepmind(env, episode_life: bool = True, clip_rewards: bool = True, frame_stack: bool = False, scale: bool = False) → gym.Env[source]

Configure environment for DeepMind-style Atari.

Always applies:
Parameters
  • env (gym.Env) – An environment that will be wrapped.

  • episode_life (bool) – Applies EpisodicLifeEnv, if True.

  • clip_rewards (bool) – Applies ClipRewardEnv, if True.

  • frame_stack (bool) – Applies FrameStack (k = 4), if True.

  • scale (bool) – Applies ScaledFloatFrame, if True.

pytorch_seed_rl.environments.atari_wrappers.wrap_pytorch(env) → gym.Env[source]

Wraps the environment with ImageToPyTorch.

Parameters

env (gym.Env) – An environment that will be wrapped.
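
The three helper functions above can plausibly be chained as in the following sketch, which relies only on the documented signatures. The function name build_env and the environment id "PongNoFrameskip-v4" (chosen because it does not frameskip natively) are illustrative assumptions.

```python
def build_env(env_id: str = "PongNoFrameskip-v4"):
    """Create an Atari environment and apply the DeepMind and PyTorch wrappers."""
    # Deferred import: requires pytorch_seed_rl and gym with Atari support.
    from pytorch_seed_rl.environments import atari_wrappers

    env = atari_wrappers.make_atari(env_id)
    # Keep the default episode_life and clip_rewards, additionally stack frames.
    env = atari_wrappers.wrap_deepmind(env, frame_stack=True)
    # Reorder the image axes so that channels come first.
    return atari_wrappers.wrap_pytorch(env)
```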

class pytorch_seed_rl.environments.atari_wrappers.LazyFrames(frames)[source]

Bases: object

This object ensures that common frames between the observations are only stored once.

It exists purely to optimize memory usage, which can be huge for DQN’s 1M-frame replay buffers. This object should be converted to a NumPy array only just before being passed to the model. You’d not believe how complex the previous solution was.

Parameters

frames (list) – A list of frames that shall be converted.

Wrappers for OpenAI gym (environments.atari_wrappers)

class pytorch_seed_rl.environments.atari_wrappers.AutoResetWrapper(*args: Any, **kwargs: Any)[source]

Bases: gym.Wrapper

A wrapper that automatically resets the environment in case of termination.

Parameters

env (gym.Env) – An environment that will be wrapped.
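
The idea can be sketched without gym as follows. ShortEnv and AutoReset are hypothetical stand-ins, and returning the reset observation in place of the terminal one is an assumption, not necessarily what AutoResetWrapper does.

```python
class ShortEnv:
    """Hypothetical stand-in: every episode terminates after two steps."""

    def __init__(self) -> None:
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return 0

    def step(self, action: int):
        self.t += 1
        return self.t, 1.0, self.t >= 2, {}


class AutoReset:
    """Sketch: reset the wrapped environment as soon as an episode ends."""

    def __init__(self, env) -> None:
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action: int):
        obs, reward, done, info = self.env.step(action)
        if done:
            # Assumption: the first observation of the new episode is returned.
            obs = self.env.reset()
        return obs, reward, done, info


env = AutoReset(ShortEnv())
env.reset()
trace = [env.step(0)[:3] for _ in range(3)]
print(trace)  # [(1, 1.0, False), (0, 1.0, True), (1, 1.0, False)]
```

The caller never has to call reset() itself after the first episode, which keeps actor loops simple.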

class pytorch_seed_rl.environments.atari_wrappers.ClipRewardEnv(*args: Any, **kwargs: Any)[source]

Bases: gym.RewardWrapper

Clips rewards.

Parameters

env (gym.Env) – An environment that will be wrapped.

reward(reward)[source]

Bin reward to {+1, 0, -1} by its sign.
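
A minimal sketch of this sign-based binning, written as a standalone function rather than the wrapper method; the name clip_reward is hypothetical.

```python
def clip_reward(reward: float) -> int:
    """Bin a raw reward to +1, 0 or -1 according to its sign."""
    # The two comparisons are mutually exclusive, so their difference is the sign.
    return int(reward > 0) - int(reward < 0)


print([clip_reward(r) for r in (250.0, 0.0, -3.5)])  # [1, 0, -1]
```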

class pytorch_seed_rl.environments.atari_wrappers.DictObservationsEnv(*args: Any, **kwargs: Any)[source]

Bases: gym.Wrapper

Provides observations as dict with additional metrics.

Adds initial() method, which returns the initial observation.

Parameters

env (gym.Env) – An environment that will be wrapped.

initial() → dict[source]

Returns an initial observation.

class pytorch_seed_rl.environments.atari_wrappers.EpisodicLifeEnv(*args: Any, **kwargs: Any)[source]

Bases: gym.Wrapper

Make end-of-life == end-of-episode, but only reset on true game over. Done by DeepMind for the DQN and co. since it helps value estimation.

Parameters

env (gym.Env) – An environment that will be wrapped.

reset(**kwargs)[source]

Reset only when lives are exhausted. This way all states are still reachable even though lives are episodic, and the learner need not know about any of this behind-the-scenes.
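
The behaviour described above can be sketched without gym as follows. LivesEnv and EpisodicLife are hypothetical stand-ins; in particular, reading the remaining lives from a plain lives attribute and continuing with a no-op step are assumptions made for this sketch.

```python
class LivesEnv:
    """Hypothetical stand-in: the agent loses one of 3 lives when it takes action 1."""

    def __init__(self) -> None:
        self.lives = 0
        self.true_resets = 0

    def reset(self) -> str:
        self.lives = 3
        self.true_resets += 1
        return "first frame"

    def step(self, action: int):
        if action == 1:
            self.lives -= 1
        return f"{self.lives} lives left", 0.0, self.lives == 0, {}


class EpisodicLife:
    """Sketch: report done on every life lost, reset only on true game over."""

    def __init__(self, env) -> None:
        self.env = env
        self.lives = 0
        self.was_real_done = True

    def step(self, action: int):
        obs, reward, done, info = self.env.step(action)
        self.was_real_done = done
        if 0 < self.env.lives < self.lives:
            done = True  # a life was lost: end the episode for the learner
        self.lives = self.env.lives
        return obs, reward, done, info

    def reset(self):
        if self.was_real_done:
            obs = self.env.reset()  # lives exhausted: real reset
        else:
            obs, _, _, _ = self.env.step(0)  # no-op step, same game continues
        self.lives = self.env.lives
        return obs


env = EpisodicLife(LivesEnv())
env.reset()
_, _, done, _ = env.step(1)  # loses a life
env.reset()                  # continues the same game
print(done, env.env.true_resets, env.env.lives)  # True 1 2
```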

class pytorch_seed_rl.environments.atari_wrappers.FireResetEnv(*args: Any, **kwargs: Any)[source]

Bases: gym.Wrapper

Take action on reset for environments that are fixed until firing.

Parameters

env (gym.Env) – An environment that will be wrapped.

class pytorch_seed_rl.environments.atari_wrappers.FrameStack(*args: Any, **kwargs: Any)[source]

Bases: gym.Wrapper

Stacks the last k frames. Returns a lazy array, which is much more memory efficient.

See also

LazyFrames

Parameters
  • env (gym.Env) – An environment that will be wrapped.

  • k (int) – Number of last frames to stack.
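
The memory saving comes from consecutive stacked observations sharing their frame objects. A minimal sketch of that idea, using plain lists as frames; LazyStack is a hypothetical stand-in for LazyFrames and not its actual implementation.

```python
from collections import deque


class LazyStack:
    """Sketch of a lazy container: keeps references to frames, copies nothing."""

    def __init__(self, frames) -> None:
        self._frames = list(frames)  # new list, but the same frame objects

    def materialize(self):
        # Concatenate only when the data is actually needed (e.g. by the model).
        return [pixel for frame in self._frames for pixel in frame]


k = 4
frames = deque(maxlen=k)  # the oldest frame drops out automatically
for t in range(k):
    frames.append([t, t])
obs_a = LazyStack(frames)  # holds frames 0..3

frames.append([4, 4])      # one environment step later
obs_b = LazyStack(frames)  # holds frames 1..4

# Count the frame objects shared by the two consecutive observations.
shared = sum(a is b for a, b in zip(obs_a._frames[1:], obs_b._frames[:-1]))
print(shared, obs_b.materialize())  # 3 [1, 1, 2, 2, 3, 3, 4, 4]
```

With k = 4, each new observation adds only one new frame to memory instead of four.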

class pytorch_seed_rl.environments.atari_wrappers.ImageToPyTorch(*args: Any, **kwargs: Any)[source]

Bases: gym.ObservationWrapper

Changes image shape to channels x width x height.

Parameters

env (gym.Env) – An environment that will be wrapped.

class pytorch_seed_rl.environments.atari_wrappers.MaxAndSkipEnv(*args: Any, **kwargs: Any)[source]

Bases: gym.Wrapper

Returns only every skip-th frame.

Parameters
  • env (gym.Env) – An environment that will be wrapped.

  • skip (int) – The frame-skip interval. If skip = 4 (default), only every 4th frame is returned.

step(action)[source]

Repeat action, sum reward, and max over last observations.
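
A minimal sketch of this step logic without gym. CountingEnv and max_and_skip_step are hypothetical names, frames are flat pixel lists, and pooling over exactly the last two frames is an assumption.

```python
from typing import List, Tuple

Frame = List[int]


class CountingEnv:
    """Hypothetical stand-in for a gym environment with flickering pixels."""

    def __init__(self) -> None:
        self.t = 0

    def step(self, action: int) -> Tuple[Frame, float, bool, dict]:
        self.t += 1
        # Alternate which pixel is lit to mimic Atari sprite flicker.
        frame = [self.t, 0] if self.t % 2 else [0, self.t]
        return frame, 1.0, self.t >= 10, {}


def max_and_skip_step(env, action: int, skip: int = 4):
    """Repeat action up to skip times, sum rewards, max-pool recent frames."""
    total_reward, done, info = 0.0, False, {}
    recent: List[Frame] = []
    for _ in range(skip):
        obs, reward, done, info = env.step(action)
        recent = (recent + [obs])[-2:]  # assumption: keep the last two frames
        total_reward += reward
        if done:
            break
    # Element-wise maximum across the retained frames.
    pooled = [max(pixels) for pixels in zip(*recent)]
    return pooled, total_reward, done, info


obs, reward, done, _ = max_and_skip_step(CountingEnv(), action=0)
print(obs, reward, done)  # [3, 4] 4.0 False
```

Taking the maximum over the retained frames keeps objects visible that are drawn only on alternating frames.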

class pytorch_seed_rl.environments.atari_wrappers.NoopResetEnv(*args: Any, **kwargs: Any)[source]

Bases: gym.Wrapper

Sample initial states by taking random number of no-ops on reset. No-op is assumed to be action 0.

Parameters
  • env (gym.Env) – An environment that will be wrapped.

  • noop_max (int) – The maximum number of no-ops on reset.

reset(**kwargs)[source]

Do no-op action for a number of steps in [1, noop_max].
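
A minimal sketch of this reset logic without gym. StubEnv and noop_reset are hypothetical names, and the default noop_max = 30 is an assumption.

```python
import random


class StubEnv:
    """Hypothetical stand-in for a gym environment that counts steps."""

    def __init__(self) -> None:
        self.steps = 0

    def reset(self) -> int:
        self.steps = 0
        return self.steps

    def step(self, action: int):
        self.steps += 1
        return self.steps, 0.0, False, {}


def noop_reset(env, noop_max: int = 30, noop_action: int = 0, rng=random):
    """Reset env, then take a random number of no-ops in [1, noop_max]."""
    obs = env.reset()
    noops = rng.randint(1, noop_max)  # randint is inclusive on both ends
    for _ in range(noops):
        obs, _, done, _ = env.step(noop_action)
        if done:  # the episode ended during the no-ops, so start over
            obs = env.reset()
    return obs


env = StubEnv()
noop_reset(env, noop_max=30, rng=random.Random(0))
print(1 <= env.steps <= 30)  # True
```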

class pytorch_seed_rl.environments.atari_wrappers.ScaledFloatFrame(*args: Any, **kwargs: Any)[source]

Bases: gym.ObservationWrapper

Normalizes the frame.

Parameters

env (gym.Env) – An environment that will be wrapped.

class pytorch_seed_rl.environments.atari_wrappers.WarpFrame(*args: Any, **kwargs: Any)[source]

Bases: gym.ObservationWrapper

Warp frames to height x width as done in the Nature paper and later work. If the environment uses dictionary observations, dict_space_key can be specified to indicate which observation should be warped.

Parameters
  • env (gym.Env) – An environment that will be wrapped.

  • width (int) – Target width of warped frames.

  • height (int) – Target height of warped frames.

  • grayscale (bool) – Set to True if warped frames should be grayscale.

  • dict_space_key (str) – Key of the targeted space in the environment’s observation space dictionary.