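An API standard for reinforcement learning with a diverse collection of reference environments
Gymnasium is a maintained fork of OpenAI's Gym library. The Gymnasium interface is simple, pythonic, and capable of representing general RL problems, and it has a compatibility wrapper for old Gym environments: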
import gymnasium as gym
# Initialise the environment
env = gym.make("LunarLander-v3", render_mode="human")
# Reset the environment to generate the first observation
observation, info = env.reset(seed=42)
for _ in range(1000):
    # this is where you would insert your policy
    action = env.action_space.sample()
    # step (transition) through the environment with the action
    # receiving the next observation, reward and if the episode has terminated or truncated
    observation, reward, terminated, truncated, info = env.step(action)
    # If the episode has ended then we can reset to start a new episode
    if terminated or truncated:
        observation, info = env.reset()
env.close()
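The loop above relies on two conventions: `reset` returns an `(observation, info)` pair, and `step` returns a five-element tuple in which `terminated` and `truncated` are separate flags. The sketch below illustrates that contract using only the standard library. `CountdownEnv`, `sample_action` and `run_episode` are hypothetical names invented for this illustration; they are not part of Gymnasium, and the class does not subclass `gym.Env`. The reading of `terminated` as "the task itself ended" and `truncated` as "an external step limit was hit" is how this toy interprets the two flags.

```python
import random
from typing import Any, Dict, Optional, Tuple


class CountdownEnv:
    """Hypothetical toy environment (not part of Gymnasium).

    The state is a counter. Action 1 decrements it, action 0 leaves it alone.
    The episode terminates when the counter reaches 0 and is truncated when
    max_steps is reached first.
    """

    def __init__(self, start: int = 5, max_steps: int = 20) -> None:
        self.start = start
        self.max_steps = max_steps
        self._rng = random.Random()
        self._state = start
        self._steps = 0

    def reset(self, seed: Optional[int] = None) -> Tuple[int, Dict[str, Any]]:
        # Re-seed only when a seed is given, so later resets keep the RNG stream.
        if seed is not None:
            self._rng = random.Random(seed)
        self._state = self.start
        self._steps = 0
        return self._state, {}

    def sample_action(self) -> int:
        # Stand-in for a random policy: pick 0 or 1 uniformly.
        return self._rng.choice([0, 1])

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self._state -= action
        self._steps += 1
        # terminated: the task itself ended (counter hit zero).
        terminated = self._state == 0
        # truncated: the step limit cut the episode short.
        truncated = (not terminated) and self._steps >= self.max_steps
        reward = 1.0 if terminated else 0.0
        return self._state, reward, terminated, truncated, {}


def run_episode(env: CountdownEnv, seed: Optional[int] = None) -> Tuple[float, int, bool, bool]:
    """Run one episode with random actions and report how it ended."""
    observation, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    while True:
        action = env.sample_action()
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        if terminated or truncated:
            return total_reward, steps, terminated, truncated


if __name__ == "__main__":
    print(run_episode(CountdownEnv(start=5, max_steps=20), seed=42))
```

Keeping the two flags apart lets calling code tell an episode that reached its goal from one that simply ran out of steps, which is why the loop checks `terminated or truncated` before calling `reset` again.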