Code:
import gym
# Create the CartPole environment
env = gym.make('CartPole-v1')
# Reset the environment to start
state = env.reset()
# Run for 1000 timesteps
for _ in range(1000):
    env.render()  # Render the environment
    action = env.action_space.sample()  # Take a random action
    state, reward, done, info = env.step(action)  # Step the environment by one timestep
    # If the episode is done (CartPole has fallen), reset the environment
    if done:
        state = env.reset()
env.close()  # Close the rendering window
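The loop above is written against the older gym API, where reset() returns only the state and step() returns four values. In gym 0.26 and later (and in Gymnasium), reset() returns (observation, info) and step() returns five values, with the old done flag split into terminated and truncated. A minimal sketch of one way to keep the same loop working under either convention is below. The helper names unpack_reset, unpack_step and run_random_steps are hypothetical and not part of gym, and the tuple check in unpack_reset is only a heuristic.

```python
def unpack_reset(result):
    """Return just the observation from either style of reset()."""
    # Newer API: reset() returns (observation, info), where info is a dict.
    # Heuristic (an assumption): a 2-tuple ending in a dict is the new style.
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0]
    # Older API: reset() returns the observation directly.
    return result


def unpack_step(result):
    """Normalize step() output to (state, reward, done, info)."""
    if len(result) == 5:
        # Newer API: the episode ends when it is terminated OR truncated.
        state, reward, terminated, truncated, info = result
        return state, reward, terminated or truncated, info
    # Older API: already a 4-tuple.
    state, reward, done, info = result
    return state, reward, done, info


def run_random_steps(env, n_steps=1000):
    """Take n_steps random actions, resetting whenever an episode ends.

    Returns the number of episodes that finished. The caller is
    responsible for calling env.close() afterwards.
    """
    state = unpack_reset(env.reset())
    episodes = 0
    for _ in range(n_steps):
        action = env.action_space.sample()  # random action, as in the original loop
        state, reward, done, info = unpack_step(env.step(action))
        if done:
            episodes += 1
            state = unpack_reset(env.reset())
    return episodes
```

With an environment created as in the original snippet, this would be called as run_random_steps(env, 1000) followed by env.close(). In the newer API, rendering is normally requested when the environment is created, for example gym.make('CartPole-v1', render_mode='human'), rather than by calling env.render() on every step.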
More details here: https://stackoverflow.com/questions/790 ... openai-gym