---
library_name: stable-baselines3
tags:
  - SpaceInvadersNoFrameskip-v4
  - deep-reinforcement-learning
  - reinforcement-learning
  - stable-baselines3
model-index:
  - name: DQN
    results:
      - task:
          type: reinforcement-learning
          name: reinforcement-learning
        dataset:
          name: SpaceInvadersNoFrameskip-v4
          type: SpaceInvadersNoFrameskip-v4
        metrics:
          - type: mean_reward
            value: 29.00 +/- 64.30
            name: mean_reward
            verified: false
---

# DQN Agent playing SpaceInvadersNoFrameskip-v4 🚀

This is a trained model of a DQN agent playing SpaceInvadersNoFrameskip-v4 using the stable-baselines3 library and the RL Zoo.

The RL Zoo is a training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

## Usage (with SB3 RL Zoo) 📚

- RL Zoo: https://github.com/DLR-RM/rl-baselines3-zoo
- SB3: https://github.com/DLR-RM/stable-baselines3
- SB3 Contrib: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
- SBX (SB3 + Jax): https://github.com/araffin/sbx

Install the RL Zoo (with SB3 and SB3-Contrib):

```bash
pip install rl_zoo3
# Download model and save it into the logs/ folder
python -m rl_zoo3.load_from_hub --algo dqn --env SpaceInvadersNoFrameskip-v4 -orga LimTara -f logs/
python -m rl_zoo3.enjoy --algo dqn --env SpaceInvadersNoFrameskip-v4 -f logs/
```

If you installed RL Zoo3 via pip (`pip install rl_zoo3`), you can run the following from anywhere:

```bash
python -m rl_zoo3.load_from_hub --algo dqn --env SpaceInvadersNoFrameskip-v4 -orga LimTara -f logs/
python -m rl_zoo3.enjoy --algo dqn --env SpaceInvadersNoFrameskip-v4 -f logs/
```

## Training (with the RL Zoo) 👾

```bash
python -m rl_zoo3.train --algo dqn --env SpaceInvadersNoFrameskip-v4 -f logs/
# Upload the model and generate video (when possible)
python -m rl_zoo3.push_to_hub --algo dqn --env SpaceInvadersNoFrameskip-v4 -f logs/ -orga LimTara
```

## Hyperparameters 🔗

Hyperparameters control the behavior of training. In this project they include `n_timesteps`, the number of environment steps the agent is trained for: the higher the number, the better the agent tends to perform, but the longer training takes to run. ⏰ You can also control how much the agent is willing to take risks in order to explore new things: `exploration_final_eps` sets the final value of the epsilon-greedy exploration rate, so a higher value means the agent keeps taking more random actions and steps out of its comfort zone! 💥
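To make that concrete, here is a minimal sketch of the linear epsilon schedule DQN uses, assuming SB3's behavior of annealing epsilon from 1.0 down to `exploration_final_eps` over the first `exploration_fraction` of training (this helper is illustrative, not RL Zoo code):

```python
def exploration_rate(step, n_timesteps=100_000,
                     exploration_fraction=0.1,
                     exploration_final_eps=0.01):
    """Linearly anneal epsilon from 1.0 to exploration_final_eps,
    then hold it constant for the rest of training (sketch)."""
    end_step = exploration_fraction * n_timesteps  # 10,000 steps with these values
    if step >= end_step:
        return exploration_final_eps
    progress = step / end_step
    return 1.0 + progress * (exploration_final_eps - 1.0)

print(exploration_rate(0))       # 1.0 (fully random at the start)
print(exploration_rate(5_000))   # 0.505 (halfway through annealing)
print(exploration_rate(50_000))  # 0.01 (held at the final exploration rate)
```

With the values used here, the agent acts almost entirely at random for the first few thousand steps and settles at 1% random actions after step 10,000.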

To create and edit the hyperparameters, create a file called `dqn.yml` (in Colab, open the file browser in the left sidebar by clicking the file icon right below the key icon). Then simply paste the following hyperparameters into the `dqn.yml` file! 📝

```python
OrderedDict([('batch_size', 32),
             ('buffer_size', 100000),
             ('env_wrapper',
              ['stable_baselines3.common.atari_wrappers.AtariWrapper']),
             ('exploration_final_eps', 0.01),
             ('exploration_fraction', 0.1),
             ('frame_stack', 4),
             ('gradient_steps', 1),
             ('learning_rate', 0.0001),
             ('learning_starts', 100000),
             ('n_timesteps', 100000),
             ('optimize_memory_usage', False),
             ('policy', 'CnnPolicy'),
             ('target_update_interval', 1000),
             ('train_freq', 4),
             ('normalize', False)])
```
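Note that the block above is the `OrderedDict` repr that RL Zoo prints, not valid YAML. A sketch of the equivalent `dqn.yml` content with the same values, keyed by environment id as RL Zoo's hyperparameter files are organized (double-check the exact schema against your RL Zoo version's `hyperparams/dqn.yml`):

```yaml
SpaceInvadersNoFrameskip-v4:
  env_wrapper:
    - stable_baselines3.common.atari_wrappers.AtariWrapper
  frame_stack: 4
  policy: 'CnnPolicy'
  n_timesteps: 100000
  buffer_size: 100000
  learning_rate: 0.0001
  batch_size: 32
  learning_starts: 100000
  target_update_interval: 1000
  train_freq: 4
  gradient_steps: 1
  exploration_fraction: 0.1
  exploration_final_eps: 0.01
  optimize_memory_usage: False
  normalize: False
```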

## Environment Arguments

```python
{'render_mode': 'rgb_array'}
```

## Displaying the model visually 👀

Finally, we have to display our model so the audience can see what we've done! To do that, first install the following libraries and start a virtual display:

```python
!apt install python-opengl
!apt install xvfb
!pip3 install pyvirtualdisplay

from pyvirtualdisplay import Display

virtual_display = Display(visible=0, size=(1400, 900))
virtual_display.start()
```

Once you're on your model repository page, go to "Files and versions", upload your video as a file named `replay.mp4`, and BOOM, the replay will appear right there in your model card! 🖥️