---
library_name: stable-baselines3
tags:
- ALE/Pacman-v5
- deep-reinforcement-learning
- reinforcement-learning
- stable-baselines3
model-index:
- name: A2C
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: ALE/Pacman-v5
      type: ALE/Pacman-v5
    metrics:
    - type: mean_reward
      value: 29.90 +/- 10.35
      name: mean_reward
      verified: false
---

# **A2C** Agent playing **ALE/Pacman-v5**
This is a trained model of an **A2C** agent playing **ALE/Pacman-v5**
using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3)
and the [RL Zoo](https://github.com/DLR-RM/rl-baselines3-zoo).

The RL Zoo is a training framework for Stable Baselines3
reinforcement learning agents,
with hyperparameter optimization and pre-trained agents included.

## Usage (with SB3 RL Zoo)

RL Zoo: https://github.com/DLR-RM/rl-baselines3-zoo<br/>
SB3: https://github.com/DLR-RM/stable-baselines3<br/>
SB3 Contrib: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib

Install the RL Zoo (with SB3 and SB3-Contrib):
```bash
pip install rl_zoo3
```

Once the RL Zoo is installed via pip, the following commands can be run from any directory:
```bash
# Download the model and save it into the logs/ folder
python -m rl_zoo3.load_from_hub --algo a2c --env ALE/Pacman-v5 -orga alfredo-wh -f logs/
# Run the downloaded agent in the environment
python -m rl_zoo3.enjoy --algo a2c --env ALE/Pacman-v5 -f logs/
```

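The download and replay commands can also be assembled from a script, for example to run them for several agents in a loop. The following is a minimal sketch, not part of the RL Zoo itself: `zoo_command` is a hypothetical helper name, and it only reuses the modules and flags that appear in the commands in this card.

```python
import sys


def zoo_command(module, algo="a2c", env="ALE/Pacman-v5", folder="logs/", orga=None):
    """Build the argument list for `python -m rl_zoo3.<module>`.

    Hypothetical helper: it only mirrors the flags shown in this card
    (--algo, --env, -f, -orga) and does not validate them.
    """
    cmd = [sys.executable, "-m", f"rl_zoo3.{module}",
           "--algo", algo, "--env", env, "-f", folder]
    if orga is not None:
        # Organization (or user) that hosts the model on the Hub
        cmd += ["-orga", orga]
    return cmd


download = zoo_command("load_from_hub", orga="alfredo-wh")
enjoy = zoo_command("enjoy")
print(" ".join(download[1:]))
print(" ".join(enjoy[1:]))
```

Each list can then be passed to `subprocess.run(cmd, check=True)`, assuming `rl_zoo3` is installed in the same interpreter.
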
## Training (with the RL Zoo)
```bash
python -m rl_zoo3.train --algo a2c --env ALE/Pacman-v5 -f logs/
# Upload the model and generate video (when possible)
python -m rl_zoo3.push_to_hub --algo a2c --env ALE/Pacman-v5 -f logs/ -orga alfredo-wh
```

## Hyperparameters
```python
OrderedDict([('env_wrapper',
              ['stable_baselines3.common.atari_wrappers.AtariWrapper']),
             ('frame_stack', 4),
             ('n_envs', 16),
             ('n_timesteps', 500000.0),
             ('policy', 'CnnPolicy'),
             ('policy_kwargs',
              'dict(optimizer_class=RMSpropTFLike, '
              'optimizer_kwargs=dict(eps=1e-5))'),
             ('vf_coef', 0.25),
             ('normalize', False)])
```

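To show how these values might map onto plain Stable Baselines3 code, here is a rough sketch that separates the keys handled by the training script from those passed to the `A2C` constructor. The split in `ZOO_KEYS` and the helper names `split_hyperparams` and `build_model` are assumptions made for illustration; the RL Zoo's own loading logic may differ in detail. Building the model additionally assumes that `stable-baselines3` and the Atari (ALE) environments are installed.

```python
from collections import OrderedDict

# Values copied from the hyperparameter listing in this card.
HYPERPARAMS = OrderedDict([
    ("env_wrapper", ["stable_baselines3.common.atari_wrappers.AtariWrapper"]),
    ("frame_stack", 4),
    ("n_envs", 16),
    ("n_timesteps", 500000.0),
    ("policy", "CnnPolicy"),
    ("policy_kwargs",
     "dict(optimizer_class=RMSpropTFLike, optimizer_kwargs=dict(eps=1e-5))"),
    ("vf_coef", 0.25),
    ("normalize", False),
])

# Assumed split: keys that configure the environment or the training run
# rather than the A2C constructor.
ZOO_KEYS = {"env_wrapper", "frame_stack", "n_envs", "n_timesteps", "normalize"}


def split_hyperparams(hp):
    """Return (training-script settings, algorithm settings)."""
    zoo = {k: v for k, v in hp.items() if k in ZOO_KEYS}
    algo = {k: v for k, v in hp.items() if k not in ZOO_KEYS}
    return zoo, algo


def build_model(hp=HYPERPARAMS, env_id="ALE/Pacman-v5"):
    """Sketch of an equivalent setup in plain SB3 (needs SB3 and ALE installed)."""
    # Imported here so the helpers above work without SB3 installed.
    from stable_baselines3 import A2C
    from stable_baselines3.common.env_util import make_atari_env
    from stable_baselines3.common.sb2_compat.rmsprop_tf_like import RMSpropTFLike
    from stable_baselines3.common.vec_env import VecFrameStack

    zoo, algo = split_hyperparams(hp)
    # make_atari_env wraps each environment with AtariWrapper
    env = make_atari_env(env_id, n_envs=zoo["n_envs"])
    env = VecFrameStack(env, n_stack=zoo["frame_stack"])
    # The listing stores policy_kwargs as a string; it is written out here.
    policy_kwargs = dict(optimizer_class=RMSpropTFLike,
                         optimizer_kwargs=dict(eps=1e-5))
    model = A2C(algo["policy"], env, vf_coef=algo["vf_coef"],
                policy_kwargs=policy_kwargs)
    return model, int(zoo["n_timesteps"])


zoo_settings, algo_settings = split_hyperparams(HYPERPARAMS)
print(sorted(algo_settings))
```

With the dependencies installed, `model, steps = build_model()` followed by `model.learn(total_timesteps=steps)` would start a comparable training run, though results are not guaranteed to match the reported score.
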
## Environment Arguments
```python
{'render_mode': 'rgb_array'}
```

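These arguments are keyword arguments for the environment constructor. As a small sketch, assuming `gymnasium` and `ale-py` are installed and the ALE environments are registered, they could be applied as follows; `merged_env_kwargs` and `make_env` are hypothetical helper names, not RL Zoo functions.

```python
# Environment arguments copied from this card.
ENV_KWARGS = {"render_mode": "rgb_array"}


def merged_env_kwargs(**overrides):
    """Combine the card's environment arguments with caller overrides."""
    return {**ENV_KWARGS, **overrides}


def make_env(env_id="ALE/Pacman-v5", **overrides):
    """Create the environment (assumes gymnasium and ale-py are installed)."""
    # Imported here so the helper above works without gymnasium installed.
    import gymnasium as gym
    return gym.make(env_id, **merged_env_kwargs(**overrides))


print(merged_env_kwargs())
```

With `render_mode="rgb_array"`, calling `env.render()` returns the current frame as an array instead of opening a window, which is what makes video recording possible on a headless machine.
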