---
license: mit
pipeline_tag: reinforcement-learning
tags:
- Memory Gym
- Proximal Policy Optimization
- Transformer-XL
- CleanRL
---

These pre-trained models work with CleanRL's PPO-TrXL implementation and the environments of [Memory Gym](https://github.com/MarcoMeter/endless-memory-gym).
To run a model, use the [`enjoy.py`](https://github.com/MarcoMeter/cleanrl-ppo-trxl/blob/master/cleanrl/ppo_trxl/enjoy.py) script.

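As a minimal sketch of fetching one of these checkpoints programmatically before handing it to `enjoy.py`, the snippet below wraps `hf_hub_download` from the `huggingface_hub` package. It assumes the checkpoints are hosted as files in this Hugging Face repository. `REPO_ID` and `FILENAME` are hypothetical placeholders, and `fetch_checkpoint` is an illustrative helper, not part of CleanRL or Memory Gym.

```python
from pathlib import Path
from typing import Callable, Optional

# Hypothetical placeholders: substitute the repository id of this model page
# and the name of one checkpoint file listed under its "Files" tab.
REPO_ID = "<user>/<repository>"
FILENAME = "<checkpoint-file>"


def fetch_checkpoint(
    repo_id: str,
    filename: str,
    download: Optional[Callable[..., str]] = None,
) -> Path:
    """Download a single file from the Hugging Face Hub and return its local path.

    `download` can be swapped out (for example in tests); by default it is
    `huggingface_hub.hf_hub_download`, which caches the file locally.
    """
    if download is None:
        # Imported lazily so this module loads even without huggingface_hub.
        from huggingface_hub import hf_hub_download

        download = hf_hub_download
    return Path(download(repo_id=repo_id, filename=filename))


# Example call (requires network access and real values for the placeholders):
# checkpoint_path = fetch_checkpoint(REPO_ID, FILENAME)
# print(f"Checkpoint cached at {checkpoint_path}")
```

How the downloaded file is then passed to `enjoy.py` depends on that script's command-line options, which are documented in the linked repository rather than here.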
Original Paper and Implementation

* [Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents](https://arxiv.org/abs/2309.17207)
* [neroRL](https://github.com/MarcoMeter/neroRL), [Episodic Transformer Memory PPO](https://github.com/MarcoMeter/episodic-transformer-memory-ppo)
* [Interactive Visualizations of Trained Agents](https://marcometer.github.io/)

Related Publications and Repositories

* [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860)
* [Stabilizing Transformers for Reinforcement Learning](https://arxiv.org/abs/1910.06764)
* [Towards mental time travel: a hierarchical memory for reinforcement learning agents](https://arxiv.org/abs/2105.14039)
* [Grounded Language Learning Fast and Slow](https://arxiv.org/abs/2009.01719)
* [transformerXL_PPO_JAX](https://github.com/Reytuag/transformerXL_PPO_JAX)