---
license: mit
pipeline_tag: reinforcement-learning
tags:
- Memory Gym
- Proximal Policy Optimization
- Transformer-XL
- CleanRL
---
This repository hosts pre-trained models that work with CleanRL's PPO-TrXL implementation and the environments of [Memory Gym](https://github.com/MarcoMeter/endless-memory-gym).
To run a model, use the [`enjoy.py`](https://github.com/MarcoMeter/cleanrl-ppo-trxl/blob/master/cleanrl/ppo_trxl/enjoy.py) script.
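Before a model can be passed to `enjoy.py`, its checkpoint file has to be available locally. The following is a minimal sketch of one way to fetch a single file from a Hugging Face model repository using only the Python standard library. The helper names `hub_file_url` and `download_checkpoint`, the `models` target directory, and any repository id or file name you pass in are illustrative assumptions, not part of CleanRL or of this repository. The sketch relies on the Hub's `resolve` URL scheme for direct file downloads.

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlretrieve


def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download URL of a file in a Hugging Face model repo.

    Hypothetical helper: repo_id looks like "<user>/<repo>", filename is the
    path of the checkpoint inside that repo.
    """
    return (
        "https://huggingface.co/"
        f"{quote(repo_id)}/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )


def download_checkpoint(repo_id: str, filename: str, target_dir: str = "models") -> Path:
    """Download one checkpoint into target_dir and return its local path.

    Hypothetical helper: the target directory name is an arbitrary choice.
    """
    target = Path(target_dir) / Path(filename).name
    target.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(hub_file_url(repo_id, filename), str(target))
    return target
```

Calling `download_checkpoint` with this repository's id and one of its checkpoint file names returns a local path. How that path is handed to `enjoy.py` is defined by the script's own command-line arguments, so check the linked source rather than relying on this sketch.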
Original Paper and Implementation
* [Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents](https://arxiv.org/abs/2309.17207)
* [neroRL](https://github.com/MarcoMeter/neroRL), [Episodic Transformer Memory PPO](https://github.com/MarcoMeter/episodic-transformer-memory-ppo)
* [Interactive Visualizations of Trained Agents](https://marcometer.github.io/)
Related Publications and Repositories
* [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860)
* [Stabilizing Transformers for Reinforcement Learning](https://arxiv.org/abs/1910.06764)
* [Towards mental time travel: a hierarchical memory for reinforcement learning agents](https://arxiv.org/abs/2105.14039)
* [Grounded Language Learning Fast and Slow](https://arxiv.org/abs/2009.01719)
* [transformerXL_PPO_JAX](https://github.com/Reytuag/transformerXL_PPO_JAX)