---
license: mit
pipeline_tag: reinforcement-learning
tags:
- Memory Gym
- Proximal Policy Optimization
- Transformer-XL
- CleanRL
---

Pre-trained models that work with CleanRL's PPO-TrXL implementation and the environments of [Memory Gym](https://github.com/MarcoMeter/endless-memory-gym).

Usage via `enjoy.py` found [here](https://github.com/MarcoMeter/cleanrl-ppo-trxl/blob/master/cleanrl/ppo_trxl/enjoy.py).

Original Paper and Implementation

* [Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents](https://arxiv.org/abs/2309.17207)
* [neroRL](https://github.com/MarcoMeter/neroRL), [Episodic Transformer Memory PPO](https://github.com/MarcoMeter/episodic-transformer-memory-ppo)
* [Interactive Visualizations of Trained Agents](https://marcometer.github.io/)

Related Publications and Repositories

* [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860)
* [Stabilizing Transformers for Reinforcement Learning](https://arxiv.org/abs/1910.06764)
* [Towards mental time travel: a hierarchical memory for reinforcement learning agents](https://arxiv.org/abs/2105.14039)
* [Grounded Language Learning Fast and Slow](https://arxiv.org/abs/2009.01719)
* [transformerXL_PPO_JAX](https://github.com/Reytuag/transformerXL_PPO_JAX)