---
license: mit
pipeline_tag: reinforcement-learning
tags:
- Memory Gym
- Proximal Policy Optimization
- Transformer-XL
- CleanRL
---

This repository provides pre-trained models that work with CleanRL's PPO-TrXL implementation and the environments of [Memory Gym](https://github.com/MarcoMeter/endless-memory-gym).
To run the models, use the [`enjoy.py`](https://github.com/MarcoMeter/cleanrl-ppo-trxl/blob/master/cleanrl/ppo_trxl/enjoy.py) script.

Original Paper and Implementation

* [Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents](https://arxiv.org/abs/2309.17207)
* [neroRL](https://github.com/MarcoMeter/neroRL), [Episodic Transformer Memory PPO](https://github.com/MarcoMeter/episodic-transformer-memory-ppo)
* [Interactive Visualizations of Trained Agents](https://marcometer.github.io/)

Related Publications and Repositories

* [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860)
* [Stabilizing Transformers for Reinforcement Learning](https://arxiv.org/abs/1910.06764)
* [Towards mental time travel: a hierarchical memory for reinforcement learning agents](https://arxiv.org/abs/2105.14039)
* [Grounded Language Learning Fast and Slow](https://arxiv.org/abs/2009.01719)
* [transformerXL_PPO_JAX](https://github.com/Reytuag/transformerXL_PPO_JAX)