LilHairdy/cleanrl_memory_gym

	---
	license: mit
	pipeline_tag: reinforcement-learning
	tags:
	- Memory Gym
	- Proximal Policy Optimization
	- Transformer-XL
	- CleanRL
	---

	This repository provides pre-trained models for CleanRL's PPO-TrXL implementation, trained on the environments of [Memory Gym](https://github.com/MarcoMeter/endless-memory-gym).
	To run a model, use the [`enjoy.py`](https://github.com/MarcoMeter/cleanrl-ppo-trxl/blob/master/cleanrl/ppo_trxl/enjoy.py) script.
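
Before running `enjoy.py`, a checkpoint has to be available locally. The following is a minimal sketch of fetching one with the `huggingface_hub` client, not part of the original instructions. The helper names (`select_checkpoints`, `list_checkpoints`, `download_checkpoint`) are hypothetical, and the rule "everything except dotfiles and Markdown files is a checkpoint" is an assumption, since this README does not list the file names. How `enjoy.py` expects to receive the model path is not covered here; check the script itself.

```python
from typing import Iterable, List

# Repository id of this model card on the Hugging Face Hub.
REPO_ID = "LilHairdy/cleanrl_memory_gym"


def select_checkpoints(files: Iterable[str], keyword: str = "") -> List[str]:
    """Return the repository files that look like model checkpoints.

    Assumption: dotfiles (e.g. .gitattributes) and Markdown files are not
    checkpoints; every other file is. `keyword` optionally narrows the
    result to names containing it (case-insensitive), e.g. an environment name.
    """
    return sorted(
        name
        for name in files
        if not name.startswith(".")
        and not name.endswith(".md")
        and keyword.lower() in name.lower()
    )


def list_checkpoints(keyword: str = "") -> List[str]:
    """List checkpoint candidates in the repository (requires network access)."""
    # Imported lazily so select_checkpoints works without huggingface_hub installed.
    from huggingface_hub import list_repo_files

    return select_checkpoints(list_repo_files(REPO_ID), keyword)


def download_checkpoint(filename: str) -> str:
    """Download one file from the repository and return its local cache path."""
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=filename)
```

Calling `list_checkpoints()` shows which files are available, and `download_checkpoint(name)` returns a local path for the chosen file that can then be handed to `enjoy.py`.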

	Original Paper and Implementation

	* [Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents](https://arxiv.org/abs/2309.17207)
	* [neroRL](https://github.com/MarcoMeter/neroRL), [Episodic Transformer Memory PPO](https://github.com/MarcoMeter/episodic-transformer-memory-ppo)
	* [Interactive Visualizations of Trained Agents](https://marcometer.github.io/)

	Related Publications and Repositories

	* [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860)
	* [Stabilizing Transformers for Reinforcement Learning](https://arxiv.org/abs/1910.06764)
	* [Towards mental time travel: a hierarchical memory for reinforcement learning agents](https://arxiv.org/abs/2105.14039)
	* [Grounded Language Learning Fast and Slow](https://arxiv.org/abs/2009.01719)
	* [transformerXL_PPO_JAX](https://github.com/Reytuag/transformerXL_PPO_JAX)