Finding the Time to Think in Real-Time RL — Checkpoints

Pretrained base planners and gating policies for the paper "Finding the Time to Think in Real-Time RL".

A lightweight gating policy on top of a frozen AlphaZero-style MCTS planner selects a state-dependent planning budget at each decision point, across five real-time games (Pac-Man, real-time Tetris, Snake, Speed Hex, Speed Go).
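The gating idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper or the code repository: the budget menu `BUDGETS`, the linear scorer in `gate` (a stand-in for the trained PPO gating policy), and the `plan` placeholder (a stand-in for the frozen MCTS planner). It only shows the control flow: score the state, choose a budget, then hand that budget to the planner.

```python
# Hypothetical menu of planning budgets (MCTS simulations per decision).
# These values are illustrative only, not the ones used in the paper.
BUDGETS = [0, 8, 32, 128]


def gate(state_features, weights):
    """Return the index of the budget with the highest linear score.

    A toy stand-in for the gating policy: one weight row per budget.
    """
    scores = [sum(w * x for w, x in zip(row, state_features)) for row in weights]
    return max(range(len(BUDGETS)), key=lambda i: scores[i])


def plan(state_features, num_simulations):
    """Placeholder for the frozen planner; a real one would run MCTS here."""
    return {"action": 0, "simulations_used": num_simulations}


def act(state_features, weights):
    """Choose a state-dependent budget, then plan with exactly that budget."""
    budget = BUDGETS[gate(state_features, weights)]
    return plan(state_features, budget)


if __name__ == "__main__":
    toy_weights = [[1, 0], [0, 1], [0, 0], [-1, -1]]
    print(act([0.2, 0.9], toy_weights))
    print(act([-1.0, -1.0], toy_weights))
```

In this sketch the planner is never modified; only the number of simulations it receives changes from state to state, which mirrors the "frozen planner plus lightweight gate" split described above.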

🌐 Project page: https://aneeshers.github.io/realtime-rl/
📄 Paper (PDF): https://aneeshers.github.io/realtime-rl/assets/finding-the-time-to-think.pdf
💻 Code: https://github.com/Aneeshers/realtime-rl-code

Layout

checkpoints/
├── clock/{go,hex}/{base,gating}          # Speed Go / Speed Hex (pgx)
└── committed_action/{pacman,snake,tetris_rt}/{base,gating}   # Jumanji

Each environment has one AlphaZero base planner and one PPO gating policy. See the code repository's README for the launcher scripts that consume them.
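As a quick sanity check on the layout above, the brace patterns can be expanded into the ten directories they imply (five environments, each with a `base` and a `gating` folder). The sketch below uses only the Python standard library. The directory names are copied from the tree; the names `LAYOUT`, `ROLES`, and `checkpoint_dirs` are hypothetical helpers, and the files inside each directory are not assumed.

```python
from itertools import product
from pathlib import PurePosixPath

# Directory names copied from the layout tree; the grouping into a dict
# is just a convenient representation for this sketch.
LAYOUT = {
    "clock": ["go", "hex"],
    "committed_action": ["pacman", "snake", "tetris_rt"],
}
ROLES = ["base", "gating"]


def checkpoint_dirs(root="checkpoints"):
    """Return every <root>/<group>/<env>/<role> directory implied by the layout."""
    dirs = []
    for group, envs in LAYOUT.items():
        for env, role in product(envs, ROLES):
            dirs.append(PurePosixPath(root) / group / env / role)
    return dirs


if __name__ == "__main__":
    for path in checkpoint_dirs():
        print(path)
```

After downloading the checkpoints, pointing `root` at the local copy and checking that each of these directories exists is a cheap way to confirm nothing is missing before running the launcher scripts.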

