---
title: Causal GPT-RL
emoji: 🤖
colorFrom: indigo
colorTo: green
sdk: static
pinned: false
---
# Causal GPT-RL
GPT-style transformers (GPT-2, Llama) running as RL policies in continuous-control environments.
```text
action → next state → next action (RL rollouts)
token → next token → next token (LLM generation)
```
The policies remain stable under self-generated rollouts, sustaining long-horizon control without the drift that has historically kept transformers from being usable as RL agents.
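The rollout/generation analogy above can be illustrated with a small autoregressive loop. This is only a sketch under stated assumptions, not the library's implementation: `toy_policy`, `toy_env_step`, and `rollout` are hypothetical names, the policy is a simple proportional controller standing in for a transformer, and the environment is a one-dimensional toy. The point is the shape of the loop: each action is chosen from a bounded context of the policy's own earlier outputs, just as an LLM conditions each token on the tokens it has already generated.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence, Tuple

State = float
Action = float
Step = Tuple[State, Action]


def toy_policy(context: Sequence[Step], state: State) -> Action:
    """Hypothetical stand-in for a transformer policy.

    A real causal model would attend over `context`; this toy ignores it
    and simply pushes the state halfway toward zero.
    """
    return -0.5 * state


def toy_env_step(state: State, action: Action) -> State:
    """Hypothetical 1-D dynamics: the action is added to the state."""
    return state + action


def rollout(
    policy: Callable[[Sequence[Step], State], Action],
    env_step: Callable[[State, Action], State],
    initial_state: State,
    horizon: int,
    context_len: int,
) -> List[Step]:
    """Run the policy on its own outputs, like next-token generation."""
    # Bounded history, analogous to an LLM's context window.
    context: Deque[Step] = deque(maxlen=context_len)
    trajectory: List[Step] = []
    state = initial_state
    for _ in range(horizon):
        # The policy sees only the (state, action) pairs it produced earlier.
        action = policy(tuple(context), state)
        context.append((state, action))
        trajectory.append((state, action))
        # The environment supplies the next "token": the next state.
        state = env_step(state, action)
    return trajectory


trajectory = rollout(
    toy_policy, toy_env_step, initial_state=1.0, horizon=4, context_len=8
)
```

Because every input after the first step depends on the policy's previous actions, small errors can compound over a long horizon, which is the drift the paragraph above refers to.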
## Get started
```bash
pip install "causal-gpt-rl[hub,mujoco]"
```
```python
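import gymnasium as gym

from causal_gpt_rl.inference import load_runner_from_hub, run_episodes

# Create the MuJoCo environment that matches the bundle being loaded.
env = gym.make("Ant-v5")

# Load the Ant-v5 bundle from the Hub repository.
runner = load_runner_from_hub(
    repo_id="ccnets/causal-gpt-rl",
    subfolder="ant-v5",
    device="cpu",
)

# Run five seeded episodes and keep the returned statistics.
stats = run_episodes(env, runner, num_episodes=5, seed=0)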
```
**Available bundles:** Ant-v5, HalfCheetah-v5, Walker2d-v5, Humanoid-v5
- **Code:** [github.com/ccnets-team/causal-gpt-rl](https://github.com/ccnets-team/causal-gpt-rl)
- **Training logs (W&B, public):** [wandb.ai/junhopark/Causal GPT-RL](https://wandb.ai/junhopark/Causal%20GPT-RL?nw)
- **Website:** [ccnets.org](https://ccnets.org)
Released under PolyForm Noncommercial 1.0.0.