Commit 3ae4a43 by kissin42 (verified)
1 Parent(s): f5f73c2

Add organization card

Files changed (1)
  1. README.md +41 -5
README.md CHANGED
@@ -1,10 +1,46 @@
  ---
- title: README
- emoji: 👀
- colorFrom: purple
- colorTo: yellow
  sdk: static
  pinned: false
  ---
 
- Edit this `README.md` markdown file to author your organization card.
  ---
+ title: Causal GPT-RL
+ emoji: 🤖
+ colorFrom: indigo
+ colorTo: green
  sdk: static
  pinned: false
  ---
 
+ # Causal GPT-RL
+
+ GPT-style transformers (GPT-2, Llama) running as RL policies in continuous-control environments.
+
+ ```text
+ action → next state → next action (RL rollouts)
+ token → next token → next token (LLM generation)
+ ```
+
+ Stable under self-generated rollouts: long-horizon control without the drift that has historically kept transformers from being usable as RL agents.
+
+ ## Get started
+
+ ```bash
+ pip install "causal-gpt-rl[hub,mujoco]"
+ ```
+
+ ```python
+ import gymnasium as gym
+ from causal_gpt_rl.inference import load_runner_from_hub, run_episodes
+
+ env = gym.make("Ant-v5")
+ runner = load_runner_from_hub(
+     repo_id="ccnets/causal-gpt-rl",
+     subfolder="ant-v5",
+     device="cpu",
+ )
+ stats = run_episodes(env, runner, num_episodes=5, seed=0)
+ ```
+
+ **Available bundles:** Ant-v5, HalfCheetah-v5, Walker2d-v5, Humanoid-v5
+
+ - **Code:** [github.com/ccnets-team/causal-gpt-rl](https://github.com/ccnets-team/causal-gpt-rl)
+ - **Training logs (W&B, public):** [wandb.ai/junhopark/Causal GPT-RL](https://wandb.ai/junhopark/Causal%20GPT-RL?nw)
+ - **Website:** [ccnets.org](https://ccnets.org)
+
+ Released under PolyForm Noncommercial 1.0.0.
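
Note on the rollout analogy in the new card: the `action → next state → next action` line describes a loop in which each action the policy emits feeds back into its own future inputs, much as a generated token feeds back into an LLM's context. The sketch below illustrates that loop only. It does not use or reproduce the `causal_gpt_rl` API; `toy_env_step`, `toy_policy`, and `rollout` are hypothetical stand-ins, and the 1-D dynamics are invented for illustration.

```python
from collections import deque
from typing import Deque, List, Tuple

Step = Tuple[float, float]  # (state, action)


def toy_env_step(state: float, action: float) -> float:
    """Hypothetical 1-D dynamics: the state decays and the action nudges it."""
    return 0.9 * state + action


def toy_policy(context: List[Step], state: float) -> float:
    """Hypothetical stand-in for a transformer policy.

    A real sequence model would attend over `context`; this toy version
    ignores it and simply pushes the current state toward zero.
    """
    return -0.5 * state


def rollout(initial_state: float, steps: int, context_len: int = 8) -> List[Step]:
    """Run a self-generated rollout: each chosen action is appended to the
    context that the policy sees on later steps."""
    context: Deque[Step] = deque(maxlen=context_len)  # sliding history window
    state = initial_state
    trajectory: List[Step] = []
    for _ in range(steps):
        action = toy_policy(list(context), state)  # condition on past steps
        context.append((state, action))            # feed the action back in
        trajectory.append((state, action))
        state = toy_env_step(state, action)        # environment transition
    return trajectory


trajectory = rollout(initial_state=1.0, steps=5)
```

Because the policy's own outputs accumulate in `context`, any error it makes is carried into later inputs, which is the drift the card refers to. The bounded `deque` is one simple way to cap the history length and is an assumption of this sketch, not a documented detail of the project.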