ccnets/causal-gpt-rl

@@ -14,7 +14,7 @@ tags:
 GPT-style transformers (GPT-2, Llama) running as RL policies in continuous-control environments.
-The autoregressive structure is the same on both sides:
+Both LLM generation and RL interaction are autoregressive:
 ```text
 token           → next token                           (LLM generation)