Update app/src/content/chapters/framework-inventory.mdx

#3
by sergiopaniego HF Staff - opened
app/src/content/chapters/framework-inventory.mdx CHANGED
@@ -16,18 +16,18 @@ We surveyed the space and picked these six to build the same environment across
 
  These are notable RL environment frameworks we evaluated but did not implement. They're excluded because they serve a different purpose or operate at a different level of abstraction.
 
- | Framework | Creator | Why excluded | Link |
- | --- | --- | --- | --- |
- | [**Atropos**](https://github.com/NousResearch/atropos) | Nous Research | Different paradigm, environments own inference and POST scored batches to a central API. Not compatible with TRL's turn-by-turn tool calling. | [GitHub](https://github.com/NousResearch/atropos) |
- | [**Harbor**](https://github.com/harbor-framework/harbor) | Stanford / Snorkel AI | Offline batch RL only, spins up Docker containers per trial, runs autonomous agents, collects trajectories. No live `environment_factory`. | [GitHub](https://github.com/harbor-framework/harbor) |
- | [**RLVE**](https://github.com/Zhiyuan-Zeng/RLVE) | Zhiyuan Zeng | Pure verifier library (445 tasks), `generate() → verify()` with no transport, no tools, no state. Not an environment framework, just problem oracles. | [GitHub](https://github.com/Zhiyuan-Zeng/RLVE) |
- | [**Reasoning Gym**](https://github.com/open-thought/reasoning-gym) | Open Thought | Procedural task generators + verifiers, same tier as RLVE. Stateless, no multi-turn, no tools. | [GitHub](https://github.com/open-thought/reasoning-gym) |
- | [**RAGEN**](https://github.com/ZihanWang314/RAGEN) | Zihan Wang | Full stack (env + StarPO + veRL), tightly coupled to its own training loop. Gym-compatible but not easily separable for TRL integration. | [GitHub](https://github.com/ZihanWang314/RAGEN) |
- | [**rLLM**](https://github.com/agentica-project/rllm) | Agentica | Decorator pattern, wraps existing agent code, intercepts LLM calls. No environment class to subclass. Different paradigm. | [GitHub](https://github.com/agentica-project/rllm) |
- | [**RL-Factory**](https://github.com/Simple-Efficient/RL-Factory) | Simple-Efficient | MCP config-based, any MCP server becomes an environment. Interesting but very early stage. | [GitHub](https://github.com/Simple-Efficient/RL-Factory) |
- | [**Open-Instruct**](https://github.com/allenai/open-instruct) | Allen AI | Full training framework with env hooks, environments are reward functions, not multi-turn interactive agents. | [GitHub](https://github.com/allenai/open-instruct) |
- | [**TextArena**](https://github.com/TextArena/TextArena) | Leon Guertler | Game-specific multi-agent environments, narrow domain, not a general framework. | [GitHub](https://github.com/TextArena/TextArena) |
- | [**LlamaGym**](https://github.com/KhoomeiK/LlamaGym) | KhoomeiK | Gymnasium wrapper for LLMs, early prototype, not actively maintained. | [GitHub](https://github.com/KhoomeiK/LlamaGym) |
+ | Framework | Creator | Why excluded |
+ | --- | --- | --- |
+ | [**Atropos**](https://github.com/NousResearch/atropos) | Nous Research | Different paradigm, environments own inference and POST scored batches to a central API. Not compatible with TRL's turn-by-turn tool calling. |
+ | [**Harbor**](https://github.com/harbor-framework/harbor) | Stanford / Snorkel AI | Offline batch RL only, spins up Docker containers per trial, runs autonomous agents, collects trajectories. No live `environment_factory`. |
+ | [**RLVE**](https://github.com/Zhiyuan-Zeng/RLVE) | Zhiyuan Zeng | Pure verifier library (445 tasks), `generate() → verify()` with no transport, no tools, no state. Not an environment framework, just problem oracles. |
+ | [**Reasoning Gym**](https://github.com/open-thought/reasoning-gym) | Open Thought | Procedural task generators + verifiers, same tier as RLVE. Stateless, no multi-turn, no tools. |
+ | [**RAGEN**](https://github.com/ZihanWang314/RAGEN) | Zihan Wang | Full stack (env + StarPO + veRL), tightly coupled to its own training loop. Gym-compatible but not easily separable for TRL integration. |
+ | [**rLLM**](https://github.com/agentica-project/rllm) | Agentica | Decorator pattern, wraps existing agent code, intercepts LLM calls. No environment class to subclass. Different paradigm. |
+ | [**RL-Factory**](https://github.com/Simple-Efficient/RL-Factory) | Simple-Efficient | MCP config-based, any MCP server becomes an environment. Interesting but very early stage. |
+ | [**Open-Instruct**](https://github.com/allenai/open-instruct) | Allen AI | Full training framework with env hooks, environments are reward functions, not multi-turn interactive agents. |
+ | [**TextArena**](https://github.com/TextArena/TextArena) | Leon Guertler | Game-specific multi-agent environments, narrow domain, not a general framework. |
+ | [**LlamaGym**](https://github.com/KhoomeiK/LlamaGym) | KhoomeiK | Gymnasium wrapper for LLMs, early prototype, not actively maintained. |
 
  ### How these relate
 