Tutorials
New to OpenEnv? Start Here
The Getting Started Series walks you from zero to deploying your own environment in five short parts. No GPU required.
| Part | What it covers | Notebook |
|---|---|---|
| 1 — Introduction & Quick Start | What OpenEnv is, why it exists, and your first environment in under 10 minutes | |
| 2 — Using Environments | Connect to environments, create policies, run evaluations | |
| 3 — Building Environments | Create a custom environment from scratch | |
| 4 — Packaging & Deploying | Package with Docker and deploy to Hugging Face | — |
| 5 — Contributing to Hugging Face | Publish, fork, and share environments on the Hub | — |
Topic Tutorials
Already familiar with the basics? These tutorials cover specific workflows in depth.
| Tutorial | What it covers | GPU | Notebook |
|---|---|---|---|
| OpenEnv Tutorial | Full introduction to OpenEnv: install, connect to a hosted environment, step through an episode, define a reward function, and run a basic training loop. | No | |
| End-to-end walkthrough | The full pipeline: connect to reasoning_gym, wire it into TRL via environment_factory, fine-tune with GRPO, and push the checkpoint to the Hub. | Yes | |
| Building and using MCP environments | Consume and build MCP-backed environments: list and call tools through step(), register Python functions as tools with FastMCP. | No | |
| Rubrics | Compose reward functions from reusable pieces using Gate, WeightedSum, LLMJudge, and TrajectoryRubric. | No | |
| Wordle GRPO | Train an agent to play Wordle using GRPO via TRL's environment_factory. | Yes | |
| RL Training with 2048 | Train a language model to play 2048 using GRPO. Covers game-state representation and reward shaping. | Yes | — |
| Evaluating agents with Inspect AI | Wrap an OpenEnv environment in an Inspect AI Task, run it via InspectAIHarness, and get a structured EvalResult. | No | |
| BrowserGym Harness Rollouts | Drive BrowserGym through the OpenEnv harness runtime when a trainer needs token sampling, logprobs, and reward assignment inside the training loop. | Yes | — |
| Collecting rollouts for supervised training | Run a teacher model to collect reward-labeled rollouts, filter them, and fine-tune a student with TRL's SFTTrainer as a warm-start for GRPO. | Yes | |