Tutorials

The Getting Started Series walks you from zero to deploying your own environment in five short parts. No GPU required.

Part	What it covers	Notebook
1 — Introduction & Quick Start	What OpenEnv is, why it exists, and your first environment in under 10 minutes
2 — Using Environments	Connect to environments, create policies, run evaluations
3 — Building Environments	Create a custom environment from scratch
4 — Packaging & Deploying	Package with Docker and deploy to Hugging Face	—
5 — Contributing to Hugging Face	Publish, fork, and share environments on the Hub	—

Already familiar with the basics? These tutorials cover specific workflows in depth.

Tutorial	What it covers	GPU	Notebook
OpenEnv Tutorial	Full introduction to OpenEnv: install, connect to a hosted environment, step through an episode, define a reward function, and run a basic training loop.	No
End-to-end walkthrough	The full pipeline: connect to `reasoning_gym`, wire it into TRL via `environment_factory`, fine-tune with GRPO, and push the checkpoint to the Hub.	Yes
Building and using MCP environments	Consume and build MCP-backed environments: list and call tools through `step()`, and register Python functions as tools with FastMCP.	No
Rubrics	Compose reward functions from reusable pieces using `Gate`, `WeightedSum`, `LLMJudge`, and `TrajectoryRubric`.	No
Wordle GRPO	Train an agent to play Wordle using GRPO via TRL's `environment_factory`.	Yes
RL Training with 2048	Train a language model to play 2048 using GRPO. Covers game-state representation and reward shaping.	Yes	—
Evaluating agents with Inspect AI	Wrap an OpenEnv environment in an Inspect AI `Task`, run it via `InspectAIHarness`, and get a structured `EvalResult`.	No
BrowserGym Harness Rollouts	Drive BrowserGym through the OpenEnv harness runtime when a trainer needs token sampling, logprobs, and reward assignment inside the training loop.	Yes	—
Collecting rollouts for supervised training	Run a teacher model to collect reward-labeled rollouts, filter them, and fine-tune a student with TRL's `SFTTrainer` as a warm-start for GRPO.	Yes
