# Notebooks

This directory contains a collection of Jupyter notebooks that demonstrate how to use the TRL library in different applications.
| Notebook | Description | Open in Colab |
|---|---|---|
| `grpo_trl_lora_qlora.ipynb` | GRPO using QLoRA on free Colab | |
| `grpo_agent.ipynb` | GRPO for agent training | Not available due to OOM with Colab GPUs |
| `grpo_rnj_1_instruct.ipynb` | GRPO rnj-1-instruct with QLoRA using TRL on Colab to add reasoning capabilities | |
| `sft_ministral3_vl.ipynb` | Supervised Fine-Tuning (SFT) Ministral 3 with QLoRA using TRL on free Colab | |
| `grpo_ministral3_vl.ipynb` | GRPO Ministral 3 with QLoRA using TRL on free Colab | |
| `sft_nemotron_3.ipynb` | SFT with LoRA on NVIDIA Nemotron 3 models | |
| `sft_trl_lora_qlora.ipynb` | Supervised Fine-Tuning (SFT) using QLoRA on free Colab | |
| `sft_qwen_vl.ipynb` | Supervised Fine-Tuning (SFT) Qwen3-VL with QLoRA using TRL on free Colab | |
| `sft_tool_calling.ipynb` | Teaching tool calling to a model without native tool-calling support using SFT with QLoRA | |
| `grpo_qwen3_vl.ipynb` | GRPO Qwen3-VL with QLoRA using TRL on free Colab | |

## OpenEnv Notebooks

These notebooks demonstrate GRPO training with OpenEnv environments. The Wordle and Sudoku notebooks use `environment_factory`; the BrowserGym notebook uses the lower-level `rollout_func` API instead.

| Notebook | Description | Open in Colab |
|---|---|---|
| `openenv_wordle_grpo.ipynb` | GRPO to play Wordle on an OpenEnv environment | |
| `openenv_sudoku_grpo.ipynb` | GRPO to play Sudoku on an OpenEnv environment | |
| `grpo_functiongemma_browsergym_openenv.ipynb` | GRPO on FunctionGemma in the BrowserGym environment | |