---
title: RLM Training - Needle in Haystack
sdk: docker
hardware: t4-small
---

# RLM Training - Recursive Language Model Skills

Trains Qwen3-0.6B-Base to find needles in haystacks using GRPO (Group Relative Policy Optimization).
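
As a rough illustration of the core GRPO idea, the sketch below computes group-relative advantages: several completions are sampled for the same prompt, and each reward is normalized against the others in its group. The function name `group_relative_advantages` and the `eps` parameter are hypothetical and not taken from this Space's training code; some GRPO variants also omit the division by the standard deviation.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper for illustration; not this Space's actual code.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    # eps avoids division by zero when every completion gets the same reward
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Four sampled completions: two found the needle, two did not.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Completions that score above the group average get a positive advantage and are reinforced; when all completions in a group score the same, every advantage is zero and that group contributes no learning signal.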

## Task
- Long context containing hidden facts ("needles")
- The model learns to extract a specific fact from that context
- Quick test run of 20 training steps
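
The task above can be sketched as follows. This is a minimal example under assumed details: the filler text, the "secret code" needle format, the prompt template, and the names `make_sample` and `reward` are all hypothetical and may differ from what this Space actually uses.

```python
import random

# Assumed filler sentence; the real haystack text may be different.
FILLER = "The sky was clear and the market opened on time."


def make_sample(n_filler=200, seed=0):
    """Build one haystack prompt with a single hidden fact and its answer."""
    rng = random.Random(seed)
    code = str(rng.randint(1000, 9999))
    needle = f"The secret code is {code}."
    sentences = [FILLER] * n_filler
    # Hide the needle at a random position in the haystack.
    sentences.insert(rng.randrange(n_filler + 1), needle)
    prompt = " ".join(sentences) + "\n\nQuestion: What is the secret code?\nAnswer:"
    return prompt, code


def reward(completion, answer):
    """Binary reward: 1.0 if the completion contains the hidden fact, else 0.0."""
    return 1.0 if answer in completion else 0.0


if __name__ == "__main__":
    prompt, answer = make_sample(n_filler=5, seed=42)
    print(prompt)
    print("expected answer:", answer)
```

A binary substring reward like this is easy to verify automatically, which suits a short test run; stricter matching (for example, exact match on the final answer) is a possible alternative.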

## Based on
- RLM Paper (arXiv:2512.24601)
- Sebastian Raschka's GRPO insights