---
title: RLM Training - Needle in Haystack
sdk: docker
hardware: t4-small
---

# RLM Training - Recursive Language Model Skills

Training Qwen3-0.6B-Base to find needles in haystacks using GRPO.

## Task
- Long context with hidden facts
- Model learns to extract specific information
- 20 steps quick test

## Based on
- RLM Paper (arXiv:2512.24601)
- Sebastian Raschka's GRPO insights
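
## Example sketch

The task above (one hidden fact in a long context, scored per completion, optimized with GRPO) might look roughly like the following. This is a minimal sketch, not this Space's actual training code: the filler sentences, the "secret code" template, the binary substring reward, and the names `make_sample`, `reward`, and `group_advantages` are all assumptions made for illustration. The advantage step uses the common group-relative normalization (reward minus group mean, divided by group standard deviation).

```python
import random
import statistics

# Hypothetical filler text; no digits, so the answer only appears in the needle.
FILLER = [
    "The weather was unremarkable that day.",
    "Several meetings were rescheduled without notice.",
    "The report was filed and then forgotten.",
    "Traffic moved slowly along the main road.",
]


def make_sample(rng, n_filler=200):
    """Build one (prompt, answer) pair with a single fact hidden in filler text."""
    answer = str(rng.randint(1000, 9999))
    needle = f"The secret code is {answer}."
    sentences = [rng.choice(FILLER) for _ in range(n_filler)]
    # Insert the needle at a random position, including the very start or end.
    sentences.insert(rng.randint(0, n_filler), needle)
    prompt = " ".join(sentences) + "\n\nQuestion: What is the secret code?\nAnswer:"
    return prompt, answer


def reward(completion, answer):
    """Binary reward (an assumption): 1.0 if the completion contains the answer."""
    return 1.0 if answer in completion else 0.0


def group_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    rng = random.Random(0)
    prompt, answer = make_sample(rng)
    # Stand-ins for completions sampled from the model for this one prompt.
    completions = [f"The code is {answer}.", "I do not know.", answer, "0000"]
    rewards = [reward(c, answer) for c in completions]
    print(rewards)
    print(group_advantages(rewards))
```

Completions that recover the needle get a positive advantage and the rest a negative one. If every completion in a group earns the same reward, all advantages are zero and that prompt contributes no learning signal, which is worth keeping in mind for a 20-step test run.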